Is this bucket using encryption and compression, or just compression?
In the latter case, this is most likely caused by the same issue responsible for
the HMAC error that you reported on the mailing list a little while ago (i.e.,
defective hardware on your machine or a problem on the backend server). Of
course, it could also be a bug in S3QL though. Is this happening on the same
machine, and/or with the same backend as the HMAC error?
That said, this problem still should not result in a crash, so I'll make sure
to fix that. Thanks for the report!
Original comment by Nikolaus@rath.org
on 15 Oct 2013 at 4:11
This bucket uses both encryption and compression. No, the HMAC error is
happening on a server that has the same patch level as the server in Issue 425
(which is also a different server). The server with this
http.client.ResponseNotReady error is a vanilla 2.4, no patches. Interestingly,
it is a bucket specific error. The server has five buckets mounted, and this
only happens on one (same S3 region as the others).
I made sure not to have any other job running when this bombed, to make sure it
is not a bandwidth issue. The server is on a 100mb line, so that shouldn't be
an issue anyhow. Not sure what to look for :-(
Thanks for the help
Balazs
Original comment by czv...@gmail.com
on 16 Oct 2013 at 10:36
This issue was closed by revision 6b219b840e80.
Original comment by Nikolaus@rath.org
on 19 Oct 2013 at 5:26
The above revision will fix the problem with S3QL crashing when it receives
malformed data. I am, however, still at a loss as to what might be causing this
corruption for you in the first place.
If this file system really uses compression and encryption, but you're getting
an error when decompressing, this means that the HMAC of the compressed data
was successfully verified. Therefore, at least this case of corruption cannot
result from problems with the storage service. This leaves either a bug in
S3QL, or problems with the local hardware.
On the other hand, the HMAC error that you reported on the list implies that in
that case the *encrypted* data was corrupted. This would mean that there are at
least two data corruption bugs in S3QL, or that you have faulty hardware.
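To make the distinction between the two failure modes concrete, here is a minimal sketch of a compress-then-authenticate layout. It is not S3QL's actual object format: the encryption step is left out, and `pack`, `unpack`, `CorruptedObjectError` and the `corrupt_before_mac` switch are hypothetical names used only for illustration. It merely shows that damage introduced after the HMAC was computed is caught by the HMAC check, while a decompression failure following a successful HMAC check points at damage that happened before the HMAC was computed.

```python
import hashlib
import hmac
import zlib

MAC_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


class CorruptedObjectError(Exception):
    """Hypothetical error type, raised instead of letting the caller crash."""


def pack(plaintext: bytes, key: bytes, corrupt_before_mac: bool = False) -> bytes:
    """Compress, then prepend an HMAC over the compressed bytes.

    Assumption: encryption is omitted to keep the sketch stdlib-only.
    """
    compressed = bytearray(zlib.compress(plaintext))
    if corrupt_before_mac:
        # Simulate a local fault (bad RAM, software bug) that damages the
        # data *before* the HMAC is computed. Flipping the last byte breaks
        # the zlib checksum trailer.
        compressed[-1] ^= 0xFF
    mac = hmac.new(key, bytes(compressed), hashlib.sha256).digest()
    return mac + bytes(compressed)


def unpack(blob: bytes, key: bytes) -> bytes:
    """Verify the HMAC first, then decompress."""
    mac, compressed = blob[:MAC_LEN], blob[MAC_LEN:]
    expected = hmac.new(key, compressed, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        # Data changed after the HMAC was computed (storage or transfer).
        raise CorruptedObjectError("HMAC mismatch")
    try:
        return zlib.decompress(compressed)
    except zlib.error as exc:
        # HMAC was fine, so the data was already bad when it was signed.
        raise CorruptedObjectError("decompression failed after HMAC check") from exc
```

With this layout, flipping a byte in the stored blob yields the "HMAC mismatch" error, whereas calling `pack(..., corrupt_before_mac=True)` yields the decompression error even though the HMAC verifies.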
But then, you also reported this happening on two different systems, which
makes it rather hard to blame on a hardware problem as well.
In other words, I currently do not have a clear plan forward. Maybe the best
strategy is to upgrade all your servers to the newest (soon to be released)
S3QL and collect some more data. Hopefully that will show some sort of pattern
in either corruption type, affected computers, or affected buckets.
Original comment by Nikolaus@rath.org
on 19 Oct 2013 at 9:22
Ok, that sounds like a plan. I am upgrading all involved servers today and
will rerun everything. I will report back. Thanks very much for releasing a new
version; I was starting to lose track of the right order of the patches :-)
Original comment by czv...@gmail.com
on 21 Oct 2013 at 3:35
Original issue reported on code.google.com by
czv...@gmail.com
on 15 Oct 2013 at 1:15