s3tools / s3cmd

Official s3cmd repo -- Command line tool for managing S3 compatible storage services (including Amazon S3 and CloudFront).
https://s3tools.org/s3cmd
GNU General Public License v2.0

s3cmd no longer streams gzip files correctly #811

Open judioo opened 7 years ago

judioo commented 7 years ago

Since v1.6.0 I'm no longer able to stream gzip files from s3. I used to be able to do this:

$ s3cmd --version
s3cmd version 1.5.2
$ s3cmd get s3://<path>/file.jl.gz - |zcat |head -1
{...<data>..}
ERROR: [Errno 32] Broken pipe

Now I get this:

$ s3cmd --version
s3cmd version 1.6.1
$ s3cmd get s3://t<path>/file.jl.gz - |zcat | head -1

gzip: stdin: not in gzip format

What changed?

fviard commented 7 years ago

For me it is working with the latest master version. Are you sure that the file you are trying to get is a valid "gzip" file? I would recommend that you first download the file:

s3cmd get s3://t/file.jl.gz file.jl.gz

and then try to zcat / open it to check that it is valid.

If that is the case, you can try something like:

s3cmd get s3://t/file.jl.gz - | cat

to check that the output looks like "binary" data and not some message.
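
One way to run that check without eyeballing the terminal is to look at the first two bytes of the stream: every gzip file begins with the magic bytes 0x1f 0x8b. The following is a minimal Python sketch of that idea, not part of s3cmd; the function name `looks_like_gzip` and the sample status line are made up for illustration.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip stream (RFC 1952)


def looks_like_gzip(data):
    """Return True if the byte string starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


# A small gzip payload standing in for the downloaded object.
payload = gzip.compress(b'{"example": 1}\n')

# Hypothetical status text written ahead of the file content.
polluted = b"status: starting download\n" + payload

print(looks_like_gzip(payload))   # True
print(looks_like_gzip(polluted))  # False
```

If the first bytes of the streamed output are printable text rather than 0x1f 0x8b, something other than file content is being written to stdout.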

judioo commented 7 years ago

It's definitely a bug that affects a number of people here. Using v1.6.1 I can download and then zcat the file without any problems. It is only when streaming that we get the issue.

Again it's the same file, same command, same VM, same s3 bucket. Only difference is the version of s3cmd.

% uname -a
Linux foo-VirtualBox 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

mdomsch commented 7 years ago

There are a bunch of output() calls sprinkled into the get command now, which introduce logging messages into the stdout stream that should only carry file content. I started a branch, bug/stdout, in the master s3cmd tree a few days ago to try to excise all of these. With that branch, using --no-progress --quiet when streaming suppresses the extra logging info and should result in content that can be decompressed. There's more to be done, but that's a good start.

Thanks, Matt
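
To illustrate the failure mode described in this comment, here is a small Python sketch (not s3cmd's actual code) in which a status message lands on the same stream as the gzip data. The names `stream_object` and `decompresses_cleanly` and the message text are hypothetical. Keeping diagnostics on a separate stream, such as stderr, leaves the data stream intact.

```python
import gzip
import io


def stream_object(payload, data_out, log_out, log_to_data=False):
    """Copy payload to data_out, emitting one status line.

    log_to_data=True mimics the bug: the status line is written to the
    same stream as the file content.
    """
    message = "starting transfer\n"  # hypothetical status text
    if log_to_data:
        data_out.write(message.encode("utf-8"))
    else:
        log_out.write(message)
    data_out.write(payload)


def decompresses_cleanly(data):
    """Return True if data is a valid gzip stream."""
    try:
        gzip.decompress(data)
        return True
    except (OSError, EOFError):
        return False


original = b"line 1\nline 2\n"
payload = gzip.compress(original)

# Status text kept off the data stream.
clean, clean_log = io.BytesIO(), io.StringIO()
stream_object(payload, clean, clean_log)

# Status text mixed into the data stream.
polluted, polluted_log = io.BytesIO(), io.StringIO()
stream_object(payload, polluted, polluted_log, log_to_data=True)

print(decompresses_cleanly(clean.getvalue()))
print(decompresses_cleanly(polluted.getvalue()))
```

The polluted stream fails in the same way as the `zcat` pipeline in the report: the decompressor sees text where it expects the gzip header.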

vjorlikowski commented 5 years ago

So...resurrecting this discussion, after a couple of years. I've been affected by this same issue with:

krait:~ vjo$ s3cmd --version
s3cmd version 1.6.1

It turns out that the resolution was to change:

progress_meter = True

to:

progress_meter = False

in my ~/.s3cfg
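
For anyone who prefers to script that change, below is a rough Python sketch using the standard-library `configparser`. It assumes the settings live in a `[default]` section (an assumption; check your own file), and the function name `disable_progress_meter` is made up here. Note that `configparser` drops comments when it writes a file back, so editing by hand may be preferable.

```python
import configparser
import io


def disable_progress_meter(cfg_text):
    """Return cfg_text with progress_meter set to False.

    Assumes a [default] section exists; raises KeyError otherwise.
    """
    # interpolation=None leaves any '%' characters in other values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    parser["default"]["progress_meter"] = "False"

    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


# Made-up sample config for demonstration only.
sample = "[default]\nprogress_meter = True\nuse_https = True\n"
updated = disable_progress_meter(sample)
print(updated)
```

The sketch works on a string so it can be tried without touching a real config file; reading from and writing to `~/.s3cfg` is left to the reader.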

After doing so, I was able to stream a gzip'd tar file from an s3 store:

krait:retrieve_from_s3 vjo$ s3cmd get s3://ArchiveTestBuckets/abac_test.tgz - | tar -xvzf -
x abac_test/
x abac_test/yap-6.2.2.tar.gz
x abac_test/install/
x abac_test/osx_changes.patch
x abac_test/._abac-0.1.5
x abac_test/abac-0.1.5/
x abac_test/vstr-1.0.15.tar.gz
x abac_test/._abac-0.1.5.tar
x abac_test/abac-0.1.5.tar
x abac_test/abac/
x abac_test/abac.patched/
x abac_test/vstr-1.0.15/
x abac_test/strongswan-4.6.4.tar.bz2
x abac_test/yap-6.2.2/
x abac_test/yap-6.2.2/changes-5.0.html
x abac_test/yap-6.2.2/README.nt
x abac_test/yap-6.2.2/or.cut.o
x abac_test/yap-6.2.2/tab.completion.o
x abac_test/yap-6.2.2/libYap.a
x abac_test/yap-6.2.2/misc/
x abac_test/yap-6.2.2/computils.o

If anyone's still listening out there - give that a try.

If it resolves your problem, then it "feels like" the fix for this issue may lie in how the "get" command processes its arguments (so that disabling the progress meter on the CLI is properly handled, and overriding it in ~/.s3cfg is not required).
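
The behaviour being asked for here is the usual precedence rule: a flag given on the command line should win over the value in the config file. A hypothetical Python sketch of that rule follows; it is not taken from s3cmd's source, and `resolve_progress` is an invented name. The `--no-progress` flag is the one mentioned earlier in this thread; `--progress` is included as its assumed counterpart.

```python
import argparse


def resolve_progress(config_value, argv):
    """Return the effective progress-meter setting.

    config_value: bool read from the config file.
    argv: list of command-line arguments.
    A CLI flag, when present, overrides the config file value.
    """
    parser = argparse.ArgumentParser()
    # default=None lets us tell "flag not given" apart from True/False.
    parser.add_argument("--progress", dest="progress_meter",
                        action="store_true", default=None)
    parser.add_argument("--no-progress", dest="progress_meter",
                        action="store_false", default=None)
    args = parser.parse_args(argv)

    if args.progress_meter is None:
        return config_value
    return args.progress_meter


print(resolve_progress(True, ["--no-progress"]))
print(resolve_progress(True, []))
```

With this rule, `progress_meter = True` in the config file would no longer matter once `--no-progress` is passed on the command line.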

franzhcs commented 4 years ago

I just tried and you are right - disabling the "progress meter" fixes this issue. Quite odd..

kerenskybr commented 1 year ago

This issue still happens:

gzip: stdin: not in gzip format
Exception ignored in: <encodings.utf_8.StreamWriter object at 0x7fbcbf159850>
BrokenPipeError: [Errno 32] Broken pipe
(venv) root@rescue ~ # s3cmd --version
s3cmd version 2.3.0

@vjorlikowski's solution worked. Thanks for the workaround.