This is a known and documented limitation of newer versions of s3fs that use
multipart uploads. The fix is probably easy, but some investigation into large
file support in the C code is needed. The current data types have a 2^31 (2GB)
limit for the math involved. Admittedly, since I didn't personally have the
need for this, I didn't take the time to investigate it.
In the meantime, you have a couple of options:
- use an older version of s3fs (beware, other since-fixed bugs may be lurking)
- split your files into parts < 2GB (not an attractive solution, but it should work)
Since this is open source, others certainly can look into this and submit a
patch. Like I said, this might be an easy fix.
Original comment by dmoore4...@gmail.com
on 11 Jan 2011 at 7:01
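For anyone curious about the 2^31 ceiling mentioned above, here is a minimal sketch of the problem (hypothetical, not taken from the s3fs source): sizes held in a signed 32-bit type top out just below 2GB, so offset math on a larger file cannot be represented.
#include <cstdint>
#include <cstdio>

int main() {
  // Largest value a signed 32-bit type can hold: 2^31 - 1 bytes, just under 2GB.
  const int32_t limit_32 = INT32_MAX;
  // A 3GB file, like the 3GB.bin used in the rsync test later in this thread.
  const int64_t three_gb = 3LL * 1024 * 1024 * 1024;

  std::printf("32-bit size limit: %lld bytes\n", static_cast<long long>(limit_32));
  std::printf("3GB file         : %lld bytes (does not fit in 32 bits)\n",
              static_cast<long long>(three_gb));
  // The fix is to carry sizes and offsets in 64-bit types (for example off_t
  // with _FILE_OFFSET_BITS=64, or int64_t) throughout the upload path.
  return 0;
}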
This is a shame. s3fs has gone from supporting the Amazon file size limit (5GB)
to imposing its own file size limit. Unfortunately I am not a C++ programmer
or I would take a stab at this.
Splitting files is not an option for me. Do you mean that there are older
versions of s3fs that support multipart but don't have the 2GB file size bug? Do
you know which version this would be so I can test it out?
Original comment by mjbuc...@gmail.com
on 12 Jan 2011 at 8:52
mjbuchan, here's a patch that treats all files, regardless of size, the old way,
without using multipart upload for any file whatsoever. This comes without
any support.
===================================================================
--- src/s3fs.cpp (revision 300)
+++ src/s3fs.cpp (working copy)
@@ -1981,6 +1981,9 @@
 // If file is > 20MB, then multipart will kick in
 /////////////////////////////////////////////////////////////
+ result = put_local_fd_small_file(path, meta, fd);
+ return result;
+
 if(st.st_size > 2147483647) { // 2GB - 1
 // close f ?
 return -ENOTSUP;
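If you want to try the patch, note that the paths are relative to the source tree (src/s3fs.cpp) and the diff is against revision 300, so applying it with patch -p0 from the top of the checkout should work; adjust the -p level if your layout differs.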
Original comment by dmoore4...@gmail.com
on 13 Jan 2011 at 3:50
As I suspected, making this change is relatively easy -- bumping the max file
size limit to 64GB. I need to finish testing before release.
Original comment by dmoore4...@gmail.com
on 20 Jan 2011 at 9:22
Tested on a large EC2 instance (Ubuntu 10.10), works as expected:
$ rsync -av --progress --stats --whole-file 3GB.bin misc.suncup.org/
sending incremental file list
3GB.bin
3145728000 100% 27.39MB/s 0:01:49 (xfer#1, to-check=0/1)
Number of files: 1
Number of files transferred: 1
Total file size: 3145728000 bytes
Total transferred file size: 3145728000 bytes
Literal data: 3145728000 bytes
Matched data: 0 bytes
File list size: 42
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 3146112090
Total bytes received: 31
sent 3146112090 bytes received 31 bytes 3601731.11 bytes/sec
total size is 3145728000 speedup is 1.00
Original comment by dmoore4...@gmail.com
on 21 Jan 2011 at 4:48
Resolved with 1.35
Original comment by dmoore4...@gmail.com
on 21 Jan 2011 at 5:19
I have been testing this over the past few days. All works as expected.
Fantastic!
Out of interest, is 64GB another hard limit or could this be increased at some
point?
Thanks again for the continued work on this project.
Original comment by mjbuc...@gmail.com
on 26 Jan 2011 at 8:42
When a file is greater than 20MB, the multipart upload kicks in. I chose to
make the multipart upload chunks 10MB. AWS limits the number of parts in a
multipart upload to 10,000. So theoretically, I would only need to change one
number in the source code to move the file size limit from 64GB (my limit, since
I like nice "round" numbers -- this is 2 to the 36th power) to 100GB.
AWS's limit is somewhere in the 1TB range. In order to get past 100GB,
the chunk size would need to be adjusted (and probably timeout values and
such).
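To make that arithmetic concrete, here is a small illustrative sketch (the 10MB part size and 10,000-part cap follow the numbers above; the constant names are not from the s3fs source):
#include <cstdint>
#include <cstdio>

int main() {
  const int64_t part_size  = 10LL * 1024 * 1024;           // 10MB per multipart chunk
  const int64_t max_parts  = 10000;                        // AWS cap on parts per multipart upload
  const int64_t hard_max   = part_size * max_parts;        // theoretical ceiling with 10MB parts
  const int64_t chosen_max = 64LL * 1024 * 1024 * 1024;    // 64GB = 2^36, the "round" limit chosen

  std::printf("theoretical max with 10MB parts: %lld bytes (~%lld GB)\n",
              static_cast<long long>(hard_max),
              static_cast<long long>(hard_max >> 30));
  std::printf("configured limit               : %lld bytes (64GB)\n",
              static_cast<long long>(chosen_max));
  // Raising the ceiling past ~100GB would mean a larger part size (and,
  // per the comment above, revisiting timeouts and such).
  return 0;
}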
I can tell you this: if I ever implement this, I will never test it, so unless
I can get someone to collaborate on this (to do the testing), it probably won't
get done by me. I do not like releasing untested code.
If you feel you have the need for this, please open a new enhancement issue for
tracking.
Original comment by dmoore4...@gmail.com
on 29 Jan 2011 at 12:42
Hi,
I have the latest version, but I get the following when I try to copy a file
larger than 2GB:
cp: writing `/s3bucket/biodata/genome.fa': No space left on device
cp: closing `/s3bucket/biodata/genome.fa': Input/output error
/var/log/messages:
May 10 06:06:29 ascidea s3fs: 2587###result=-28
May 10 06:06:29 ascidea s3fs: 2587###result=-28
May 10 06:06:29 ascidea s3fs: 948 ### bytesWritten:0 does not match lBufferSize: 10485760
Any ideas? I need to store big files there, and I have no idea how to do it.
thanks
Original comment by lorena.p...@gmail.com
on 10 May 2012 at 10:21
Original issue reported on code.google.com by
mjbuc...@gmail.com
on 11 Jan 2011 at 10:35