pradeepgn / s3fs

Automatically exported from code.google.com/p/s3fs
GNU General Public License v2.0

Upload multiple files concurrently #395

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Detailed description of observed behavior:

If multiple processes are writing files to s3fs, the files appear to upload one 
at a time.

What steps will reproduce the problem - please be very specific and
detailed. (if the developers cannot reproduce the issue, then it is
unlikely a fix will be found)?

1. Mount an s3fs at /mnt/s3drive

2. Copy 5 files at the same time, using the & operator to run all the commands 
concurrently:
time cp test1.txt /mnt/s3drive/ &
time cp test2.txt /mnt/s3drive/ &
time cp test3.txt /mnt/s3drive/ &
time cp test4.txt /mnt/s3drive/ &
time cp test5.txt /mnt/s3drive/ &

The output will show that each copy waited for the previous copy to complete 
before starting.
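
The reproduction steps above can be wrapped in a small helper so the same test is easy to repeat. This is only a sketch, assuming a POSIX shell; the function name `concurrent_copy` and its arguments are hypothetical and not part of s3fs. It starts every copy in the background, prints the elapsed whole seconds as each one finishes, and waits for all of them.

```shell
# concurrent_copy SRC DEST_DIR [COUNT]
# Hypothetical helper: copies SRC to DEST_DIR/test1.txt .. testCOUNT.txt,
# starting all copies at once in the background.
concurrent_copy() {
    src=$1
    dest_dir=$2
    count=${3:-5}            # number of simultaneous copies, 5 as in the steps above
    start=$(date +%s)        # common start time, in whole seconds
    i=1
    while [ "$i" -le "$count" ]; do
        (
            cp "$src" "$dest_dir/test$i.txt"
            # Report how long after the common start this copy finished.
            echo "copy $i finished after $(( $(date +%s) - start )) s"
        ) &
        i=$((i + 1))
    done
    wait                     # block until every background copy has exited
}

# Example (assumes the mount point from step 1):
#   concurrent_copy test1.txt /mnt/s3drive 5
```

If the copies run in parallel, the reported times should be close to each other; if they are serialized, each line should report a noticeably later time than the previous one.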

I am hitting this problem in a web application. After a user uploads a file I 
transfer it to s3 through s3fs. When many users are uploading files, the web 
application slows down because each copy to s3fs gets slower and slower while 
it waits for previous copies to complete.

===================================================================
The following information is very important in order to help us to help
you.  Omission of the following details may delay your support request or
receive no attention at all.
===================================================================
Version of s3fs being used (s3fs --version):
1.74

Version of fuse being used (pkg-config --modversion fuse):

System information (uname -a):
Linux 3.11.7-200.fc19.x86_64 #1 SMP Mon Nov 4 14:09:03 UTC 2013 x86_64 x86_64 
x86_64 GNU/Linux

Distro (cat /etc/issue):
Fedora release 19 (Schrödinger’s Cat)

s3fs command line used (if applicable):

/etc/fstab entry (if applicable):

s3fs syslog messages (grep s3fs /var/log/syslog):

Original issue reported on code.google.com by penguin...@gmail.com on 6 Dec 2013 at 10:23

GoogleCodeExporter commented 8 years ago
Hi,

I ran a simple test of this case, and in my result s3fs appeared to upload in 
parallel for each process.
How many uploads are running at the same time on your web server?

An upload from a single process is itself sent from s3fs to S3 as multiple 
parallel requests (the default is 5).
So when many processes upload at the same time, the number of parallel requests 
is five times the process count (parallel count = process count * 5).
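
As a worked instance of that formula, here is a minimal sketch in POSIX shell. The function name `expected_requests` is hypothetical, and the per-process default of 5 is taken from the description above rather than checked against the s3fs source.

```shell
# expected_requests PROCESS_COUNT [PER_PROCESS]
# Hypothetical helper: parallel count = process count * per-process parallelism.
expected_requests() {
    echo $(( $1 * ${2:-5} ))   # per-process value defaults to 5, as stated above
}

expected_requests 5    # the five cp commands from the report: prints 25
```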

I need more information to solve this issue.

Thanks in advance for your help.

Original comment by ggta...@gmail.com on 10 Dec 2013 at 2:24

GoogleCodeExporter commented 8 years ago
Here is a test I did to verify this.

First I run this command to copy one file. I use the "time" command to time the 
copy.

time cp test.txt /mnt/s3drive/test1.txt

And get output:

real    0m3.869s
user    0m0.000s
sys     0m0.004s

This shows it took 3.869 seconds to complete the copy of one file. Next, I run 5 
copy commands at the same time with the command below, using the & operator to 
run them all at once:

time cp test.txt /mnt/s3drive/test1.txt &
time cp test.txt /mnt/s3drive/test2.txt &
time cp test.txt /mnt/s3drive/test3.txt &
time cp test.txt /mnt/s3drive/test4.txt &
time cp test.txt /mnt/s3drive/test5.txt &

And get output:

real    0m3.892s
user    0m0.000s
sys     0m0.004s

real    0m7.670s
user    0m0.000s
sys     0m0.004s

real    0m11.215s
user    0m0.002s
sys     0m0.002s

real    0m14.797s
user    0m0.001s
sys     0m0.003s

real    0m18.295s
user    0m0.002s
sys     0m0.002s

As each copy command completes, it shows its time. If everything were uploading 
at the same time, each command should still take about as long as the single 
copy and they should all complete at about the same time. Instead, each copy 
command reports a longer and longer time. The completion times are roughly 3.5 
to 3.8 seconds apart, so it looks like s3fs uploads the first file in about 3.9 
seconds, and each later file waits for the previous one before it is uploaded.
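
To make the spacing explicit, the gaps between successive completion times can be computed from the "real" values above. This is a small sketch using awk; the function name `completion_gaps` is hypothetical, and the input is the five wall-clock times from the output, in seconds.

```shell
# completion_gaps: reads one completion time (seconds) per line on stdin
# and prints the difference between each time and the one before it.
completion_gaps() {
    awk 'NR > 1 { printf "%.3f\n", $1 - prev } { prev = $1 }'
}

# The five "real" times reported above.
printf '%s\n' 3.892 7.670 11.215 14.797 18.295 | completion_gaps
```

Each gap comes out between roughly 3.5 and 3.8 seconds, close to the 3.869 seconds measured for a single copy, which is what one would expect if the uploads ran one after another.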

Original comment by penguin...@gmail.com on 13 Dec 2013 at 9:28