Automatically exported from code.google.com/p/weed-fs

Memory consumption is too high when uploading big files. #39

Closed. GoogleCodeExporter closed this issue 8 years ago.

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. dd if=/dev/zero of=zero.trash.500m bs=1024000 count=500
2. weed upload zero.trash.500m
3. Observe that the memory used by the weed volume process (e.g. ps aux | grep -i weed) is more than 100M; a Go sketch of the upload call in step 2 follows below.

Platform: Debian 7, x86-64, 8 GB RAM.
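
For reference, here is a rough Go sketch of what step 2 does over HTTP, assuming the usual /dir/assign flow (ask the master at :9333 for a file id, then POST the file to the volume server it names). The file name and ports are taken from the report above; everything else is illustrative, not the actual weed client code.
----------------------------------------
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"mime/multipart"
	"net/http"
	"os"
)

func main() {
	// 1. Ask the master for a file id and a volume server location.
	resp, err := http.Get("http://localhost:9333/dir/assign")
	if err != nil {
		log.Fatal(err)
	}
	var assign struct {
		Fid string `json:"fid"`
		Url string `json:"url"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&assign); err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()

	// 2. POST the file to the assigned volume server as a multipart upload.
	f, err := os.Open("zero.trash.500m")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	var body bytes.Buffer
	mw := multipart.NewWriter(&body)
	part, err := mw.CreateFormFile("file", "zero.trash.500m")
	if err != nil {
		log.Fatal(err)
	}
	if _, err := io.Copy(part, f); err != nil { // note: this buffers the whole 500M on the client side too
		log.Fatal(err)
	}
	mw.Close()

	up, err := http.Post("http://"+assign.Url+"/"+assign.Fid, mw.FormDataContentType(), &body)
	if err != nil {
		log.Fatal(err)
	}
	defer up.Body.Close()
	out, _ := io.ReadAll(up.Body)
	fmt.Println(string(out))
}
----------------------------------------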

What is the expected output? What do you see instead?
 Expected: memory usage should be roughly the size of the metadata.
 Instead: the weed volume process consumes an enormous amount of memory.
 (Or are you using some sort of mmap?)

What version of the product are you using? On what operating system?
 The newest currently, v0.37.

Please provide any additional information below.

 1. Here is the information before uploading a big file:
 -----------------------------
root@debian:/var/lib/weed# ps aux | grep -i weed
root     22036  0.0  0.0 339692  7044 pts/21   Sl+  Aug05   0:01 weed master -mdir master
root     23300  0.0  0.1 340996 10620 pts/22   Sl+  00:24   0:00 weed volume -max=100 -mserver=localhost:9333 -dir=volumn-data-1 -debug=true
root     23356  0.0  0.0   7764   864 pts/24   D+   00:25   0:00 grep -i weed
 ---------------------------

 2. Now upload once and check. Memory rises to 12.8% of the 8 GB RAM (RSS about 1037228 KB, roughly 1 GB).
 --------------------------------------
root@debian:/var/lib/weed# weed upload zero.trash.500m 
[{"fileName":"zero.trash.500m","fileUrl":"localhost:8080/2,01118d1c58166a","fid":"2,01118d1c58166a","size":512000000,"error":""}]
root@debian:/var/lib/weed# ps aux | grep -i weed
root     22036  0.0  0.0 339692  7044 pts/21   Sl+  Aug05   0:01 weed master -mdir master
root     23300  1.4 12.8 1470476 1037228 pts/22 Sl+ 00:24   0:02 weed volume -max=100 -mserver=localhost:9333 -dir=volumn-data-1 -debug=true
root     23390  0.0  0.0   7764   864 pts/24   S+   00:26   0:00 grep -i weed
----------------------------------------

 3. Upload a second time and check. Memory rises to 23.9% of the 8 GB RAM (RSS about 1932780 KB, roughly 1.9 GB).
 ----------------------------------------
root@debian:/var/lib/weed# weed upload zero.trash.500m 
[{"fileName":"zero.trash.500m","fileUrl":"localhost:8080/4,01118ebd3873c5","fid":"4,01118ebd3873c5","size":512000000,"error":""}]
root@debian:/var/lib/weed# ps aux | grep -i weed
root     22036  0.0  0.0 339692  7016 pts/21   Sl+  Aug05   0:01 weed master -mdir master
root     23300  2.3 23.9 2445516 1932780 pts/22 Sl+ 00:24   0:04 weed volume -max=100 -mserver=localhost:9333 -dir=volumn-data-1 -debug=true
root     23405  0.0  0.0   7764   864 pts/24   S+   00:27   0:00 grep -i weed
root@debian:/var/lib/weed# 
 ----------------------------------------

Original issue reported on code.google.com by darkdark...@gmail.com on 5 Aug 2013 at 4:33

GoogleCodeExporter commented 8 years ago
Although we would not normally upload a big 500M file, I think 100 raw images of 5M each is a common case.

I also tried the option -debug=false; the result is the same.

Original comment by darkdark...@gmail.com on 5 Aug 2013 at 4:42

GoogleCodeExporter commented 8 years ago
Or maybe it is the cache?
It seems that Haystack caches the file just written.
I am not sure.

Original comment by darkdark...@gmail.com on 5 Aug 2013 at 4:53

GoogleCodeExporter commented 8 years ago
When reading and writing, the file content is read into memory, so if you read or write just one or two big files, memory usage will go up. If you read or write 100 files of 5M each, memory will not go wild. Please confirm whether this helps your use case.

Original comment by chris...@gmail.com on 5 Aug 2013 at 5:08
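
To make the explanation above concrete, here is a minimal sketch of the pattern being described, assuming a simple multipart upload handler; it is not the actual weed-fs handler. The whole upload is buffered in memory before being written out, so a 500M upload needs roughly 500M of heap until the garbage collector reclaims it.
----------------------------------------
package main

import (
	"fmt"
	"io"
	"net/http"
)

// Sketch of a handler that reads the whole file content into memory,
// which is the behaviour described in the comment above.
func handleUpload(w http.ResponseWriter, r *http.Request) {
	file, _, err := r.FormFile("file")
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	defer file.Close()

	data, err := io.ReadAll(file) // buffers the entire upload in memory: ~500M of heap for a 500M file
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	// ... a real volume server would append data to a volume file and index it ...
	fmt.Fprintf(w, "received %d bytes\n", len(data))
}

func main() {
	http.HandleFunc("/submit", handleUpload)
	http.ListenAndServe(":8080", nil)
}
----------------------------------------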

GoogleCodeExporter commented 8 years ago
Thanks very much.

re-test:

1. With just 100 files of 5M each, yes, the memory does not go crazy.

2. If I upload the 500M file 4 times, the memory goes crazy, but it drops back to 
normal after about 10 minutes. (Are there some tricks behind this?)

Since I can control the size of the files being uploaded, weed-fs fits my environment.

Thanks.

Original comment by darkdark...@gmail.com on 6 Aug 2013 at 3:29

GoogleCodeExporter commented 8 years ago
Thanks for the confirmation! WeedFS was not created for super large files, but for 
many small files.

Memory garbage collection is managed by Go itself.

If serving large files becomes a target and the Go memory GC cannot keep up, we can 
manage the memory explicitly in code. But it seems this use case does not need that yet.

Original comment by chris...@gmail.com on 6 Aug 2013 at 7:23
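
A note on the numbers in this thread: the drop back to normal after about ten minutes is consistent with the Go runtime's scavenger, which only returns heap pages to the OS after they have sat unused for several minutes. If "manage the memory by code" were ever needed, one possible approach (a sketch under that assumption, not something weed-fs does) would be to stream uploads to disk with a small fixed buffer and ask the runtime to release freed memory immediately:
----------------------------------------
package main

import (
	"io"
	"net/http"
	"os"
	"path/filepath"
	"runtime/debug"
)

// Sketch of a handler that avoids holding the whole file in memory:
// io.Copy streams with a small fixed buffer, and debug.FreeOSMemory
// forces a GC and returns unused pages to the OS right away.
func handleLargeUpload(w http.ResponseWriter, r *http.Request) {
	file, header, err := r.FormFile("file")
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	defer file.Close()

	// Hypothetical destination; a real volume server would append to a volume file.
	dst, err := os.Create(filepath.Join(os.TempDir(), filepath.Base(header.Filename)))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	defer dst.Close()

	if _, err := io.Copy(dst, file); err != nil { // fixed-size copy buffer, independent of file size
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	debug.FreeOSMemory() // run a GC and hand freed memory back to the OS now, not minutes later
	w.WriteHeader(http.StatusCreated)
}

func main() {
	http.HandleFunc("/submit", handleLargeUpload)
	http.ListenAndServe(":8081", nil)
}
----------------------------------------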