TritonDataCenter / node-manta-sync

Rsync style command for Joyent's Manta

out of memory error #27

Closed metsuke0 closed 7 years ago

metsuke0 commented 7 years ago

When attempting to upload ~1.5TB to Joyent's Manta cloud, the process errors.

Here is the output from console:

building source file list...

<--- Last few GCs --->

53469 ms: Mark-sweep 1360.1 (1419.6) -> 1360.1 (1435.6) MB, 1253.8 / 1.4 ms [allocation failure] [GC in old space requested].
54721 ms: Mark-sweep 1360.1 (1435.6) -> 1360.1 (1435.6) MB, 1252.0 / 1.4 ms [allocation failure] [GC in old space requested].
55986 ms: Mark-sweep 1360.1 (1435.6) -> 1367.2 (1419.6) MB, 1264.6 / 1.4 ms [last resort gc].
57246 ms: Mark-sweep 1367.2 (1419.6) -> 1374.3 (1419.6) MB, 1260.4 / 1.4 ms [last resort gc].

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x9322c0cfb51
    1: InnerArrayForEach(aka InnerArrayForEach) [native array.js:~935] [pc=0x284f4db6f1cc] (this=0x9322c004381 ,bq=0x3eee02e41159 <JS Function (SharedFunctionInfo 0xc6e70c12a81)>,br=0x9322c004381 ,w=0x3eee02e40a91 <JS Array[41]>,x=41)
    2: forEach [native array.js:~954] [pc=0x284f4dba430d] (this=0x3eee02e40a91 <JS Array[41]>,bq=0x3eee02e41159 <JS Function (SharedFunctionI...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort(void) [/usr/local/bin/node]
 2: node::DLOpen(v8::FunctionCallbackInfo const&) [/usr/local/bin/node]
 3: v8::internal::V8::FatalProcessOutOfMemory(char const, bool) [/usr/local/bin/node]
 4: v8::internal::Factory::NewRawOneByteString(int, v8::internal::PretenureFlag) [/usr/local/bin/node]
 5: bool v8::internal::Factory::NumberToString(v8::internal::Handle() [/usr/local/bin/node]
 6: v8::internal::Object::ToString(v8::internal::Isolate, v8::internal::Handle) [/usr/local/bin/node]
 7: v8::internal::Object::ConvertToName(v8::internal::Isolate*, v8::internal::Handle) [/usr/local/bin/node]
 8: v8::internal::Runtime_HasProperty(int, v8::internal::Object*, v8::internal::Isolate) [/usr/local/bin/node]
Abort trap (core dumped)

Here is the dump file: https://drive.google.com/file/d/0ByNezrn_88ddem5rNWJCMDg5bE0/view?usp=sharing

bahamas10 commented 7 years ago

IRC logs

2016-11-09 12:45:00     -->     bahamas10 (~bahamas10@cpe-98-5-18-96.buffalo.res.rr.com) has joined #manta
2016-11-09 12:46:09     metsuke my node.core is 1.8G
2016-11-09 13:05:51     metsuke here's the issue I submitted: https://github.com/bahamas10/node-manta-sync/issues/27
2016-11-09 13:23:26     bahamas10       metsuke: i'm checking it out now
2016-11-09 13:24:04     metsuke awesome, thanks!
2016-11-09 13:24:55     bahamas10       np. my initial thought is that the simple act of storing an array with every source file and dest file name (as strings) is exhausting the memory
2016-11-09 13:27:48     bahamas10       metsuke: how many files are there total? I know it's 1.5T but i'm thinking the number of actual files is the problem here
2016-11-09 13:28:02     bahamas10       try $ find . -type f | wc -l, where . is the dir you're trying to upload
2016-11-09 13:32:09     metsuke 777303
2016-11-09 13:32:38     bahamas10       great thanks, i'll do some testing locally... i'm also pulling the core dump now as we speak to analyze
2016-11-09 13:32:55     metsuke thank you!
2016-11-09 13:35:22     bahamas10       also, can you try running with `-v` and then perhaps `-vv` to increase the verbosity? this may shed some light on the issue
2016-11-09 13:43:57     metsuke bahamas10: https://gist.github.com/metsuke0/9d1c53bd6141c1b84922c3ce9e9459e6
2016-11-09 13:44:29     metsuke I get some errors in the beginning, but it doesn't seem to affect the rest of it.  I was able to do smaller uploads fine
2016-11-09 13:45:40     bahamas10       yeah i've seen those dtrace errors before, they shouldn't be affecting manta-sync
2016-11-09 13:46:04     bahamas10       also, what version of manta-sync are you using? `manta-sync -V` will tell you
2016-11-09 13:46:30     metsuke 0.4.1
2016-11-09 13:47:06     bahamas10       ok cool, that's the latest then
2016-11-09 13:48:19     bahamas10       so my latest theory right now is number of subdirectories... i'm wondering if i have some bad logic somewhere in `finder.js` to find all files recursively
2016-11-09 13:48:33     bahamas10       i'm attempting to recreate your issue by creating a large number of local files to sync
2016-11-09 14:49:26     bahamas10       so just an update: it's failing while building the source file list.  manta-sync used to work by just gathering a list of files (an array of string filenames), but it has since changed (with the introduction of --reverse) to create `LocalFile` and `MantaFile` objects depending on direction of sync
2016-11-09 14:49:46     bahamas10       the source gathering logic for local files (non-reverse sync) is here https://github.com/bahamas10/node-manta-sync/blob/master/lib/localfile.js#L108
2016-11-09 14:50:22     bahamas10       I haven't successfully recreated the issue yet, but i'm working on a couple test programs to try and isolate the issue if metsuke you wouldn't mind running them
2016-11-09 14:51:25     bahamas10       i'm busy with other priorities atm, but will have more time tonight to isolate this issue.  for the time being, what operating system and what version of node are you using? sorry if you've stated that before, I joined the IRC when you said "my node.core is 1.8G"
2016-11-09 14:53:29     bahamas10       right now, for every file found, a LocalFile and MantaFile object are created and held in an array... for a large number of files this might be the source of the memory bloat.  My current theory is that, by only storing a string of the filename, and creating the LocalFile and MantaFile objects on demand when needed, this could reduce memory bloat as the GC could eliminate these objects when they fall out of scope
2016-11-09 14:55:31     metsuke bahamas10: OS - 11.0-RELEASE-p3                node - v6.9.1
2016-11-09 14:56:34     metsuke I can run anything on this server.  It was set up to test the  manta sync
2016-11-09 15:47:47     metsuke bahamas10: sorry, I forgot to add that the OS is FreeBSD
2016-11-10 12:17:28     bahamas10       metsuke: great thanks, i also have a freebsd box i did some testing on last night - i will have a patch today i'd like you to test that *may* fix the issue if my theory is correct.  i'll let you know
2016-11-10 12:20:10     metsuke bahamas10: great, I'll be ready, thanks
2016-11-10 13:48:19     bahamas10       metsuke: i've put a new commit into the `metsuke` branch on github https://github.com/bahamas10/node-manta-sync/tree/metsuke
2016-11-10 13:48:46     bahamas10       if you can, clone the repo and checkout that branch, run `npm install`, and then run manta-sync locally with `./manta-sync`
2016-11-10 13:49:14     bahamas10       i've done testing on this branch and it all seems to work, but because i'm not able to reproduce what you are seeing i'm only operating under a theory currently
2016-11-10 13:49:52     bahamas10       but, since the bug you encounter happens before any PUT or DELETE operations are done, i'd recommend running your test on this branch with --dry-run 
2016-11-10 14:59:49     metsuke bahamas10: looks promising so far!  It already got way further than before.  Once it is done uploading in a few hours I'll let you know how it went
2016-11-10 16:36:43     bahamas10       metsuke: awesome! glad to hear it! 
2016-11-11 10:54:34     metsuke @bahamas10, it looks like the process stopped about a third of the way through uploading the files: manta-sync: AssertionError: undefined (object) is required
2016-11-11 10:54:44     metsuke I will try again though
2016-11-11 12:22:07     bahamas10       metsuke: well, at least it got past the initial file listing step - that's progress! 
2016-11-11 12:22:34     bahamas10       but, that error message is fairly lame.
2016-11-11 12:30:33     metsuke it's already done with 70K files out of 550K left, so it is still working much better than before.  I'll report back once anything else happens, thanks
2016-11-11 12:31:28     bahamas10       metsuke: great. the patch in the `metsuke` branch definitely fixes the initial file listing memory exhausting bug... i'm just not sure why it would fail half way through like that
2016-11-11 12:32:10     bahamas10       my initial thought would be the local file list becoming out of sync with the files after the list is created - are the files you are uploading still being modified?
2016-11-11 12:33:07     metsuke No, I didn't sync the files once I started trying to upload, so they should be the same
2016-11-11 12:35:04     bahamas10       ok.  i'll try to come up with a patch to make error messages more verbose - having the filename and line number associated with that assertion is a must
2016-11-11 13:13:53     metsuke hm, ran into it again after about 100K files
2016-11-11 14:31:57     bahamas10       metsuke: the stack from the error message is removed as a result of the manta client creation here https://github.com/joyent/node-manta/blob/master/lib/create_client.js#L243-L247
2016-11-11 14:32:10     bahamas10       try running your script with the DEBUG env variable set to 1
2016-11-11 14:32:16     bahamas10       ie. DEBUG=1 ./manta-sync ...
2016-11-11 14:32:41     bahamas10       that way, when that error is encountered we will have a stacktrace to work with
2016-11-11 14:37:25     rmustacc        bahamas10: Presumably you could go from the core, no?
2016-11-11 14:41:53     bahamas10       metsuke: i don't know how to read it.  mdb and gdb don't seem to like it, and i know very little about what freebsd dumps
2016-11-11 14:57:15     metsuke I'm running with DEBUG=1
2016-11-11 14:57:50     bahamas10       rmustacc i tagged metsuke accidentally, that last message was meant for you.
2016-11-11 14:58:07     bahamas10       metsuke: great thanks; assuming it fails again we should know more
2016-11-11 16:10:23     metsuke bahamas10: https://gist.github.com/metsuke0/733726b995cee5619d2a240843b561e3
2016-11-11 16:10:28     metsuke that is the error
2016-11-11 16:27:25     melloc  metsuke: Looks like it's this: https://github.com/joyent/node-manta/issues/261
2016-11-11 16:27:43     melloc  It was fixed in a recent version of node-manta.
2016-11-11 16:29:42     melloc  The release that contains the fix is 3.1.1.
2016-11-11 16:29:59     melloc  But manta-sync depends on manta ~3.0.0.
2016-11-11 16:30:53     melloc  Although that function is called when there's a read error, anyways.
2016-11-11 16:32:56     metsuke melloc: makes some sense then: https://gist.github.com/metsuke0/e4cff10bee26ef60f670babcd9205545
2016-11-11 16:35:02     melloc  If you remove that manta@3.0.0, it should use the manta@3.1.3. You can then run it again, and hopefully get the other error it was trying to report.
2016-11-11 16:35:35     melloc  (You could also bump the dependency in the package.json in the manta-sync folder and run npm install in that directory to pull down the newer one.)
2016-11-11 16:37:03     metsuke that sounds better, I chose the latter
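
For reference, the version mismatch melloc describes in the log comes down to how npm reads a tilde range: `~3.0.0` accepts only 3.0.x releases, so the fix in manta 3.1.1 is not picked up until the dependency is bumped. The sketch below illustrates that rule. The `parse` and `satisfiesTilde` helpers are hypothetical names written for this example, not part of npm or manta-sync; they handle only full `~MAJOR.MINOR.PATCH` ranges and ignore prerelease tags.

```javascript
'use strict';

// Hypothetical helper for illustration only: parse "MAJOR.MINOR.PATCH".
function parse(version) {
  const parts = version.split('.').map(Number);
  if (parts.length !== 3 || parts.some(Number.isNaN)) {
    throw new Error('expected MAJOR.MINOR.PATCH, got ' + version);
  }
  return { major: parts[0], minor: parts[1], patch: parts[2] };
}

// Hypothetical helper: does `version` fall inside a full tilde range?
// ~1.2.3 means >=1.2.3 and <1.3.0, i.e. same major and minor,
// with a patch number at least as high as the base.
function satisfiesTilde(version, range) {
  if (range[0] !== '~') {
    throw new Error('only ~ ranges are handled in this sketch');
  }
  const v = parse(version);
  const base = parse(range.slice(1));
  return v.major === base.major &&
    v.minor === base.minor &&
    v.patch >= base.patch;
}

console.log(satisfiesTilde('3.0.0', '~3.0.0')); // true
console.log(satisfiesTilde('3.1.1', '~3.0.0')); // false: the fix is out of range
console.log(satisfiesTilde('3.1.3', '~3.1.1')); // true
```

In a real project the `semver` package that npm itself uses is the better tool; this only shows why 3.1.1 falls outside `~3.0.0`.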

bahamas10 commented 7 years ago

please feel free to reopen if the issue persists in or after v0.4.2
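
As a rough illustration of the theory from the IRC log above (keep only filename strings in the list and create the per-file objects on demand), here is a minimal sketch. The `LocalFile` class below is a hypothetical stand-in, not manta-sync's actual `LocalFile`, and `buildEager` and `processLazy` are names invented for this example. Whether the `metsuke` branch or v0.4.2 is implemented this way is not confirmed here.

```javascript
'use strict';

// Hypothetical stand-in for a per-file wrapper object.
// This is NOT manta-sync's real LocalFile class.
class LocalFile {
  constructor(name) {
    this.name = name;
    this.info = { checked: false, size: null, md5: null };
  }
}

// Eager approach: one wrapper object per file is built up front, and
// the returned array keeps every one of them reachable at once.
function buildEager(names) {
  return names.map(function (name) {
    return new LocalFile(name);
  });
}

// On-demand approach: the list holds only strings. A wrapper is created
// when its file is processed and can be collected after `handle`
// returns, as long as `handle` does not keep a reference to it.
function processLazy(names, handle) {
  names.forEach(function (name) {
    const file = new LocalFile(name);
    handle(file);
  });
}

// Small demonstration with made-up file names.
const names = [];
for (let i = 0; i < 5; i++) {
  names.push('dir/file-' + i + '.txt');
}

let processed = 0;
processLazy(names, function (file) {
  processed += 1;
});
console.log(processed); // 5
```

With several hundred thousand files, the difference is between holding that many wrapper objects for the whole run and holding that many strings plus one short-lived wrapper at a time.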