solstag opened this issue 8 years ago
I'm also running into the `FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory` error, with the query:

```
getpapers --api 'arxiv' --query "cat:math.MP" --outdir "math.MP" -p
```

during the last step of fetching results:
```
info: Searching using arxiv API
info: Found 49619 results
Retrieving results [=====-------------------------] 16% (eta 1101.6s)error: Malformed response from arXiv API - no data in feed
info: Retrying failed request
Retrieving results [======------------------------] 20% (eta 1097.4s)error: Malformed response from arXiv API - no data in feed
info: Retrying failed request
Retrieving results [=======-----------------------] 23% (eta 1108.8s)error: Malformed response from arXiv API - no data in feed
info: Retrying failed request
...
Retrieving results [==============================] 100% (eta 4.2s)error: Malformed response from arXiv API - no data in feed
info: Retrying failed request
Retrieving results [==============================] 100% (eta 0.0s)
info: Done collecting results
info: Saving result metadata

<--- Last few GCs --->

3320779 ms: Mark-sweep 670.2 (716.6) -> 670.2 (716.6) MB, 7699.5 / 0.0 ms [allocation failure] [scavenge might not succeed].
3326769 ms: Mark-sweep 670.2 (716.6) -> 670.2 (716.6) MB, 5990.1 / 0.0 ms [allocation failure] [scavenge might not succeed].
3334178 ms: Mark-sweep 670.2 (716.6) -> 672.2 (708.6) MB, 7409.1 / 0.0 ms [last resort gc].
3341032 ms: Mark-sweep 672.2 (708.6) -> 674.2 (708.6) MB, 6852.8 / 0.0 ms [last resort gc].

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x5cf7b815 <JS Object>
1: JSONSerialize(aka JSONSerialize) [native json.js:~141] [pc=0x294c0904] (this=0x5cf081d9 <undefined>,Q=0x5f1f9381 <String[1]: 0>,u=0x9d5dadc9 <JS Array[1]>,F=0x5cf08101 <null>,G=0x8c95bfa1 <a Stack with map 0xb3309995>,H=0x2d23084d <String[6]: >,I=0x8c95bf91 <String[2]: >)
2: SerializeArray(aka SerializeArray) [native json.js:~69] [pc=0x294c2233] (this=0x5cf081d9 <undefined>,E=0x9...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: node::Abort() [node]
2: 0x80b701ef [node]
3: v8::Utils::ReportApiFailure(char const*, char const*) [node]
4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [node]
5: v8::internal::Heap::FatalProcessOutOfMemory(char const*, bool) [node]
6: v8::internal::Factory::NewRawTwoByteString(int, v8::internal::PretenureFlag) [node]
7: v8::internal::Runtime_QuoteJSONString(int, v8::internal::Object**, v8::internal::Isolate*) [node]
...
49: 0x806c2e90 [node]
50: v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) [node]
51: v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) [node]
52: v8::Function::Call(v8::Local<v8::Value>, int, v8::Local<v8::Value>*) [node]
53: node::AsyncWrap::MakeCallback(v8::Local<v8::Function>, int, v8::Local<v8::Value>*) [node]
54: node::StreamBase::EmitData(int, v8::Local<v8::Object>, v8::Local<v8::Object>) [node]
55: node::StreamWrap::OnReadImpl(int, uv_buf_t const*, uv_handle_type, void*) [node]
56: node::StreamWrap::OnReadCommon(uv_stream_s*, int, uv_buf_t const*, uv_handle_type) [node]
57: node::StreamWrap::OnRead(uv_stream_s*, int, uv_buf_t const*) [node]
58: 0xb7655503 [/usr/lib/libuv.so.1]
59: 0xb7655f1a [/usr/lib/libuv.so.1]
60: uv__io_poll [/usr/lib/libuv.so.1]
61: uv_run [/usr/lib/libuv.so.1]
62: node::Start(int, char**) [node]
63: main [node]
64: __libc_start_main [/lib/libc.so.6]
65: 0x803582d3 [node]
Cancelled
```
Out-of-memory error for just 50,000 results? C'mon...
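For what it's worth, the trace points at the cause: the crash is inside V8's JSON serializer (JSONSerialize / SerializeArray) right after "Saving result metadata", so getpapers is presumably building one giant JSON string for the whole results array at once. Streaming the records out one at a time would keep peak memory roughly constant. A sketch only, where `results` and `outfile` are hypothetical stand-ins, not getpapers' actual internals:

```js
// Sketch: serialize a large array as JSON without materializing one huge string.
// `results` and `outfile` are hypothetical names, not getpapers' real code.
// Backpressure handling is omitted for brevity.
const fs = require('fs');

function writeResultsStreaming(results, outfile) {
  const out = fs.createWriteStream(outfile);
  out.write('[\n');
  results.forEach(function (record, i) {
    // Only one record is stringified at a time, so peak memory stays
    // proportional to a single record rather than the whole result set.
    out.write(JSON.stringify(record) + (i < results.length - 1 ? ',\n' : '\n'));
  });
  out.write(']\n');
  out.end();
}
```

Until something like that lands in getpapers itself, the workaround has to happen on the node side.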
You have to pass the `--max_old_space_size` option (and possibly others...) to `node`. To this end, I tried changing

```
#!/usr/bin/env node
```

to

```
#!/usr/bin/env node --max_old_space_size=896
```

at the top of `/usr/bin/getpapers`, but this just causes getpapers to wait indefinitely after invocation - and nothing happens!
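For the record, the shebang trick can't work on Linux: the kernel passes everything after the interpreter path as a single argument, so `env` is asked for a program literally named `node --max_old_space_size=896`, which doesn't exist. A small wrapper script gets around this - a sketch, assuming getpapers is installed at `/usr/bin/getpapers`:

```sh
#!/bin/sh
# Sketch of a wrapper: save as e.g. ~/bin/getpapers-big and chmod +x.
# It forwards all arguments to getpapers while raising the V8 heap limit.
exec node --max_old_space_size=1400 /usr/bin/getpapers "$@"
```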
The only thing that worked was invoking `node` directly with those options. That is, you have to use the following magic invocation for your large query:

```
node --max_old_space_size=1400 --optimize_for_size --max_executable_size=1400 --stack_size=1400 /usr/bin/getpapers --api 'arxiv' --query "cat:math.MP" --outdir "math.MP" -p
```
:warning: You MUST use the full path to getpapers!
:warning: Using too large a value (say, 2000 instead of 1400) on a 32-bit system will cause nothing but a segmentation fault, so you have to fine-tune it for your system. On 64-bit systems, larger values like 4096 may be perfectly adequate.
:red_circle: It would be very desirable to have a way to pass those options more comfortably, rather than having to type them on the command line and change the invocation from `getpapers ....` to `node .... /usr/bin/getpapers ...`.
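On recent Node versions (8 and later), the `NODE_OPTIONS` environment variable should do exactly that, letting you keep the plain getpapers invocation - assuming the flags you need are on its allowlist, as `--max-old-space-size` is:

```sh
# NODE_OPTIONS is honored by Node >= 8; --max-old-space-size is on its allowlist.
export NODE_OPTIONS="--max-old-space-size=1400"
getpapers --api 'arxiv' --query "cat:math.MP" --outdir "math.MP" -p
```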
These are the errors I'm getting. I am not downloading papers, just fetching the metadata. I am running from a 12-core server with 40 GB of RAM connected to the academic network in Paris. I am running these queries over SSH from my computer at home, and that connection remains fine throughout.
- JavaScript heap out of memory, 35k results
- JavaScript heap out of memory, 10k results
- Timeout
- Empty query