googlearchive / flashlight

A pluggable integration with ElasticSearch to provide advanced content searches in Firebase.
http://firebase.github.io/flashlight/

JavaScript heap out of memory. #159

Closed kleeb closed 7 years ago

kleeb commented 7 years ago

When the amount of data is large (say, 100K records that are cached at startup), Flashlight crashes:

<--- Last few GCs --->

[56821:0x2d3d160]   284038 ms: Mark-sweep 1431.3 (1645.0) -> 1431.3 (1645.0) MB, 4214.9 / 0.0 ms  allocation failure GC in old space requested
[56821:0x2d3d160]   287951 ms: Mark-sweep 1431.3 (1645.0) -> 1431.3 (1614.0) MB, 3908.1 / 0.0 ms  (+ 1.0 ms in 1 steps since start of marking, biggest step 1.0 ms) last resort gc 
[56821:0x2d3d160]   291389 ms: Mark-sweep 1431.3 (1614.0) -> 1431.3 (1614.0) MB, 3437.7 / 0.0 ms  last resort gc 

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0xe3ad0bc0d31 <JS Object>
    1: baseCopy [/home/node/flashlight/node_modules/lodash/index.js:~1631] [pc=0x2368982afedf](this=0xe3ad0bd2451 <JS Global Object>,source=0x375c40042a89 <an Object with map 0x1a970ad068d9>,props=0x375c40042ba1 <JS Array[4]>,object=0x375c40042b39 <an Object with map 0x19a190507011>)
    2: baseAssign [/home/node/flashlight/node_modules/lodash/index.js:1591] [pc=0x23689844c2bb](this=0xe3ad0bd245...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [node]
 2: 0x125f13c [node]
 3: v8::Utils::ReportOOMFailure(char const*, bool) [node]
 4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [node]
 5: v8::internal::Factory::NewTransitionArray(int) [node]
 6: v8::internal::TransitionArray::Insert(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Map>, v8::internal::SimpleTransitionFlag) [node]
 7: v8::internal::Map::CopyReplaceDescriptors(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::DescriptorArray>, v8::internal::Handle<v8::internal::LayoutDescriptor>, v8::internal::TransitionFlag, v8::internal::MaybeHandle<v8::internal::Name>, char const*, v8::internal::SimpleTransitionFlag) [node]
 8: v8::internal::Map::CopyAddDescriptor(v8::internal::Handle<v8::internal::Map>, v8::internal::Descriptor*, v8::internal::TransitionFlag) [node]
 9: v8::internal::Map::CopyWithField(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::FieldType>, v8::internal::PropertyAttributes, v8::internal::Representation, v8::internal::TransitionFlag) [node]
10: v8::internal::Map::TransitionToDataProperty(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::StoreFromKeyed) [node]
11: v8::internal::LookupIterator::PrepareTransitionToDataProperty(v8::internal::Handle<v8::internal::JSObject>, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::StoreFromKeyed) [node]
12: v8::internal::StoreIC::LookupForWrite(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::Object::StoreFromKeyed) [node]
13: v8::internal::StoreIC::UpdateCaches(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::Object::StoreFromKeyed) [node]
14: v8::internal::StoreIC::Store(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::Object::StoreFromKeyed) [node]
15: v8::internal::KeyedStoreIC::Store(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>) [node]
16: v8::internal::Runtime_KeyedStoreIC_Miss(int, v8::internal::Object**, v8::internal::Isolate*) [node]
17: 0x236897e063a7

kleeb commented 7 years ago

Solved it by adding

--max-old-space-size=4096 --stack-size=1968

to the first line of app.js. I also started using semaphores: http://stackoverflow.com/questions/37456690/how-insure-indexing-every-object-with-firebase-flashlight-in-a-elasticsearch-bon
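For readers unfamiliar with the semaphore idea, here is a minimal sketch of a counting semaphore that caps how many indexing tasks run at once. It is not taken from the linked answer or from Flashlight itself; `createSemaphore`, `take`, and `stats` are hypothetical names chosen for illustration.

```javascript
// Minimal counting semaphore: at most `limit` tasks run at the same time.
// All names here are hypothetical; this is not Flashlight's own code.
function createSemaphore(limit) {
  let active = 0;       // tasks currently holding a slot
  const waiting = [];   // tasks queued until a slot frees up

  function next() {
    while (active < limit && waiting.length > 0) {
      const task = waiting.shift();
      active++;
      let released = false;
      // The task receives a release callback and must call it when done.
      task(function release() {
        if (released) return; // ignore accidental double release
        released = true;
        active--;
        next();
      });
    }
  }

  return {
    take(task) {
      waiting.push(task);
      next();
    },
    stats() {
      return { active: active, waiting: waiting.length };
    }
  };
}
```

In an indexer, each task would send one record to ElasticSearch and call `release` from the request callback, so only `limit` requests (and their payloads) are in flight at any moment.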

katowulf commented 7 years ago

Thanks for adding the solution! Is this in reference to indexing a large path in Firebase or retrieving a large number of results from ES on the client?

kleeb commented 7 years ago

Indexing a large number of results (PathMonitor) when Flashlight starts.

katowulf commented 7 years ago

Thanks for the update. Note that I added some advice on improving restart efficiency here.

kleeb commented 7 years ago

That advice is nice, but with refBuilder you are deciding which records to skip. What if you don't want to skip any records at all, for instance because they are all part of a bigger shop?

katowulf commented 7 years ago

The goal is simply not to load everything into memory at once. If your server runs all the time, it's safe to assume that you've already processed everything older than a day (or a week; whatever makes sense), so you can filter on a timestamp and only load records newer than minusOneDay or similar.

Some assembly required here, since I don't know your use case and goals, but you should be able to come up with a sensible query that doesn't load all the data.

kleeb commented 7 years ago

Let's say I set up refBuilder to cache records added in the last 24 hours. What about the older ones? Is there a way to set up some kind of scheduler or rule? Or will they be indexed anyway?

Am I right that refBuilder is used only at the startup?

My case is that I have around 1 million online catalogue items. More or less all need to be added and indexed. The project startup is not a problem, as I can add them periodically, but the whole solution will crash when the server needs a restart for any reason.

kleeb commented 7 years ago

Isn't there a bug in the refBuilder docs? If we want to never look back more than a day during a server restart, shouldn't the condition be return ref.orderByChild('timestamp').startAt(Date.now() - 86400000); ?
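A minimal sketch of how that condition might sit inside a path entry, assuming records carry a numeric `timestamp` child in milliseconds. The `path`, `index`, and `type` values are hypothetical, and the `refBuilder(ref, path)` shape is an assumption that should be checked against the Flashlight README.

```javascript
const ONE_DAY_MS = 24 * 60 * 60 * 1000; // 86400000

// Hypothetical path entry; only the refBuilder body comes from the thread.
const catalogPath = {
  path: 'catalog',   // hypothetical Firebase path
  index: 'firebase', // hypothetical ES index name
  type: 'item',      // hypothetical ES type name
  // Assumed signature: receives the database ref for `path` and returns a query.
  refBuilder: function (ref, path) {
    // Only load records whose timestamp falls within the last 24 hours.
    return ref.orderByChild('timestamp').startAt(Date.now() - ONE_DAY_MS);
  }
};
```

`orderByChild` and `startAt` are standard Firebase Realtime Database query methods; the query only performs well if `timestamp` is listed under `.indexOn` in the security rules.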

katowulf commented 7 years ago

I don't know your DB schema or what you've set up in security rules (indexOn needs to be configured right) but that should work.

Keep in mind that if you are storing 1 million catalog items in a single node, clients are going to have the same issues. You can't effectively load that many records into memory anywhere; you'll always need to use a query.

If you already have a million in the path, you won't be able to index them all at once using Flashlight. You'll either need to run it a few different times with different queries, or just manually iterate/paginate and index the records by accessing ES directly. Some assembly required.
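The manual iterate/paginate approach could look roughly like the sketch below. It is kept synchronous and uses an in-memory stand-in so it can run on its own; `indexInPages`, `fetchPage`, `indexBatch`, and `makeInMemorySource` are all hypothetical names. In a real setup, `fetchPage` would presumably wrap a Firebase query such as `orderByKey().startAt(key).limitToFirst(n)` and `indexBatch` would send the records to ES, both of which are asynchronous.

```javascript
// Walk a key-ordered source one page at a time so only one page is in memory.
// fetchPage(startKey, limit) returns up to `limit` records with key >= startKey
// (or from the beginning when startKey is null), sorted by key.
function indexInPages(fetchPage, indexBatch, pageSize) {
  let startKey = null;
  let total = 0;
  for (;;) {
    // Fetch one extra record; its key becomes the cursor for the next page.
    const page = fetchPage(startKey, pageSize + 1);
    const batch = page.slice(0, pageSize);
    if (batch.length > 0) {
      indexBatch(batch);
      total += batch.length;
    }
    if (page.length <= pageSize) break; // no extra record, so this was the last page
    startKey = page[pageSize].key;
  }
  return total;
}

// In-memory stand-in for a key-ordered database path (for illustration only).
function makeInMemorySource(records) {
  const sorted = records.slice().sort(function (a, b) {
    return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
  });
  return function fetchPage(startKey, limit) {
    const from = startKey === null
      ? 0
      : sorted.findIndex(function (r) { return r.key >= startKey; });
    if (from === -1) return [];
    return sorted.slice(from, from + limit);
  };
}
```

Fetching `pageSize + 1` records avoids indexing the boundary record twice: the extra record is never indexed in the page that fetched it, only used as the starting key of the next page.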

kleeb commented 7 years ago

Clear enough. Thanks for all the info!