amark / gun

An open source cybersecurity protocol for syncing decentralized graph data.
https://gun.eco/docs

With large(ish) dataset, map() only returns object stubs #780

Open arthurmilliken opened 5 years ago

arthurmilliken commented 5 years ago

When mapping over ~31k keys, I am getting object stubs with no properties instead of actual objects.

Dataset (radata): radata.zip

const Gun = require('gun');
const gun = Gun();
const gdb = gun.get('gun.biblecurious.org');
const table = gdb.get('editions').get('kjv').get('verses');

let count = 0;
// THIS ONLY RETURNS OBJECT STUBS
table.once().map().once((data, key) => {
    count++;
    console.log('----------------------')
    console.log(count, key, data);
});

Output: [screenshot: Screen Shot 2019-07-22 at 5 23 33 PM]
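To count how many of the ~31k results come back as stubs, a small check like the one below could be dropped into the callback. It is only a sketch: it assumes a "stub" is an object whose only own key is GUN's `_` metadata field (the screenshot is not reproduced here to confirm the exact shape), and `isStub` is a hypothetical helper name, not part of GUN.

```javascript
// Hypothetical helper (not a GUN API): treats an object as a "stub"
// when it has no own keys other than "_", the key GUN uses for node metadata.
function isStub(data) {
  if (data === null || typeof data !== 'object') return false;
  return Object.keys(data).every((key) => key === '_');
}

// Made-up sample values for illustration only.
const stub = { _: { '#': 'example-soul' } };
const full = { _: { '#': 'example-soul' }, text: 'example verse text' };

console.log(isStub(stub)); // true
console.log(isStub(full)); // false
```

Inside the `.once((data, key) => { ... })` callback above, incrementing a separate counter whenever `isStub(data)` is true would show whether every key is affected or only some of them.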

amark commented 5 years ago

@arthurmilliken thanks, great meeting you at dWeb camp!

Thanks for helping find these issues, I'll be trying to process them over the next few weeks.

amark commented 4 years ago

@arthurmilliken hmm, in some of my testing I've noticed that it can take Heroku ~3 seconds to reply with "data not found", and GUN will often "check if this exists before writing". If you are importing thousands of records, that could explain one of the slowdowns that I was unaware of.

Do things go any faster when you import data without Heroku as a peer? (Yes, I know you want Heroku as a peer; I'm just trying to debug things too.)

arthurmilliken commented 4 years ago

@amark When I run the same import without a remote peer, write speed is about the same: roughly 2.0 seconds per batch of 100 "rows" with ~10k "rows" in the db, and it degrades as more rows are inserted.

It's worth noting, however, that my Heroku peer is not actually getting any data (the S3 bucket is still empty). Looks like I need to explicitly install AWS into my Heroku app:

2019-07-29T20:59:26.907908+00:00 app[web.1]: ReferenceError: AWS_SDK_NOT_INSTALLED is not defined
2019-07-29T20:59:26.907910+00:00 app[web.1]: at Object.next (/app/lib/rs3.js:18:3)
2019-07-29T20:59:26.907912+00:00 app[web.1]: at Object.next (/app/lib/store.js:5:13)
2019-07-29T20:59:26.907914+00:00 app[web.1]: at Function.onto [as on] (/app/gun.js:198:41)
2019-07-29T20:59:26.907916+00:00 app[web.1]: at Function.Gun.create (/app/gun.js:687:10)
2019-07-29T20:59:26.907918+00:00 app[web.1]: at new Gun (/app/gun.js:655:15)
2019-07-29T20:59:26.907920+00:00 app[web.1]: at Gun (/app/gun.js:654:39)
2019-07-29T20:59:26.907922+00:00 app[web.1]: at /app/examples/http.js:19:12
2019-07-29T20:59:26.907924+00:00 app[web.1]: at Object.<anonymous> (/app/examples/http.js:23:2)
2019-07-29T20:59:26.907927+00:00 app[web.1]: at Module._compile (internal/modules/cjs/loader.js:776:30)
2019-07-29T20:59:26.907929+00:00 app[web.1]: at Object.Module._extensions..js (internal/modules/cjs/loader.js:787:10)
2019-07-29T20:59:27.014848+00:00 app[web.1]: Hello wonderful person! :) Thanks for using GUN, feel free to ask for help on https://gitter.im/amark/gun and ask StackOverflow questions tagged with 'gun'!
2019-07-29T20:59:27.104942+00:00 app[web.1]: AXE enabled.
2019-07-29T20:59:27.111093+00:00 app[web.1]: aws-sdk is no longer included by default, you must add it to your package.json! `npm install aws-sdk`.
2019-07-29T20:59:27.113679+00:00 app[web.1]: /app/lib/rs3.js:18
2019-07-29T20:59:27.113683+00:00 app[web.1]: AWS_SDK_NOT_INSTALLED;
2019-07-29T20:59:27.113685+00:00 app[web.1]: ^

amark commented 4 years ago

@arthurmilliken oh shoot, you're right, I removed AWS and put it in devDependencies

https://gun.eco/docs/Using-Amazon-S3-for-Storage

^ You need to log into Heroku and set the environment variable NPM_CONFIG_PRODUCTION to false in the web dashboard. This tells NPM/Heroku to keep devDependencies around (and therefore AWS-SDK).

I'll want to look into the speed degrading too, that isn't good. My target (which previously was working) is to sustain about 2K to 3K writes/second on a MacBook Air.
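For scale, the two figures quoted in this thread can be put side by side. The arithmetic below is a rough sketch that assumes one "row" corresponds to one write, which may not hold if each row is stored as several GUN nodes.

```javascript
// Figures quoted in the thread: ~2.0 s per batch of 100 "rows",
// versus a stated target of 2K to 3K writes/second.
const rowsPerBatch = 100;
const secondsPerBatch = 2.0;
const targetLow = 2000;
const targetHigh = 3000;

// Assumption: 1 row == 1 write (may undercount the real number of writes).
const observedRowsPerSecond = rowsPerBatch / secondsPerBatch;
const gapLow = targetLow / observedRowsPerSecond;
const gapHigh = targetHigh / observedRowsPerSecond;

console.log(observedRowsPerSecond); // 50
console.log(gapLow, gapHigh);       // 40 60
```

Under that assumption the reported import runs at about 50 rows/second, roughly 40x to 60x below the target.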

arthurmilliken commented 4 years ago

I already have NPM_CONFIG_PRODUCTION set to false

arthurmilliken commented 4 years ago

Heroku stack trace after forking repo and connecting to GitHub:

2019-07-29T21:30:43.131587+00:00 app[web.1]: 
2019-07-29T21:30:43.131603+00:00 app[web.1]: <--- Last few GCs --->
2019-07-29T21:30:43.131605+00:00 app[web.1]: 
2019-07-29T21:30:43.131608+00:00 app[web.1]: [18:0x35bc860]   175744 ms: Mark-sweep 242.5 (257.7) -> 242.0 (257.9) MB, 770.2 / 0.0 ms  (average mu = 0.093, current mu = 0.004) allocation failure scavenge might not succeed
2019-07-29T21:30:43.131610+00:00 app[web.1]: [18:0x35bc860]   176920 ms: Mark-sweep 242.8 (257.9) -> 242.2 (258.2) MB, 1173.0 / 0.0 ms  (average mu = 0.039, current mu = 0.002) allocation failure scavenge might not succeed
2019-07-29T21:30:43.131612+00:00 app[web.1]: 
2019-07-29T21:30:43.131614+00:00 app[web.1]: 
2019-07-29T21:30:43.131616+00:00 app[web.1]: <--- JS stacktrace --->
2019-07-29T21:30:43.131618+00:00 app[web.1]: 
2019-07-29T21:30:43.131621+00:00 app[web.1]: ==== JS stack trace =========================================
2019-07-29T21:30:43.131623+00:00 app[web.1]: 
2019-07-29T21:30:43.131626+00:00 app[web.1]: 0: ExitFrame [pc: 0x1318979]
2019-07-29T21:30:43.131628+00:00 app[web.1]: 1: StubFrame [pc: 0x13199e5]
2019-07-29T21:30:43.131630+00:00 app[web.1]: Security context: 0x3652ec340911 <JSObject>
2019-07-29T21:30:43.131634+00:00 app[web.1]: 2: /* anonymous */ [0x32898123abf1] [/app/lib/radisk.js:~157] [pc=0x3928aeff569c](this=0x3f08e5aec309 <JSGlobal Object>,0x251c8e84f051 <String[22]\: \x1b"en\x1b>\x1b+1563836278272\x1b>,0x1211dd64af29 <String[29]\: jyezqrdpKdbHid1Zp2wz\x1blanguage>,0x14ce35882f99 <String[#8]: language>,0x1211dd64ab01 <JSArray[7]>)
2019-07-29T21:30:43.131636+00:00 app[web.1]: 3: map [0...
2019-07-29T21:30:43.131638+00:00 app[web.1]: 
2019-07-29T21:30:43.131650+00:00 app[web.1]: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
2019-07-29T21:30:43.131717+00:00 app[web.1]: 
2019-07-29T21:30:43.142107+00:00 app[web.1]: Writing Node.js report to file: report.20190729.213043.18.0.001.json
2019-07-29T21:30:43.142159+00:00 app[web.1]: Node.js report completed
2019-07-29T21:30:43.142814+00:00 app[web.1]: 1: 0x9afed0 node::Abort() [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.143348+00:00 app[web.1]: 2: 0x9b1066 node::OnFatalError(char const*, char const*) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.143912+00:00 app[web.1]: 3: 0xb09f1e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.144472+00:00 app[web.1]: 4: 0xb0a299 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.145089+00:00 app[web.1]: 5: 0xce71a5  [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.145708+00:00 app[web.1]: 6: 0xcf29eb v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.146332+00:00 app[web.1]: 7: 0xcf3707 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.146977+00:00 app[web.1]: 8: 0xcf6238 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationType, v8::internal::AllocationAlignment) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.147592+00:00 app[web.1]: 9: 0xcbfb47 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.148304+00:00 app[web.1]: 10: 0xf96708 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.149082+00:00 app[web.1]: 11: 0x1318979  [/app/.heroku/node/bin/node]
2019-07-29T21:30:43.316409+00:00 app[web.1]: Hello wonderful person! :) Thanks for using GUN, feel free to ask for help on https://gitter.im/amark/gun and ask StackOverflow questions tagged with 'gun'!
2019-07-29T21:30:43.429114+00:00 app[web.1]: AXE enabled.
2019-07-29T21:30:43.738186+00:00 app[web.1]: Relay peer started on port 6708 with /gun
2019-07-29T21:30:43.748146+00:00 app[web.1]: Multicast on 233.255.255.255:8765
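The GC lines in the trace above show the heap sitting at about 242 MB with mark-sweep reclaiming almost nothing before the process aborts. To see how close the process runs to its limit before it crashes, heap usage could be logged periodically. The sketch below uses Node's built-in `process.memoryUsage()` and `v8.getHeapStatistics()`; `heapReport` is a hypothetical helper name, and the 10-second interval in the usage note is an arbitrary choice.

```javascript
const v8 = require('v8');

// Hypothetical helper: reports current heap usage and V8's configured
// heap size limit, both in megabytes rounded to one decimal place.
function heapReport() {
  const toMB = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return {
    usedMB: toMB(process.memoryUsage().heapUsed),
    limitMB: toMB(v8.getHeapStatistics().heap_size_limit),
  };
}

console.log(heapReport());
```

Calling it on a timer, for example `setInterval(() => console.log(heapReport()), 10000).unref()`, would show whether usage climbs steadily during the import or jumps at a particular step such as the radisk code named in the stack trace.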