Closed nichoth closed 2 years ago
Ooooh, I just realized there's a chance you're running into the kind of issue that ssb-deweird addresses. Do you happen to have a thread or "worker" process, or something like that, in between the HTML rendering and the ssb-threads query?
Technically, `take(10)` is the correct solution, and I don't know why it would take too much memory. It would require profiling and debugging to figure out. I would much rather fix the root cause of this performance issue than find another way to query the API.
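For context on why `take(10)` is "technically the correct solution": pull-streams are demand-driven, so a well-behaved source does no work for items past the limit. Below is a minimal sketch of that contract (hypothetical helpers written for illustration, not the real `pull-stream` package):

```javascript
// Minimal pull-stream-style sketch (NOT the real `pull-stream` package),
// showing why take(10) should bound the work: values are pulled one at a
// time, and the source is aborted once the limit is reached.

// A source of `total` values that counts how many were actually read.
function countingSource (total, stats) {
  let i = 0
  return function (abort, cb) {
    if (abort) return cb(abort)
    if (i >= total) return cb(true) // `true` signals end-of-stream
    stats.reads++
    cb(null, i++)
  }
}

// take(n): passes through at most n values, then aborts the source.
function take (n) {
  return function (read) {
    let left = n
    return function (abort, cb) {
      if (abort) return read(abort, cb)
      if (left-- <= 0) return read(true, () => cb(true))
      read(null, cb)
    }
  }
}

// collect: drains a source into an array, then calls back once.
function collect (read, cb) {
  const out = []
  ;(function next () {
    read(null, function (end, value) {
      if (end) return cb(end === true ? null : end, out)
      out.push(value)
      next()
    })
  })()
}

let result
const stats = { reads: 0 }
collect(take(10)(countingSource(1e6, stats)), function (err, values) {
  result = values
})
console.log(result.length, stats.reads) // 10 10
```

If memory still blows up with `take(10)` in place, that would suggest the source is doing up-front work proportional to the whole log rather than per pulled item — which is what the profiling below is meant to confirm or rule out.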
Thanks @staltz for the tip about ssb-deweird. That will be the next thing I look into. There is no separate process or anything, and no HTML rendering either; this is all happening in Node, just returning results via an HTTP API.
update
I tried using deweird/producer, but I'm still getting the out-of-memory error -- https://github.com/planetary-social/planetary-pub/blob/deweird/pub.js#L148

This is using an sbot that is in the same process as the 'consumer', so I think the weirdness is not an issue here.
also
I was using it like this, and it was working:

```js
var source = sbot.threads.profile({
  id: userId,
})
```

But when I changed it to

```js
var source = sbot.threads.profile({
  id: userId,
  allowlist: ['post']
})
```

then it crashes / runs out of memory.
Yeah, just installing deweird/producer will have zero effect on it.
Can you try to use a profiler to get more information on the memory leak and the performance problems? For example, start Node.js with `--inspect-brk` and use the Chrome devtools profiler.
@staltz I did get a debugger running with the program via `--inspect-brk` and then opening `chrome://inspect/#devices`. However, I'm not sure how best to communicate the information. It is displaying a function `mergeFilters` in `jitdb/index.js`. Sorry, I have never debugged this way -- async and remote -- before.
This is the repo in question -- https://github.com/planetary-social/planetary-pub/blob/out-of-mem/viewer/index.js#L84
Out of curiosity (and because this could be the cause of the OOM), what is the size of the log? I mean the db2/log.bipf file.
816 MB:

```
total 1670272
drwxr-xr-x  25 nick  staff   800B Feb  7 16:10 indexes
-rw-r--r--   1 nick  staff   816M Feb  7 16:09 log.bipf
```
Is this running on a VPN? What is the RAM capacity of the machine?
more clues
This is not a VPN. This is just on my local laptop machine.
I had been starting this with a memory limit of 512 MB, like so: `NODE_ENV=staging-local node --max-old-space-size=512 index.js`. That would crash.

Then I tried starting it without a limit on memory: `NODE_ENV=staging-local node index.js`. My machine has 8 GB of memory. These results are more complicated: the first request I made to the endpoint would take a long time, but eventually it would return successfully. Then, if I make a second request to that endpoint, it runs out of memory and crashes.
This is the endpoint:

```js
fastify.get('/feed-by-id/:userId', (req, res) => {
  var { userId } = req.params

  var source = sbot.threads.profile({
    id: userId,
    allowlist: ['post'],
    threadMaxSize: 3  // at most 3 messages in each thread
  })

  S(
    source,
    S.take(10),
    S.map(thread => {
      // if it's a thread, return the thread;
      // if not a thread, return a single message (not an array)
      return thread.messages.length > 1
        ? thread.messages
        : thread.messages[0]
    }),
    S.collect(function (err, threads) {
      if (err) return console.log('err', err)
      res.send(threads)
    })
  )
})
```
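As an aside, the `S.map` step in the handler above is a pure function and can be pulled out and unit-tested in isolation (the `flattenThread` name is mine, not from the codebase):

```javascript
// Hypothetical helper extracted from the handler's S.map step:
// a multi-message thread stays an array; a single-message "thread"
// collapses to the bare message.
function flattenThread (thread) {
  return thread.messages.length > 1
    ? thread.messages
    : thread.messages[0]
}

console.log(flattenThread({ messages: ['a', 'b'] })) // [ 'a', 'b' ]
console.log(flattenThread({ messages: ['a'] }))      // 'a'
```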
Is there a way to paginate this query so that it only returns something like the first 10 threads / thread summaries? Intuitively I would do something like the above, but when doing this it seems to use too much memory and takes a long time to return.

This conversation can keep on going, but there's nothing actionable for a maintainer to do, so I'll close.
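On the pagination question: whether this can be done efficiently depends on what bounds the ssb-threads/jitdb API supports, which this thread doesn't settle. The general shape, though, is cursor pagination, where each page's query is bounded by the last item of the previous page so no page scans more than `limit` items. A generic sketch with an in-memory stand-in for the database (all names hypothetical):

```javascript
// Generic cursor-pagination sketch (names hypothetical, not the
// ssb-threads API). 25 fake threads, newest first by timestamp.
const DB = Array.from({ length: 25 }, (_, i) => ({ id: i, ts: 1000 - i }))

// Stand-in for a bounded query: "threads older than `cursor`,
// newest first, at most `limit` of them".
function queryPage (cursor, limit) {
  return DB.filter(t => t.ts < cursor).slice(0, limit)
}

// Each page returns its items plus the cursor for the next page.
function page (cursor = Infinity, limit = 10) {
  const items = queryPage(cursor, limit)
  const next = items.length ? items[items.length - 1].ts : null
  return { items, next }
}

const p1 = page()        // first 10 threads
const p2 = page(p1.next) // next 10, bounded by the last seen timestamp
console.log(p1.items.length, p2.items.length) // 10 10
console.log(p2.items[0].id) // 10
```

The key property is that the cursor (here a timestamp) is something the underlying index can seek to directly, so page N costs the same as page 1.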