Open geeknik opened 9 years ago
This sometimes happens with larger databases. You can reduce ai.ngram.depth
to be less than 8.
With larger databases, 7, 6, or even 5 should not make too much (if any) difference in the quality of the answers.
I dropped it from 8 to 7 and it appears that the 'call stack' issue (in bfs.js) has gone away while I have ai.ngram.length set to 4. If I change ai.ngram.length to 3, the call stack issue in bfs.js returns. I tried setting ai.ngram.depth to 6 and 5, but I still see the call stack issue referencing line 18 in bfs.js.
Just to make sure it wasn't something on my end, I ran ulimit -s unlimited (as the stack was limited to 8192), but setting it to unlimited didn't make a difference.
What if you set depth to 5? That's 5 words in a row between keywords and start/end, and there should be at least 2 keywords per sentence, so the cap on total reply size is at least 15 words (20 for 3 keywords, 25 for 4, etc.) - which is completely okay...
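The estimate above can be reproduced with a few lines of JavaScript. This is only a sketch of the arithmetic in the comment, assuming a reply consists of (keywords + 1) gaps of at most `depth` words each; the function name `replyWordBudget` is hypothetical and not part of triplie.

```javascript
// Hypothetical helper, not triplie code: reproduces the estimate from the thread.
// A reply with k keywords has k + 1 gaps (start -> kw1, kw1 -> kw2, ..., kwk -> end),
// and each gap is assumed to hold at most `depth` words.
function replyWordBudget(depth, keywords) {
  const gaps = keywords + 1;
  return depth * gaps;
}

console.log(replyWordBudget(5, 2)); // 15
console.log(replyWordBudget(5, 3)); // 20
console.log(replyWordBudget(5, 4)); // 25
```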
ngram length of 4 is definitely recommended once the db reaches a large enough size - it will improve the fluency of the replies significantly
Ideally most of these parameters would automatically tweak themselves but I haven't had time to think about how that should be done...
I have ai.ngram.length set to 4 and ai.ngram.depth set to 5 and I haven't seen the error. Not sure if setting the ngram length to 4 is correct yet ({"words":{"counter":12382},"associations":{"counter":180188},"ngrams":{"counter":83633}}).
Seems a bit early for length 4, but it shouldn't be a problem. It would be weird for length=3, depth=5 to overflow the stack though, as the number of explored nodes grows exponentially with depth.
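To make the depth argument concrete, here is a rough sketch of how the node count of a breadth-first search scales. It assumes a uniform branching factor, which a real n-gram graph will not have; `exploredNodes` and the branching factor of 10 are illustrative assumptions, not values taken from triplie.

```javascript
// Hypothetical illustration: upper bound on nodes visited by a BFS of the
// given depth when every node has `branching` successors (1 + b + b^2 + ... + b^d).
function exploredNodes(branching, depth) {
  let total = 0;
  let level = 1; // number of nodes at the current depth
  for (let d = 0; d <= depth; d++) {
    total += level;
    level *= branching;
  }
  return total;
}

console.log(exploredNodes(10, 5)); // 111111
console.log(exploredNodes(10, 8)); // 111111111
```

Under these assumptions, dropping the depth from 8 to 5 cuts the search space by about three orders of magnitude, which is why a stack overflow at depth 5 would be surprising.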
So I've fed my triplie install quite a bit more data:
{"words":{"counter":439415},"associations":{"counter":7879231},"ngrams":{"counter":5429584}}
And now I'm seeing this in the logs every time the bot tries to formulate a response:
creativity expansion complete
/usr/lib/node_modules/triplie/node_modules/async/lib/async.js:24
fn.apply(root, arguments);
^
RangeError: Maximum call stack size exceeded
Child exit with status code 8 , reloading
I set ai.ngram.length to 3 and ai.ngram.depth to 5 and I get this error when the bot tries to reply:
/usr/lib/node_modules/triplie/node_modules/async/lib/async.js:0
(function (exports, require, module, __filename, __dirname) { /*global setImme
^
RangeError: Maximum call stack size exceeded
Child exit with status code 8 , reloading
The database file as it resides on disk is about 790MB (and growing). The bot has been moved to a new server (6 core Xeon, 24GB RAM), but with this huge data set, the bot just doesn't appear able to cope.
Hrm, that's a lot more data, so it's not as unexpected. To fix this problem I'll probably have to rework the bot to use bluebird promises instead. Unfortunately I don't have the time for that at the moment.
Btw, at that database size ngram length of 4 is definitely the way to go though.
Another easy fix would be to try increasing the stack size when running the bot by adding '--stack_size', '32768', at https://github.com/spion/triplie-ng/blob/master/bin/bot.js#L114. That should give it a 32MB stack (about 33 times the default), which is not too much on a beefy machine.
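For reference, this is roughly what passing the flag to a child Node process looks like. It is a sketch and not the code in bin/bot.js: the helper names are hypothetical, and it uses the `--stack_size=<KB>` form of the V8 flag rather than two separate array entries.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: builds the argument list for a child Node process.
// --stack_size is a V8 flag whose value is given in kilobytes.
function nodeArgsWithStack(stackKb, scriptPath, scriptArgs = []) {
  return [`--stack_size=${stackKb}`, scriptPath, ...scriptArgs];
}

// Hypothetical helper: starts the child with the larger stack.
// Example: spawnWithStack(32768, '/path/to/child.js')
function spawnWithStack(stackKb, scriptPath, scriptArgs = []) {
  const args = nodeArgsWithStack(stackKb, scriptPath, scriptArgs);
  return spawn(process.execPath, args, { stdio: 'inherit' });
}
```

One thing to watch: V8 does not allocate this stack itself, so if the value is above the operating system's limit (the 8192 KB from `ulimit -s` mentioned earlier in the thread), the process may crash instead of throwing a RangeError. Raising the ulimit along with the flag should avoid that.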
Increasing the stack size helps and I can get the bot to reply, but it's so painfully slow (even with the database on a ramdisk) that it's not very useful as a conversationalist. It is still useful for me to mess around with though. Thanks for the help.
No problem. Though with a database of that size, ngram lengths of 4 (or even 5) are the only ones that make sense, and should be fast enough
I'm hoping to one day rewrite triplie with promises - that should improve performance and get rid of issues like the stack being blown
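As an illustration of why promises help here, below is a minimal depth-limited breadth-first search written with native promises and async/await. It is not triplie's code and does not follow the async-bfs API; it uses native promises rather than bluebird, and `moves`, `isGoal` and `bfs` are hypothetical names. Each `await` resumes on a fresh stack, so the search depth no longer translates into call stack depth.

```javascript
// Sketch only: depth-limited BFS that keeps the frontier in an array
// instead of recursing through nested callbacks.
// `moves(node)` is a hypothetical async function returning neighbouring nodes.
async function bfs(start, moves, isGoal, maxDepth) {
  let frontier = [start];
  const seen = new Set([start]);
  for (let depth = 0; depth <= maxDepth; depth++) {
    const next = [];
    for (const node of frontier) {
      if (isGoal(node)) return { node, depth };
      if (depth === maxDepth) continue; // do not expand past the limit
      for (const neighbour of await moves(node)) {
        if (!seen.has(neighbour)) {
          seen.add(neighbour);
          next.push(neighbour);
        }
      }
    }
    if (next.length === 0) break; // nothing left to explore
    frontier = next;
  }
  return null; // goal not reachable within maxDepth
}
```

For example, `bfs(0, async (n) => [n + 1], (n) => n === 200000, 200000)` walks a chain of 200,000 nodes without hitting "Maximum call stack size exceeded", whereas synchronously nested callbacks of that depth would.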
This happens sometimes when the bot is trying to form a response, usually right after I see "creativity expansion complete".
/usr/lib/node_modules/triplie/node_modules/async-bfs/lib/bfs.js:18
moves(depth, node, function(err, newNodes) {
^
RangeError: Maximum call stack size exceeded
Child exit with status code 8 , reloading
child: connecting to ipc channel { path: '/tmp/triplie-24397.sock' }
parent: child process connected
This one happens a bit less often:
/usr/lib/node_modules/triplie/node_modules/async/lib/async.js:21
return function() {
^
RangeError: Maximum call stack size exceeded
Child exit with status code 8 , reloading
child: connecting to ipc channel { path: '/tmp/triplie-24397.sock' }
parent: child process connected
Any ideas on how I can fix these?