Trixarian / Alia

A Pyborg fork with additional features

Doesn't reconnect #3

Closed by lepusfelix 11 years ago

lepusfelix commented 11 years ago

I've found that when something happens to the connection (e.g. a bad netsplit or my router restarting), my Alia doesn't come back. It stays as an active process on the computer but doesn't seem to retry the IRC connection. This means the only way to get it back is to kill the pyborg process, which causes the Alia launch script to kick in and resurrect the bot... without anything it learned since the last save.

lepusfelix commented 11 years ago

In addition, it also seems to forget stuff while up, not just on reconnect. Here are some calls I made over the last few days.

[21:09:09] !save
[21:09:09] Dictionary saved
[21:09:12] !words
[21:09:12] I know 3081 words (106721 contexts, 34.64 per word), 18420 lines.

The following day:

[00:48:51] !words
[00:48:51] I know 2960 words (104091 contexts, 35.17 per word), 18333 lines.

I might have a nose through the .py files to see if there's any pruning (I vaguely recall there being some mention of pruning at some point, but I haven't spotted it more recently. Possibly a different bot I no longer run).

EDIT: I found an auto-purge feature, which I've temporarily commented out. Gonna see how it goes. So the additional issue is closed for the time being :)

EDIT II: I've also managed to integrate a couple more features from elsewhere: a targeted learner, so that it can learn from a specified nick, and URL reading. Feed it a URL and it reads all the text (though it doesn't strip out the HTML, so it's a good idea to use raw text pages if possible). I'll be happy to share if you want to implement them.
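For the HTML caveat above, here is a minimal sketch of pulling plain text out of a fetched page before feeding it to the learner. It is not the code from the modified bot: `TextExtractor` and `html_to_text` are hypothetical names, and it is written for Python 3's standard `html.parser` module.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def html_to_text(markup):
    """Hypothetical helper: return the readable text of an HTML string."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()
```

Calling `html_to_text(page_source)` on the downloaded markup would give the bot sentences to learn from instead of tags and script bodies.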

Trixarian commented 11 years ago

You call it purging, I call it optimizing :smile: It runs every 5 hours to keep the database relevant by only saving the highest rated Markov chains for a specific word and dropping the ones that aren't as highly rated. All this really means is that it makes the bot sound smarter when it sees words it has highly rated matches for, while avoiding weak ones. That and I'm OC about keeping my word lists optimized.
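As a rough illustration of that kind of pass, here is a sketch that keeps only the best-rated contexts per word. The data layout (a dict mapping each word to `(context, score)` pairs), the `keep` cutoff and the name `prune_contexts` are all assumptions made for the example; Alia's actual storage and rating scheme may look quite different.

```python
import heapq


def prune_contexts(contexts_by_word, keep=3):
    """Return a new mapping with only the `keep` best-scored contexts per word.

    `contexts_by_word` is assumed to map word -> list of (context, score).
    """
    pruned = {}
    for word, scored in contexts_by_word.items():
        # nlargest returns the survivors ordered from highest to lowest score.
        pruned[word] = heapq.nlargest(keep, scored, key=lambda item: item[1])
    return pruned
```

With a cutoff like this, rarely seen contexts are the first to go, which matches the word and context counts dropping between two `!words` calls.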

As for the previous thing, it's caused by the timer threads not dying properly, which has been a continuous pain in the ass for a long time now. Maybe I should just use a single timer that checks for events amongst the other three. Will still be a bitch to kill though.
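One possible shape for the single-timer idea, sketched in Python 3 with hypothetical names (`Scheduler`, `add_job`): one daemon thread polls a list of periodic jobs and waits on a `threading.Event`, so it can be told to exit without sitting out a long sleep. This is not how Alia's timers are currently written, only an illustration of the approach.

```python
import threading
import time


class Scheduler(threading.Thread):
    """One daemon thread that runs several periodic jobs and stops promptly."""

    def __init__(self, tick=1.0):
        super().__init__(daemon=True)
        self._tick = tick
        self._stop_event = threading.Event()
        self._jobs = []  # each entry: [interval_seconds, next_due, callback]

    def add_job(self, interval, callback):
        """Register a callback to run roughly every `interval` seconds."""
        self._jobs.append([interval, time.monotonic() + interval, callback])

    def run(self):
        # Event.wait returns True as soon as stop() is called, so the loop
        # exits within one tick instead of waiting for a timer to fire.
        while not self._stop_event.wait(self._tick):
            now = time.monotonic()
            for job in self._jobs:
                if now >= job[1]:
                    job[1] = now + job[0]
                    job[2]()

    def stop(self):
        """Ask the thread to exit and wait for it (call from another thread)."""
        self._stop_event.set()
        self.join()
```

Because the thread is a daemon and checks the event on every tick, it should not keep the process alive after the main thread gives up on a dead connection.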

As for improvements, send them my way because I could always use them :smiley:

lepusfelix commented 11 years ago

I'll probably uncomment the optimization at some point. At the moment, though, it's causing mischief with the bot learning things like Shakespeare, Star Wars scripts and the Bible. All the unusual words and language features are disappearing... but I'd imagine after a few months of training, they'll probably become more strongly matched and thus safe from the cull.

I'm willing to send pyborg.py and pyborg-irc.py, which is where most of the changes are. (There's an extra line or two in pyborg-irc.cfg, but those would be generated from the other two.) Having tweaked and fooled around with some of the changes myself, they'll probably need a review because I'm level 0 in Python-fu, though they appear to be working correctly. All I need is a place to share the files with you, and I can get right on it. It seems the targeted learning practically converted the bot to need to learn from one target all the time: no target, no learn. My own change there was to add the ability to learn from everyone if no target is specified, making the feature optional. But with absolutely no Python expertise, there's a good chance of that being badly implemented (it does work fine though).
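The optional-target behaviour described above boils down to a small check. This is only a sketch with a hypothetical function name (`should_learn`), not the code from the modified pyborg-irc.py; it compares nicks with a plain lowercase match, which ignores the finer points of IRC casemapping.

```python
def should_learn(nick, targets):
    """Learn from everyone when no targets are set, else only from listed nicks."""
    if not targets:
        # No target configured: fall back to the normal learn-from-all mode.
        return True
    wanted = {target.lower() for target in targets}
    return nick.lower() in wanted
```

An empty list (or `None`) keeps the old learn-from-everyone behaviour, so the targeted learner stays an opt-in feature.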

Trixarian commented 11 years ago

True, the bot technically doesn't need it. I would recommend running the database rebuild every few days though, because that seems to keep the database error-free most of the time.

I wanted to implement timed user targeting that makes the bot respond to one user (or multiple users) by bypassing the response probability for like 2 minutes. That way the bot will respond to a user that's talking to it without the person having to draw its attention first. I'm also thinking of leaving a pre-generated copy of pyborg-irc.cfg with the bot, because everybody that uses Alia ends up popping up on my server the first time they use it.
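A sketch of how that timed targeting could work, assuming a per-nick expiry time and a fallback reply probability. `AttentionTracker`, `focus` and `should_reply` are hypothetical names, and whatever event starts the window is left to the caller; the clock and random source are injectable only to make the behaviour easy to check.

```python
import random
import time


class AttentionTracker:
    """Always reply to focused nicks for `window` seconds, else reply by chance."""

    def __init__(self, window=120.0, reply_chance=0.1,
                 clock=time.monotonic, rng=random.random):
        self._window = window
        self._reply_chance = reply_chance
        self._clock = clock
        self._rng = rng
        self._expires = {}  # lowercased nick -> time the focus window ends

    def focus(self, nick):
        """Start (or restart) the focus window for this nick."""
        self._expires[nick.lower()] = self._clock() + self._window

    def should_reply(self, nick):
        key = nick.lower()
        expiry = self._expires.get(key)
        if expiry is not None:
            if self._clock() < expiry:
                return True  # inside the window: bypass the probability roll
            del self._expires[key]  # window is over, forget the nick
        return self._rng() < self._reply_chance
```

Several nicks can be focused at once, and calling `focus` again for the same nick simply extends its window.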

Better learning techniques are also cool. You could just mail me files at admin@trixarian.net if you want.