Closed by shaunagm 9 years ago
Two potential rules:
1) Any nick with a pipe ("|") that matches the first half - for instance "shauna" and "shauna|away" or "shauna|work" should match.
2) Anything that ends in numbers but matches letters preceding - for instance "shauna" and "shauna1", "shauna2".
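As a rough illustration of the two rules above, here is a minimal sketch. The function name `matches_known_nick` is hypothetical and is not taken from the bot's code; it only shows how a pipe suffix and trailing digits could be ignored when comparing nicks.

```python
import re


def matches_known_nick(candidate, known):
    """Return True if `candidate` looks like a variant of `known`.

    Hypothetical helper for illustration only; not part of WelcomeBot.
    """
    # Rule 1: ignore everything from the first pipe onward
    # ("shauna|away" -> "shauna").
    base = candidate.split("|", 1)[0]
    # Rule 2: ignore trailing digits ("shauna2" -> "shauna").
    base = re.sub(r"\d+$", "", base)
    return base == known
```

With this sketch, "shauna|away", "shauna|work", "shauna1" and "shauna2" would all be treated as matching "shauna".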
This pull request partially addresses this issue (although there are probably more rules we can create):
https://github.com/shaunagm/WelcomeBot/pull/29
Currently waiting on the PR submitter to see if they want to redo it to fit with the new format of the bot.
Going to re-implement this tomorrow. Any idea which rules we'd like to apply? I'm having trouble getting the correct information to display here (markdown doesn't like angle brackets?), but the rules I applied last time were: a known name followed by either some set of numbers, or a pipe and then any arbitrary text.
We should probably also get rid of trailing underscores. And, if we don't already, make sure capitalization is disregarded.
Is the general idea that, if a new user joins, and matches an already-met user after removing underscores, numbers, etc., we ignore them? Do we also want to greet them with a name stripped of those identifiers?
The idea is that we ignore them. WelcomeBot is only supposed to greet newcomers. In some hypothetical future we could implement a function where, if you haven't been there in a while, WelcomeBot says "Welcome back!" but I don't think that makes sense to worry about right now, since we don't currently store datetimes of when people are first greeted.
Added a basic function: https://github.com/aaparella/WelcomeBot/blob/recognize-unregistered-nicks/bot.py#L156
The idea is it would be called instead of actor.replace("_", "") in the check for whether or not a user is a newcomer. I'm sort of torn on the order in which to check for the various delimiters, and am even more torn on what to name the function (though, of course, that's not as important).
Is this the sort of thing you had in mind?
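For readers who can't open the linked branch, the idea of swapping `actor.replace("_", "")` for a single cleaning call might look roughly like the sketch below. This is not the function from the branch: the body of `clean_nick` and the helper `is_newcomer` are assumptions made for illustration. Unlike `replace`, it strips only trailing underscores. It sidesteps the question of delimiter order by stripping trailing underscores and digits together.

```python
def clean_nick(actor):
    """Reduce a nick to a base form for comparison.

    Illustrative stand-in, not the implementation from the linked branch.
    """
    # Disregard capitalization, then drop everything after the first pipe.
    nick = actor.lower().split("|", 1)[0]
    # Strip any mix of trailing underscores and digits in one pass,
    # so "shauna_1" and "shauna1_" both reduce to "shauna".
    stripped = nick.rstrip("_0123456789")
    # Fall back to the unstripped form if nothing is left.
    return stripped or nick


def is_newcomer(actor, known_nicks):
    """Hypothetical check: a nick is new if its cleaned form is unknown."""
    return clean_nick(actor) not in known_nicks
```

Here `known_nicks` is assumed to be a collection of already-cleaned nicks, so a user joining as "Shauna__" or "shauna|work" would not be greeted again if "shauna" is already recorded.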
I've merged this PR but I'm leaving this issue open, as I'm sure there are more rules we could implement.
@aaparella I must have messed up on the merge, because your changes never made it through. I re-added them manually, with a comment crediting you on the commit: https://github.com/shaunagm/WelcomeBot/commit/06077c3e6350baeea02d34fc586b5446f6b5cc4f
Looks like there's a bug in this - the stripped names should be used for record keeping purposes, not to communicate with the person. Perhaps we need separate variables for actor-record and actor-nick?
Ah, I think I may have misunderstood your intention. I'll get a fix soon.
The reason seems to be the use of the clean_nicks function in parse_message. Should we be removing added identifiers at that stage? Without that I believe it functions as intended (greeting goes to current user name, not cleaned version).
Agreed - the question is where we do want to use clean_nicks/remove identifiers. Alternatively, we could do it early on but store both a cleaned and a non-stripped version and reference each as needed.
In every comparison I can think of, we would want to use the "cleaned" nick (for checking whether they are new, etc.), and we would only want their "full" name when greeting them. I think it would make sense to simply create a cleaned version of the nick in the NewComer constructor and use each as needed.
That is, unless I'm missing a case in which we would want to use the full nick?
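A minimal sketch of that plan, assuming a much-simplified `NewComer` class: the attribute names `nick` and `clean_nick`, the `_clean` helper, and `greeting_for` are all hypothetical, and the real class may carry other state not shown here.

```python
def _clean(nick):
    # Hypothetical cleaning rule: lowercase, drop any pipe suffix,
    # then strip trailing underscores and digits.
    base = nick.lower().split("|", 1)[0]
    return base.rstrip("_0123456789") or base


class NewComer:
    """Simplified stand-in that stores both forms of the nick."""

    def __init__(self, nick):
        self.nick = nick                # full nick, used only when greeting
        self.clean_nick = _clean(nick)  # cleaned nick, used for comparisons


def greeting_for(newcomer):
    # The greeting addresses the user by the nick they actually joined with.
    return "Welcome, {}!".format(newcomer.nick)
```

Record keeping and the "have we met this person" check would then read `clean_nick`, while anything sent to the channel would read `nick`.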
I think that the greeting is the only point at which we'd want the 'full' name, though I may be missing something. I like your plan.
Would you like to work on the fix for this issue?
I'd be happy to. Should be able to get it done tonight.
Pull request made (#48)
Merged! Thanks so much @aaparella. I think we've hit all the common ways that nicks get altered, so I'm going to close this, but anyone reading this should feel free to re-open if they find a new pattern we should account for.
When users who have registered their nicknames join IRC without identifying, different IRC clients will change their nickname in different ways.
For instance, the default behavior in quassel is to add a trailing underscore (or two, or three) if you haven't identified to your nick. The bot currently catches this (https://github.com/shaunagm/oh-irc-bot/blob/master/bot.py#L113) but if we can determine the rules for other popular clients we can catch those as well.
To improve the bot, please:
1) Research an IRC client (ideally starting with the most popular ones) and determine its default behavior for un-identified nicks, AND/OR look through the stored nicks for patterns.
2) Add to bot.py, in the function clean_nick(), appropriate rules to deal with that behavior.
Update: The bot now deals with trailing underscores, trailing numbers, and things with a pipe. Is there anything we're missing? The main one I see when browsing through the nicks is something like: name[Mobile]
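One possible rule for the name[Mobile] pattern is sketched below. The helper name `strip_bracket_suffix` is hypothetical and nothing like it is confirmed to exist in bot.py. Since square brackets are legal characters in IRC nicks, the sketch removes only a single bracketed group at the very end and leaves the nick alone if nothing would remain.

```python
import re

# Matches one bracketed group at the end of the nick, e.g. "[Mobile]".
_BRACKET_SUFFIX = re.compile(r"\[[^\[\]]*\]$")


def strip_bracket_suffix(nick):
    """Remove a trailing "[...]" tag from a nick (illustrative only)."""
    stripped = _BRACKET_SUFFIX.sub("", nick)
    # Keep the original if the whole nick was a bracketed group.
    return stripped or nick
```

A rule like this could run before the existing underscore, number, and pipe handling, though a nick that legitimately ends in brackets would also be shortened, so it may be worth checking against the stored nicks first.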