richfromm / slack2discord

A Discord client that imports Slack-exported JSON chat history to Discord channel(s).
GNU General Public License v3.0

parser is missing messages posted by bots #13

Closed richfromm closed 2 years ago

richfromm commented 2 years ago

We are searching for messages (see SlackParser.parse_file() in parser.py) with:

if 'user_profile' in message and 'ts' in message and 'text' in message:

then getting the name based on:

                    real_name = message['user_profile']['real_name']

The problem is that bots don't have user profiles.

A bot has a bot_id, but that's probably irrelevant.

Note that all messages have the following:

This comes from https://slack.com/help/articles/220556107-How-to-read-Slack-data-exports#how-to-read-messages

So what we probably ought to do at the highest level is look for messages just based on type being set to message.
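A minimal sketch of that top-level check, in case it helps the discussion. The names is_message and select_messages are hypothetical and are not in parser.py, and it assumes each exported message is a plain dict loaded from the JSON file:

```python
def is_message(entry: dict) -> bool:
    """Return True for any export entry whose type is "message".

    Hypothetical helper: it deliberately does not require 'user_profile',
    so messages posted by bots are no longer skipped.
    """
    return entry.get("type") == "message"


def select_messages(entries: list) -> list:
    """Keep only the entries that are messages, preserving their order."""
    return [entry for entry in entries if is_message(entry)]
```

Whether to keep the existing checks for 'ts' and 'text' alongside this is a separate decision; the sketch only covers the type test.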

If there is a user_profile, that's great, we can use it, although in retrospect, it might be better for us to post the messages to Discord using display_name rather than real_name (both of these are within the user_profile).

But back to the bug... Note that we can also get the info for real users from users.json (in the export), but that only includes real users, not bots, so it doesn't really help us.

Normal user IDs seem to be of the form Uxxxxxxxx, where the x's are alphanumeric characters.

The only bot example I have is for user USLACKBOT. My proposal is that if there is no user_profile, then look to user. Strip a leading U off of the user if present, and use the rest of the string. If it happens to not start with a U, then just use the entire string.

And maybe as a final fallback, if there is no user (even though the docs claim there always is), log a warning, and use the string ???
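Roughly, the fallback for messages without a user_profile could look like the sketch below. The function name name_from_user_id and the constant UNKNOWN_NAME are made up for illustration, and the warning text is only a placeholder:

```python
import logging

logger = logging.getLogger(__name__)

# Placeholder name used when a message has no 'user' field at all.
UNKNOWN_NAME = "???"


def name_from_user_id(message: dict) -> str:
    """Derive a display name from message['user'] when there is no user_profile.

    Hypothetical helper: strips one leading 'U' if present (USLACKBOT becomes
    SLACKBOT), otherwise returns the ID unchanged.
    """
    user = message.get("user")
    if not user:
        # The Slack docs say 'user' is always present, so this is unexpected.
        logger.warning("message has no user field (ts=%s)", message.get("ts"))
        return UNKNOWN_NAME
    if user.startswith("U"):
        return user[1:]
    return user
```

One side effect worth noting: a regular user ID that ends up here (no user_profile) would be shown as its ID minus the leading U, which is not very readable, but it is still better than dropping the message.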

I have no clue whether the slackbot is special, and whether other user-defined bots actually do appear in users.json. I would need more data to tell; the Slack documentation isn't sufficient.

richfromm commented 2 years ago

display_name might be present but set to the empty string. I think display_name should take priority, and if it is missing or empty, fall back to real_name. There's also name (I'm not sure precisely what its significance is), but I suppose that could be another fallback. These are all within user_profile.

Then user is just the final fallback if there is no user_profile.
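As a sketch of that precedence within user_profile (name_from_profile is a hypothetical name, and treating an empty string the same as a missing key is my assumption based on the empty display_name case):

```python
from typing import Optional


def name_from_profile(user_profile: dict) -> Optional[str]:
    """Pick the first non-empty name from a Slack user_profile dict.

    Hypothetical helper. Order of preference: display_name, real_name, name.
    Returns None if none of them is set, so the caller can fall back to user.
    """
    for key in ("display_name", "real_name", "name"):
        value = user_profile.get(key)
        if value:  # skips both missing keys and empty strings
            return value
    return None
```

For example, a profile of {"display_name": "", "real_name": "Rich Fromm"} would yield "Rich Fromm", because the empty display_name is skipped.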