joshuaskelly / twitch-observer

Turn Twitch chatter into Python events
MIT License
26 stars 6 forks source link

Changing the way TwitchChatEvent is initialized #59

Closed senpos closed 6 years ago

senpos commented 6 years ago

Now, every time TwitchChatEvent is created - a brand-new dictionary is getting constructed, which is kind of a waste.

Also, name of the chat events type container (TwitchChatEventType) is too detailed (same about TwitchChatColor, but that should go to another issue). Is there any special reason to not just strip out the unecessary words from the class name? Let's say, it could be ChatEvents and live in different module, like types.py, which will cause pretty intuitive and clean imports from twitchobserver.types import ChatEvents. The actual names might be discussed, but you get the idea.

Returning to the initial topic, since we already have container with key-value pairs, we can use it in __init__ to pick the right event. To achieve these behavior, we can transform our TwitchChatEventType to Enum, which will have all the things you have now and additional features, like iterating over it or getting the value with key.

I already mentioned an example here:

from enum import Enum

class ChatEventTypes(Enum):
    JOIN = "JOIN"
    LEAVE = "PART"
    MESSAGE = "PRIVMSG"
    MODE = "MODE"
    CLEARCHAT = "CLEARCHAT"
    HOSTTARGET = "HOSTTARGET"
    NOTICE = "NOTICE"
    RECONNECT = "RECONNECT"
    ROOMSTATE = "ROOMSTATE"
    USERNOTICE = "USERNOTICE"
    USERSTATE = "USERSTATE"
    WHISPER = "WHISPER"
    COMMAND = "COMMAND"

So, we can get rid of this useless dict in __init__, which will look like this now:

  def __init__(self, channel=None, command=None, message=''):
        ...
        try:
            self.type = ChatEventTypes(command)
        except ValueError:
            self.type = ChatEventTypes.COMMAND
        ...

We can also make it work with both str and ChatEventTypes by adding some runtime checks:


    def __init__(self, channel=None, command=None, message=''):
        if isinstance(command, ChatEventTypes):
            self.type = command
        elif isinstance(command, str):
            try:
                self.type = ChatEventTypes(command)
            except ValueError:
                self.type = ChatEventTypes.COMMAND
        else:
            raise TypeError(f'"command" expected to be str or ChatEventTypes, but {type(command)} was given')

This will also make self.type actually typed as ChatEventTypes.

Or we can take advantages of static typing:

def __init__(self, ..., command: typing.Union[str, ChatEventTypes] = None, ...):
    ...

All of this is surely debatable, but I think - worth mentioning. Refactoring is a nice way to keep code modern and powerful.

And 100% I can be wrong somewhere, so I will be happy to see what are your thoughts on this!

joshuaskelly commented 6 years ago

These are interesting ideas. I've spent some time today thinking about your suggestions. Let me try to shed some light on why we made the decisions that we did:

1. Now, every time TwitchChatEvent is created - a brand-new dictionary is getting constructed, which is kind of a waste.

How is it a waste? This is a common pattern in Python in place of a switch statement. Dictionaries are built-in data types and are quite performant. Thinking about this, the logic to convert from an IRC command to chat event type it might make more sense living in the TwitchChatEventType class? Maybe something like TwitchChatEventType.from_command(command) might be clearer?

2. Also, name of the chat events type container (TwitchChatEventType) is too detailed...

Maybe so? I could be swayed to change the names. πŸ˜„

3. Use Enums!

Concerning the usage of Enums, I believe that @PythooonUser originally wanted to use enums for message types but I argued against it. From the various Python event message systems I've encountered(Pygame, Pyglet, tdl), the message.type attribute is always a string. We went that route for consistency and ease of use. For example (taken from my lunch-break-rl project):

# Handle input/events
for event in list(tdl.event.get()) + self.observer.get_events():
    if event.type == 'foo':
        pass

Makes it easy to homogeneously handle messages from different sources. The long type names are intended to avoid type collisions with other package's event systems, but I could be swayed on that too. πŸ˜„

4. Or we can take advantages of static typing...

Type hints are a strictly Python 3.5+ language feature and I think we should support Python 2.7 at least until it is retired.

5. And 100% I can be wrong...

Not at all! Thank you for your involvement in this project and bringing up ways it can be improved. πŸ˜ƒ

senpos commented 6 years ago

How is it a waste?

Imagine how many events are getting created every minute πŸ˜‹ Depends on a channel, for sure. Since the dict contains only static data it is probably worth to place it somewhere else, outside of __init__. This should avoid a lot of unnecessary operations. Actually, I don't think it could be bottleneck, but maybe?.. LOL

the logic to convert from an IRC command to chat event type it might make more sense living in the TwitchChatEventType class? Maybe something like TwitchChatEventType.from_command(command) might be clearer?

Types that live inside of TwitchChatEventType are just the constants that you converted manually to twitch-observer format (e.g. JOIN -> TWITCHCHATJOIN), aren't they? I can't get the point here why you shouldn't just use original ones instead of your custom ones. On the other hand, .from_command method looks nice and clear, but again, since we use types, which were originally the same constants as commands (right?) - it is easier to pack those constants in a one place and use it as we want.

Maybe so? I could be swayed to change the names.

What about TwitchChatEventType -> ChatEvents or ChatEventTypes? Twitch is not necessary here, since all of us are here to work with Twitch πŸ˜„ Also, making it plural says that class is actually container with constants, something must live there, it is not just a ChatEvent type itself.

Enums

I am not familiar with projects like Pygame or Pyglet, but assume, that they had no such cool tools to build their project with 😸 They are pretty old, haha. Or they started it wrongly and couldn't break API due to the big amount of active users. IMHO

Actually, using constants from TwitchChatEventType might be a good idea to make it safe and consistent. I will give you more details on that:

  1. Using TwitchChatEventType everywhere makes library to feel consistent. Recently on_event decorator was added (thanks to @PythooonUser ) and it shows, that using TwitchChatEventType to pass the needed type is very straightforward and convenient, so why do we want to operate strings internally, when we can do all the stuff with precisely described constant containers.

  2. Using TwitchChatEventType makes it safer, we avoid typos. You don't want to type HOSTTARGET ever, because if you misspell it (which is pretty easy) as HOSTARGET - we won't find it in the dict and wrongly assume it as a command.

  3. Correlates with 2 a little - your IDE or text editor will auto-complete available types for you wherever you use them. That will not happen with strings.

  4. Comparisons and lookup in enums are just too easy and fun. You won't have to deal with "dict" -> "select" simulating. And you will also get some other benefits, which might not be important in a short run, but useful in a long run (e.g. iteration over enum for colors).

I think we should support Python 2.7 at least until it is retired.

Fair enough. Even if i don't think that many developers start their projects with 2.7 now, I see supporting 2.7 as viable option to cover bigger amount of end library users.

Actually, mypy understands comment-based types, but it is too ugly, IMO. πŸ˜‹

PythooonUser commented 6 years ago

Thanks @Senpos for the involvement.

Dict

To get back to the TwitchChatEventType, I don't know the performance impact difference between having an instance or class variable. Maybe someone should do a stress test? :D

Naming

I see your point with the naming scheme and it would be an easy fix to make this more accessible to the user. We already have aliases for the TwitchChatEventType and TwitchChatColor classes in the __init__.py file. Extending this also to the variable names seems straight forward.

Enums

I see the interest in enums as well, but using strings was a clear design decision, as @joshuaskelly already mentioned. Having compatibility with e.g. the tdl package was our first use case for the project. I don't know, but I remember comparing enums with strings isn't straight forward. Doesn't one need to call a value attribute first?

In addition we have simplicity in mind for the twitchobserver and maybe it's not a good idea to force using enums on every user? Is there a way to have both?

senpos commented 6 years ago

but I remember comparing enums with strings isn't straight forward. Doesn't one need to call a value attribute first?

Yep, that's correct. But since you use strings in that many places, enums might not be a good choice. :)

senpos commented 6 years ago

Dict To get back to the TwitchChatEventType, I don't know the performance impact difference between having an instance or class variable. Maybe someone should do a stress test? :D

I am sorry, but I did not quite understand your whole message, but the part about benchmark: I did a little test script, which might be not that precise, but probably shows what you can expect from moving command_to_type dict away from __init__ body.

Simulation scripts

1. The way how it works now gist | ~12.2s

2. Move command_to_type dict away from __init__ body gist | ~5,5s

3. Same as 2, but put command_to_type inside of TwitchChatEvent as class-variable No need to show the gist, results are roughly the same as in 2

So, the point here - I don't see any reason to store (actually, create the new one every time, when Event is fired) the dict with constants inside of __init__, but rather in class or in constants module.

Again: this "benchmark" is quick & dirty, includes only PRIVMSG event (which is most used, i think), but doubling the speed even with this script is interesting.

joshuaskelly commented 6 years ago

I think we should create the two issues below and close this one. Thoughts?

  1. Shorten naming of ChatEventTypes. Basically (but not always) use the command sent from Twitch.
  2. Pull out the command to type dictionary into a private module level attribute.

@Senpos and @PythooonUser if you have more thoughts on this please elaborate. Otherwise give me a thumbs up and I'll write up the above issues. πŸ˜„

senpos commented 6 years ago

Pull out the command to type dictionary into a private module level attribute.

Or to the class variable

class TwitchChatEvent:
    ...
    command_to_type = { ... }
    def __init__(self, ...):
    ...
PythooonUser commented 6 years ago

Let's do it!

joshuaskelly commented 6 years ago

Cool! I've written up the issues. If you feel strongly about fixing a particular issue, please assign it to yourself and knock it out! πŸ˜„

PythooonUser commented 6 years ago

Can this issue be closed then?

PythooonUser commented 6 years ago

Closing, because issues have been created.