superstreamlabs / memphis.py

Python client for Memphis. Memphis is an event processing platform
https://pypi.org/project/memphis-py/
Apache License 2.0
37 stars 21 forks source link

[Discussion] Typing, tests, formatting, pre-commit #105

Open msiwik-epruf opened 1 year ago

msiwik-epruf commented 1 year ago

Hey guys,

(memphis-py==0.2.7) I just finished deploying docker-compose demo, played with schema, produced and consumed messages.

I would like to contribute and learn project more. Are there any priorities or goals?

First thing i saw when I played with hello example was lacking of types. I see that sometimes bytearray is allowed something Union[bytearray, bytes] and etc, but

async def validate_msg(self, message):
        if self.connection.schema_updates_data[self.internal_station_name] != {}:
            schema_type = self.connection.schema_updates_data[self.internal_station_name]['type']
            if schema_type == "protobuf":
                message = self.validate_protobuf(message) # bytearray | Unknown
                return message
            elif schema_type == "json":
                message = self.validate_json_schema(message) # bytearray 
                return message
            elif schema_type == "graphql":
                message = self.validate_graphql(message) # bytearray | bytes
                return message
        elif not isinstance(message, bytearray):
            raise MemphisSchemaError("Unsupported message type")
        else:
            return message

Also there is on line 802, but on line 29 we import from google.protobuf.message import Message (name conflict)

class Message:
    def __init__(self, message, connection, cg_name):
        self.message = message
        self.connection = connection
        self.cg_name = cg_name
idanasulin2706 commented 1 year ago

Hi @msiwik-epruf, thanks for the willing jump in and start contribute this is awesome. Regarding the first point - in case no schema is attached to the station only bytearray is allowed, but for the different schema data formats another types allowed. For example if you have a station with JSON schema attached you can send dict.

Regarding the second issue, this is indeed an issue and we will be glad. to see you fixing it and open a PR.

Regarding other things we would like to see on the Python side, are you familiar with Celery?

msiwik-epruf commented 1 year ago

@idanasulinmemphis Hey,

Regarding the first point - in case no schema is attached to the station only bytearray is allowed, but for the different schema data formats another types allowed. For example if you have a station with JSON schema attached you can send dict.

Maybe here we can have some Union type to show which types are allowed and etc.

Regarding the second issue, this is indeed an issue and we will be glad. to see you fixing it and open a PR.

Ok

Regarding other things we would like to see on the Python side, are you familiar with Celery?

Yeah, but it depends on use case. Currently we use celery with multiple workers/ques for lot's of tasks

*

idanasulin2706 commented 1 year ago

@msiwik-epruf so we actually want Memphis to be supported for the Celery package users, is it something you can work on?

keiranjprice101 commented 1 year ago

+1 for type hints.

In a similar vein, I suggest that the library adopt the leading underscore convention for private or protected attributes in specific classes. For instance, attributes like headers.headers or message.message might be more intuitively named headers._headers or message._message. While it's ideal for everyone to thoroughly read the documentation before using the library, in practice, many developers experiment first and may be misled by their IDE's intellisense. By using a naming convention that more clearly indicates the attribute's intended visibility or scope, we can make the API more intuitive and reduce potential confusion.

I commented here instead of opening an issue as it felt in line with the discussion