dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Improve structure of event logging #8688

Open hendrikmakait opened 2 weeks ago

hendrikmakait commented 2 weeks ago

There are two main things that I dislike about the current way we handle event logging:

hendrikmakait commented 2 weeks ago

Another thought on topics: I think they should be "static", i.e., don't depend on the actual cluster. For example, we have https://github.com/dask/distributed/blob/9672121ce115df9268b12ad74e108d71ec8104c0/distributed/scheduler.py#L5916-L5924 which logs to a topic that depends on the actual address assigned to a worker at runtime. This makes writing consumer code for events and topics more difficult. Feel free to disagree.
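To illustrate the concern, here is a hedged, pure-Python sketch of what consumer code ends up doing when a topic name embeds a runtime worker address, compared with matching a single static topic. The topic names and event shapes below are hypothetical illustrations, not the actual topics distributed emits:

```python
# Hypothetical event log: topic name -> list of event payloads.
events = {
    "worker-restart-tcp://10.0.0.5:39871": [
        {"action": "restart", "worker": "tcp://10.0.0.5:39871"},
    ],
    "worker-restart-tcp://10.0.0.6:40112": [
        {"action": "restart", "worker": "tcp://10.0.0.6:40112"},
    ],
    "worker-restart": [
        {"action": "restart", "worker": "tcp://10.0.0.7:41000"},
    ],
}

# Dynamic topics: the consumer must pattern-match topic names at runtime,
# because the worker address is only known once the cluster is up.
dynamic_restarts = [
    msg
    for topic, msgs in events.items()
    for msg in msgs
    if topic.startswith("worker-restart-tcp://")
]

# Static topic: the consumer subscribes to one well-known name.
static_restarts = events.get("worker-restart", [])

print(len(dynamic_restarts), len(static_restarts))  # prints "2 1"
```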

fjetter commented 2 weeks ago

A couple of high-level thoughts

I feel most of your concerns would already be addressed by establishing a couple of sensible, internal best practices and moving our code towards this.


a couple of specifics

Having a catch-all "all" topic defeats the purpose of a topic-based

Yes. If all were supported, that feels like something that should be implemented on the caller/subscriber side; keeping the events twice is redundant.

Another thought on topics: I think they should be "static", i.e., don't depend on the actual cluster. For example, we have

I think this is a case where it would make sense to have two different event streams. I agree that every event should be piped to a static topic but I don't mind having a worker specific stream. For debugging this is useful and it doesn't cause any harm.

hendrikmakait commented 2 weeks ago

Another thought on topics: I think they should be "static", i.e., don't depend on the actual cluster. For example, we have

I think this is a case where it would make sense to have two different event streams. I agree that every event should be piped to a static topic but I don't mind having a worker specific stream. For debugging this is useful and it doesn't cause any harm.

Agreed. I should rephrase this to: A message should be available in a static topic, not only in a dynamic one.
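One way to read the "two streams" compromise above: the producer fans each event out to a static topic that consumers can rely on, plus a worker-specific topic kept around for debugging. A minimal sketch using plain dicts (the helper name and event shape are hypothetical):

```python
from collections import defaultdict

event_log = defaultdict(list)  # topic name -> list of events

def log_worker_event(worker_address, action, **payload):
    """Hypothetical helper: emit one event to both a static topic and a
    worker-specific debugging topic, so consumers never depend on the
    dynamic topic name alone."""
    msg = {"action": action, "worker": worker_address, **payload}
    event_log[action].append(msg)                        # static topic
    event_log[f"{action}-{worker_address}"].append(msg)  # per-worker, dynamic

log_worker_event("tcp://10.0.0.5:39871", "worker-restart", reason="nanny")

# Consumer code can rely on the static topic...
assert event_log["worker-restart"]
# ...while the per-worker stream remains available for debugging.
assert event_log["worker-restart-tcp://10.0.0.5:39871"]
```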

hendrikmakait commented 2 weeks ago

I feel most of your concerns would already be addressed by establishing a couple of sensible, internal best practices and moving our code towards this.

Yup, that's largely what this issue is for :)

hendrikmakait commented 2 weeks ago

Accepting all msgpack-serializable structures is something I would like to keep (not a hill I'll die on). This system can be used by end users, and I wouldn't want to force them to use a special container to emit a message

All I'd be asking you to emit is something like {"action": "some-identifier", "<some key>": <WHATEVER>} if you didn't feel like emitting a structured message. Combined with a deprecation cycle, that feels like a low burden compared to forcing every downstream consumer to perform checks like isinstance(msg, dict) and msg.get("action", None) == "my-key" just to access our average structured event, merely because someone felt like dumping an int into an existing topic.
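The defensive checks described above might look like this in consumer code. This is a sketch with hypothetical event shapes, contrasting a consumer that must guard against arbitrary payloads with one that can rely on a minimal required shape:

```python
# Without a guaranteed structure, every consumer repeats defensive checks:
def handle_unstructured(msg):
    if isinstance(msg, dict) and msg.get("action", None) == "my-key":
        return msg.get("payload")
    return None  # ints, strings, tuples, ... silently fall through

# With a minimal required shape, the consumer can dispatch directly.
# Contract (hypothetical): every event is a dict with an "action" key.
def handle_structured(msg):
    handlers = {"my-key": lambda m: m.get("payload")}
    handler = handlers.get(msg["action"])
    return handler(msg) if handler else None

print(handle_unstructured(42))                                # prints "None"
print(handle_structured({"action": "my-key", "payload": 7}))  # prints "7"
```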

fjetter commented 2 weeks ago

I don't think it makes sense for users to be forced to use an action keyword. That feels kind of arbitrary. I'm also not sure the deprecation cycle would be worth the effort just to save a consumer an isinstance check

hendrikmakait commented 2 weeks ago

I don't think it makes sense for users to be forced to use an action keyword.

FWIW, I'm also happy to require something like action and message keywords in the log_event signature that we expose similarly in consuming functions on plugins. The way I see it, there should be some identifier telling me what type of message I'm dealing with, and then some payload belonging to said message.