robrap (issue author; closed 1 year ago):
There are a variety of times related to event production:

- the `time` field,
- `datetime.now()` in the code that initiates sending the event signal,
- `datetime.now()` in the code that initiates sending the signal to the event bus.

It is doubtful that we will use both of these "current times"; we will instead probably choose one and document it, if either is used at all.

For the purposes of the rest of this discussion, we will ignore any times related to consuming the event. Any consumer can call `datetime.now()` when required, and at this time we have no need to provide it as an attribute on the event.
According to the OEP, we must implement a "time" field that could be either `db_created_at_time` or `signal_produced_time`. Note: I will ignore the other current-time possibilities. As a consumer, possibly the only way to know which of these you are getting for a certain event would be to read the docs for the specific event.
As the OEP states, using the `created_at` time is ideal for the consumer for a variety of reasons. However, it is more challenging to implement. Note: maybe this is no big deal for legacy events and I just need to understand the situation better?
My main concern is holding up the ability to pass along any of these times: being blocked on how long it would take to implement the `db_created_at_time` solution, and knowing that once a `ce_time` header is added, it should not be changed without causing a major version change. Additionally, this is not a one-time cost for our single event; it will come up for any new events (and especially for legacy events?). I'd like to consider sending a `signal_produced_time` header that will always mean the same thing, is simple to implement, and may or may not differ from `ce_time`. This could enable us to defer implementing `ce_time` if we continue to be stuck on this.
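To make the idea concrete, here is a minimal sketch of what capturing a `signal_produced_time` header might look like. The function name and header format are hypothetical, not part of openedx-events; the point is only that the value is captured once, at the moment production is initiated, and its meaning never changes regardless of whether `ce_time` is ever implemented.

```python
from datetime import datetime, timezone


def build_produced_time_header():
    """Hypothetical helper: capture the wall-clock UTC time at the moment
    we start producing the event to the bus, as an ISO 8601 string.

    Unlike a would-be ce_time (which might be a DB created_at time), this
    value always means "when production was initiated".
    """
    produced_at = datetime.now(timezone.utc)
    return ("signal_produced_time", produced_at.isoformat())


header_name, header_value = build_produced_time_header()
```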
Separately, confluent-kafka offers a tuple for its timestamp that provides both the time and the type of the time. We could consider enhancing the `ce_time` header with that information in a separate custom header. See https://docs.confluent.io/platform/current/clients/confluent-kafka-python/html/index.html#confluent_kafka.Message.timestamp. This is not a priority; I just thought it was interesting.
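For illustration, `Message.timestamp()` in confluent-kafka returns a `(timestamp_type, timestamp_ms)` tuple. The sketch below interprets such a tuple without needing a broker; the integer constants mirror the library's `TIMESTAMP_*` values (to the best of my understanding) so the example is self-contained, and `describe_timestamp` is a hypothetical helper, not a library API.

```python
# Constants mirroring confluent_kafka.TIMESTAMP_* (assumed values, matching
# librdkafka), duplicated here so this sketch runs without the library.
TIMESTAMP_NOT_AVAILABLE = 0    # broker/message has no timestamp
TIMESTAMP_CREATE_TIME = 1      # timestamp set by the producer
TIMESTAMP_LOG_APPEND_TIME = 2  # timestamp set by the broker on append


def describe_timestamp(ts_tuple):
    """Hypothetical helper: turn a (type, milliseconds) tuple, as returned
    by Message.timestamp(), into a human-readable label plus the raw value."""
    ts_type, ts_ms = ts_tuple
    labels = {
        TIMESTAMP_NOT_AVAILABLE: "not available",
        TIMESTAMP_CREATE_TIME: "producer create time",
        TIMESTAMP_LOG_APPEND_TIME: "broker log-append time",
    }
    return labels.get(ts_type, "unknown"), ts_ms
```

The type information is what a custom header could carry alongside the time itself, so consumers know whether the producer or the broker set it.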
As an example of implementing the production timestamp, here is the code where we get the course for the event bus event data: https://github.com/openedx/edx-platform/blob/0b68c84286fe53c32b6b8c1e01ce30845bb8496c/cms/djangoapps/contentstore/signals/handlers.py#L85. I don't see how to get the DB dates for that course. Maybe others know? Or, do we think making that data available (here and in all cases) should be a blocker to sending any new event?
I'm not certain of the answer, which is why I am open to punting on this and instead using a specifically named and easily implemented field that comes from `datetime.now()`. It could be documented as not necessarily matching the DB time of the related data.
UPDATE: @ormsbee and I discussed this, and he provided me with a solution for the DB time of this event: `course.subtree_edited_on`. So we will update to pass through this event time. Relatedly, `OpenEdxPublicSignal.send_event()` and the new `send_event_to_bus` will need an optional `time` argument to use in the event metadata.
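A minimal sketch of what that optional `time` argument could look like. The signature and return value here are illustrative stand-ins, not the actual openedx-events API: the caller passes a DB-derived time (e.g. `course.subtree_edited_on`) when it has one, and the function falls back to "now" otherwise.

```python
from datetime import datetime, timezone
from typing import Optional


def send_event(event_data: dict, time: Optional[datetime] = None) -> dict:
    """Illustrative stand-in for OpenEdxPublicSignal.send_event():
    use the caller-supplied time for the event metadata if given,
    otherwise fall back to the current UTC time."""
    event_time = time if time is not None else datetime.now(timezone.utc)
    metadata = {"time": event_time}
    # ... actual signal/bus sending would happen here ...
    return metadata


# Caller passes the DB time when it has one:
db_time = datetime(2022, 9, 1, 12, 0, tzinfo=timezone.utc)
meta = send_event({"course_key": "course-v1:edX+Demo+2022"}, time=db_time)
```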
This is a continuation of https://github.com/openedx/openedx-events/issues/77, where we decided to defer the CloudEvent field ce_time, as defined in OEP-41.