Open marcelboldt opened 3 years ago
Take into account that we will need to preserve the order within the same request type.
I'd assume that this issue wouldn't affect ordering as long as there is only one client per request type.
What may be missing is rather a way to determine order more explicitly through configuration. I see several ways order within a request topic could be defined:
In Kafka there is an implicit order given by the relative position of a record in the log file on disk, which is the same order that is reflected by the offset within a partition. Additionally, a timestamp is stored with each record. The default is event/producer time (the producer sets the timestamp to the send time); other options are ingestion/broker time (when the broker received the record), processing time (the current time when the record is processed), and payload time (an explicit timestamp defined in the record).
| Kafka / SOAP | offset per partition | event/producer time | ingestion/broker time | processing/current time | payload-defined time |
|---|---|---|---|---|---|
| request order (impl.) | | | | | |
| response order (impl.) | | | | | |
| asc id in SOAP data | | | | | |
| asc id in HTTP header | | | | | |
| timestamp in request's SOAP data | | | | | |
| timestamp in response's SOAP data | | | | | |
| timestamp in request HTTP header | | | | | |
| timestamp in response HTTP header | | | | | |
Firstly, I would tend to exclude HTTP-header-based information, as the W3C SOAP specification expects all information a SOAP application acts upon to be in the SOAP message; SOAP is also valid when transferred via non-HTTP protocols.
For the implicit ordering, idempotent producers could be used to ensure that this order is preserved. Question: is this order meaningful at all, or is it more or less arbitrarily determined by network latency between SOAP client and endpoint?
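To make the idempotent-producer idea concrete, here is a minimal sketch of the producer settings involved, written as a plain Python dict. The keys are standard Kafka producer configuration names; the helper `preserves_order` is hypothetical and only encodes the rule of thumb that retries cannot reorder records within a partition when idempotence is on (with `acks=all` and at most 5 in-flight requests) or when only one request is in flight.

```python
# Producer settings commonly used to keep per-partition order under retries.
# Keys are standard Kafka producer configuration names; values are a suggestion.
ORDER_PRESERVING_PRODUCER_CONFIG = {
    "enable.idempotence": True,  # broker sequences and de-duplicates retried batches
    "acks": "all",               # required when idempotence is enabled
    "max.in.flight.requests.per.connection": 5,  # idempotence allows at most 5
}


def preserves_order(config: dict) -> bool:
    """Hypothetical helper: True if retries cannot reorder records in a partition."""
    in_flight = int(config.get("max.in.flight.requests.per.connection", 5))
    if config.get("enable.idempotence", False):
        return config.get("acks") == "all" and in_flight <= 5
    # Without idempotence, only a single in-flight request is safe on retry.
    return in_flight == 1
```

This only covers the path from producer to broker. Whether the order in which requests reach the producer is meaningful is the open question above.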
As for time-based ordering based on timestamp information in the SOAP data: the Kafka record's timestamp field can be set to this value. The fact that the order is undefined when several messages carry the same timestamp should be acceptable.
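A rough sketch of the payload-timestamp option, assuming the SOAP 1.2 envelope namespace and a hypothetical, un-namespaced `timestamp` element somewhere in the Body that holds an ISO 8601 value with a UTC offset. The function names are made up for illustration. The result is in epoch milliseconds, the unit Kafka record timestamps use.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from typing import List

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def soap_timestamp_ms(envelope_xml: str, element: str = "timestamp") -> int:
    """Return the payload timestamp as epoch milliseconds.

    `element` is a hypothetical child of the SOAP Body; adjust to the real schema.
    """
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        raise ValueError("no SOAP Body found")
    node = body.find(f".//{element}")
    if node is None or not node.text:
        raise ValueError(f"no <{element}> element in SOAP Body")
    moment = datetime.fromisoformat(node.text.strip())
    if moment.tzinfo is None:
        raise ValueError("timestamp must carry a UTC offset")
    # Integer division of timedeltas avoids float rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def order_by_timestamp(envelopes: List[str]) -> List[str]:
    """Sort by payload timestamp; ties keep arrival order because sorted() is stable."""
    return sorted(envelopes, key=soap_timestamp_ms)
```

Note that reading the Body like this already means touching SOAP data, so the SOAP-node concern applies.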
Asc id: the producer would have to buffer and order the messages. What if there is a gap in the asc id of incoming data: how long should the producer keep the records and wait for subsequent records to fill the gap?
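One possible answer to the gap question, sketched under assumptions: buffer out-of-order records, release them as soon as the next expected id is present, and give up on a gap after a configurable wait. The class name `SequenceBuffer`, the `max_wait_s` parameter, and the policy of dropping late arrivals are all hypothetical choices, not something the connector implements.

```python
import time
from typing import Callable, Dict, List, Optional


class SequenceBuffer:
    """Releases records in ascending-id order; skips a gap after `max_wait_s`."""

    def __init__(self, first_id: int, max_wait_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._next_id = first_id
        self._max_wait_s = max_wait_s
        self._clock = clock
        self._pending: Dict[int, str] = {}
        self._gap_since: Optional[float] = None  # when the current gap was first seen

    def offer(self, record_id: int, payload: str) -> List[str]:
        """Buffer one record and return every payload that is now releasable."""
        if record_id >= self._next_id:  # late arrivals for skipped ids are dropped
            self._pending[record_id] = payload
        return self.drain()

    def drain(self) -> List[str]:
        """Release buffered payloads in id order; call periodically to expire gaps."""
        released: List[str] = []
        while self._pending:
            if self._next_id in self._pending:
                released.append(self._pending.pop(self._next_id))
                self._next_id += 1
                self._gap_since = None
            elif self._gap_since is None:
                self._gap_since = self._clock()  # start waiting for the missing id
                break
            elif self._clock() - self._gap_since >= self._max_wait_s:
                self._next_id = min(self._pending)  # give up on the missing ids
                self._gap_since = None
            else:
                break
        return released
```

The trade-off is visible in `max_wait_s`: a long wait adds latency and memory pressure, a short one risks emitting records out of order or losing late ones.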
@ogomezso Happy to read your feedback on this. How do SOAP systems usually handle / expect order? Do they have a notion of order, or is a SOAP record considered self-sufficient and services should be stateless?
With regard to using data contained in the SOAP message: once it starts processing data contained in a SOAP message, the processor becomes a SOAP node, which would in particular have to process the SOAP header as laid out in chapter 2 of the SOAP specification, especially https://www.w3.org/TR/soap12-part1/Overview.html#relaysoapmsg. The processing would have to follow an explicit SOAP protocol binding: https://www.w3.org/TR/soap12-part1/Overview.html#transpbindframew
That's a big decision... While it could make sense to create a Kafka protocol binding for SOAP, I tend to think that this exceeds the scope of this activity (at least for now). My preferred alternative would be not to touch the SOAP data. What remains is to keep the implicit order (rows 1-2 in the table), provided it turns out not to be determined randomly by network characteristics.
Currently poll() is implemented in a way that serialises multiple requests, which is quite inefficient. It may be worth exploring an event queue to run the requests with more parallelism.
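As an illustration of running requests in parallel without losing the implicit request order, here is a small sketch using a thread pool rather than an event queue. `call_in_parallel` and the `send` callback are hypothetical names standing in for whatever issues the SOAP request; this is not the connector's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def call_in_parallel(requests: Iterable[str],
                     send: Callable[[str], str],
                     max_workers: int = 4) -> List[str]:
    """Run `send` concurrently but return responses in request order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, regardless of which
        # call finishes first, so the request order is kept for the caller.
        return list(pool.map(send, requests))
```

Responses come back in request order even when a later request completes earlier. The slowest call in a batch still bounds the batch latency.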