we see a non-zero amount of MessageTooLarge errors
these are all (so far, on a sampling inspection of the data) from old clients that didn't have batching code and would sometimes send a lot of data in one go
around 1 in 5 of them have many hundreds of items to process, and 1 in 25 have tens of thousands
we already have code that should be splitting these out into individual events
but clearly it's not working
and we don't really want one API call to generate 10k kafka messages 🙈
so this PR
changes how we check the headroom - we're clearly under-counting, so this might help
i looked at how the data is actually sent to kafka and tried to mirror that, so we count bytes against a similar byte array instead of counting characters. JS in the browser uses UTF-16 strings while kafka/python uses UTF-8, so maybe there's some silliness happening here
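a minimal sketch of the mismatch (the `utf8_size` helper here is illustrative, not the actual code in this PR): Python's `len()` on a string counts code points, JS's `.length` counts UTF-16 code units, but kafka carries UTF-8 bytes, so any multi-byte character makes a character count an under-count

```python
def utf8_size(s: str) -> int:
    # hypothetical helper: measure the payload the way kafka will see it,
    # as UTF-8 bytes, rather than as a character count
    return len(s.encode("utf-8"))

payload = "session recording 🙈"
# len(payload) counts code points: 19
# (and in JS, the emoji is a surrogate pair, so .length would say 20)
# utf8_size(payload) counts the bytes kafka sends: 22
print(len(payload), utf8_size(payload))
```

for ASCII-only payloads the two numbers agree, which is exactly why this kind of under-counting only bites on some messages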
split the list instead of exploding it
the final case in the processing - when the non-full snapshots won't fit into the headroom - sends every item from the list individually
instead, we now keep splitting the list in 2 and checking the size of each half
in theory this means that in the majority case we'll split into one or two messages, each with many events
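the halving above can be sketched like this (a hypothetical illustration of the approach, not the PR's actual code - `split_to_fit` and `size_of` are made-up names):

```python
import json

def split_to_fit(items: list, max_bytes: int, size_of) -> list[list]:
    # instead of exploding an oversized list into one kafka message per
    # item, halve it repeatedly until each chunk fits the headroom
    if size_of(items) <= max_bytes or len(items) == 1:
        # a single item that still doesn't fit is sent as-is;
        # it can't be split any further
        return [items]
    mid = len(items) // 2
    return (split_to_fit(items[:mid], max_bytes, size_of)
            + split_to_fit(items[mid:], max_bytes, size_of))

# measure chunks in UTF-8 bytes, the same way kafka will
size_of = lambda chunk: len(json.dumps(chunk).encode("utf-8"))
chunks = split_to_fit(list(range(100)), 50, size_of)
```

a list that's only slightly over the limit comes back as two messages rather than hundreds, while the pathological 10k-item case still ends up in O(log n) splits instead of 10k individual sends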