Open dongxiaoman opened 2 years ago
In the current version of Helix that we use in Pinot, data over 1 MB is automatically compressed by Helix before it is written to ZooKeeper. I think your compressed data is exceeding this limit (Helix provides only one limit in 0.9.x; we have requested two separate limits, which will be provided in 1.x).
The best way is to remove some segments, or to increase your segment size so that fewer segments are created.
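One way to get larger (and therefore fewer) segments is to raise the flush thresholds in the table's `streamConfigs`. A minimal sketch, assuming the `realtime.segment.flush.threshold.*` keys from recent Pinot versions (the exact key names have changed across releases, so check the docs for your version; the values below are illustrative, not recommendations):

```json
{
  "streamConfigs": {
    "realtime.segment.flush.threshold.rows": "0",
    "realtime.segment.flush.threshold.segment.size": "500M",
    "realtime.segment.flush.threshold.time": "24h"
  }
}
```

Setting the row threshold to `0` tells Pinot to flush based on segment size instead of row count, so each consuming segment grows to roughly the configured size before it is committed.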
I just realized that it is probably because I have 192 partitions with the "replication": "2" setting, while I have only 4 hosts to serve them. I believe the real-time servers are trying to bring all 192*2/4 segments online at the same time (in one message), which generates a very large JSON payload that exceeds the 1 MB size limit.
For now, adding more hosts should solve the problem for me.
This is not critical, but it points to some behavior worth investigating.
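The arithmetic above can be sketched as follows (my own back-of-the-envelope check, not Pinot code; the per-segment metadata size is a made-up illustrative figure):

```python
# With 192 partitions, replication 2, and 4 servers, each server must bring
# a large batch of segments ONLINE in a single Helix state-transition message.
partitions = 192
replication = 2
servers = 4

segments_per_server = partitions * replication // servers
print(segments_per_server)  # 96 segments transitioned at once per server

# If each segment entry contributed roughly 12 KB of JSON metadata to the
# message (a hypothetical figure for illustration), the payload would already
# exceed Helix's 1 MB ZooKeeper write limit:
approx_entry_bytes = 12 * 1024
payload_bytes = segments_per_server * approx_entry_bytes
print(payload_bytes > 1024 * 1024)  # True
```

This is why adding hosts helps: the same total segment count is spread over more servers, so each state-transition message shrinks.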
Our QA cluster is having trouble with some accumulated data: it has 10k+ real-time segments in place, although the data volume itself is not large.
In the logs we see something like the following: