Open wgliang opened 1 year ago
This is not an easy problem to solve with replication factor = 3 (the default), because you would need to merge data from multiple ingesters before uploading to the object storage. You can of course run Mimir with replication factor = 1 to avoid this issue, but at the cost of losing high availability.
Thanks for the reply. Are other solutions being considered? The cost of keeping multiple copies in object storage is very high, and it especially affects the performance of reading data from object storage.
Multiple copies in the object storage are stored only for a short period, because after they're uploaded by ingesters the compactor will deduplicate them, deleting source blocks and uploading compacted blocks.
Also, Mimir's default configuration is set up so that non-compacted blocks are practically never queried (only compacted ones are), so it shouldn't affect your read performance.
Let me clarify: the read impact I mean is not that multiple copies of the data are read from the object storage, but that the read and write bandwidth of the object storage itself is shared. Assume it is 10 Gbit/s. While multiple copies of the data are being uploaded (for a period of 10 minutes or longer), the bandwidth may be full, or most of it (maybe 8 Gbit/s) may be occupied. Any read request for historical data during that time will be affected.
Ok, now I understand what you mean. The block uploads from all ingesters (which happen at nearly the same time) may saturate the network, affecting read performance. Makes sense.
What if we throttle the uploads (or apply jitters) to spread the block uploads from ingesters over a larger period of time?
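To make the jitter idea concrete, here is a minimal sketch in Python. It is not Mimir code: the function names, the 30-ingester fleet, the 30-minute spread window, and the 2-minute upload duration are all assumptions chosen for illustration. It compares how many uploads overlap when every ingester starts at once versus when each start is delayed by a random amount.

```python
import random


def jittered_upload_delays(ingester_ids, spread_seconds, seed=None):
    """Assign each ingester a start delay drawn uniformly from [0, spread_seconds]."""
    rng = random.Random(seed)
    return {ingester: rng.uniform(0, spread_seconds) for ingester in ingester_ids}


def peak_concurrent_uploads(delays, upload_seconds):
    """Return the maximum number of uploads that overlap at any instant."""
    events = []
    for start in delays.values():
        events.append((start, 1))                    # upload begins
        events.append((start + upload_seconds, -1))  # upload ends
    # At equal timestamps, process ends (-1) before starts (+1) so that
    # back-to-back uploads are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Hypothetical fleet: 30 ingesters, each upload assumed to take 2 minutes.
ingesters = [f"ingester-{i}" for i in range(30)]
no_jitter = {name: 0.0 for name in ingesters}
with_jitter = jittered_upload_delays(ingesters, spread_seconds=1800, seed=1)

peak_without = peak_concurrent_uploads(no_jitter, upload_seconds=120)
peak_with = peak_concurrent_uploads(with_jitter, upload_seconds=120)
print("peak concurrent uploads without jitter:", peak_without)
print("peak concurrent uploads with jitter:", peak_with)
```

Jitter lowers the peak number of concurrent uploads, and so the peak bandwidth, but the total volume uploaded stays the same. The trade-off is that blocks reach the object storage later.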
Is your feature request related to a problem? Please describe.
From the perspective of the current architecture, when we configure three replicas, the uploaded data almost saturates the bandwidth of the entire object storage. If we could upload only one complete copy of the data to the object storage, the cost (both in object storage cloud resources and in compactor load) would be greatly reduced.
Describe the solution you'd like
For any metric, consider electing one of the ingester nodes to complete the data upload.
Describe alternatives you've considered
How to ensure data integrity?
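One way to picture the election proposal and the integrity question together is the sketch below. It is purely illustrative and not how Mimir works: `elect_uploader`, the replica names, the series key, and the per-replica sample sets are all hypothetical. It picks one uploader per series with rendezvous hashing, falls back to another replica when the elected one is unhealthy, and then checks which samples the elected replica lacks compared with the union of all replicas.

```python
import hashlib


def elect_uploader(series_key, replica_set, healthy):
    """Deterministically pick one healthy replica for a series (rendezvous hashing)."""
    candidates = [r for r in replica_set if r in healthy]
    if not candidates:
        raise RuntimeError("no healthy replica available for " + series_key)

    def score(replica):
        # Hash the (series, replica) pair; the highest score wins.
        digest = hashlib.sha256(f"{series_key}|{replica}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(candidates, key=score)


# Hypothetical replica set for one series, with the sample timestamps each
# replica actually received. "ingester-b" is assumed to have missed sample 2.
replicas = ["ingester-a", "ingester-b", "ingester-c"]
samples = {
    "ingester-a": {1, 2, 3},
    "ingester-b": {1, 3},
    "ingester-c": {1, 2, 3},
}
series = 'cpu_usage{pod="x"}'

elected = elect_uploader(series, replicas, healthy=set(replicas))
fallback = elect_uploader(series, replicas, healthy=set(replicas) - {elected})

# Samples present on some replica but absent from the elected uploader.
missing = set().union(*samples.values()) - samples[elected]
print("elected:", elected, "fallback:", fallback, "missing samples:", sorted(missing))
```

If the elected replica happens to be the one that missed a write, uploading only its copy loses that sample. That is the integrity gap the question above points at, and the reason merging data from several ingesters is hard to avoid.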
Additional context