https://github.com/apache/pinot/pull/9295 enabled consistent data push for the standalone execution framework. It would be great to extend this feature to Spark-based ingestion as well.
This would help our users in scenarios where each run of a batch job may produce a different number of partition files. Atomically replacing one set of segments with another would help mitigate the issue of serving duplicate data.