This includes a check of the cache information for a given table. If the regions inferred from the cache do not match the partition folders found on the filesystem, then all cache information for that table is ignored and the table is read directly from the Parquet files.
This is to prevent the cache becoming invalid if a failure occurs between writing out the table data and writing the new cache information.
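As a rough illustration of the check described above, here is a minimal sketch. It assumes Hive-style partition folders named `region=<value>`. The function names `partition_regions` and `cache_is_usable` are hypothetical and are not taken from this change.

```python
from pathlib import Path


def partition_regions(table_dir: Path, key: str = "region") -> set[str]:
    """Collect region values from partition folders such as 'region=EU'.

    The 'region=<value>' folder naming is an assumption for this sketch.
    """
    prefix = f"{key}="
    return {
        child.name[len(prefix):]
        for child in table_dir.iterdir()
        if child.is_dir() and child.name.startswith(prefix)
    }


def cache_is_usable(table_dir: Path, cached_regions: set[str]) -> bool:
    """Trust the cache only if its regions exactly match the folders on disk."""
    return set(cached_regions) == partition_regions(table_dir)
```

When `cache_is_usable` returns `False`, the caller would ignore the cached information for that table and fall back to reading the Parquet files directly.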
Type of change
Please delete options that are not relevant.
[X] Bug fix (non-breaking change which fixes an issue)
[ ] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
[ ] This change requires a documentation update
How Has This Been Tested?
Unit tests.
I have tested the cases of:
Cached regions have one more region than found in the filesystem
Cached regions have one fewer region than found in the filesystem
In both cases I kept one valid table without bad cache information to ensure good tables don't lose the cache optimisation.
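The tested scenarios can be sketched with in-memory data as follows. This is an illustrative sketch only: the table names, region values, and the helper `tables_with_usable_cache` are hypothetical, and the real unit tests may be structured differently.

```python
def tables_with_usable_cache(cached: dict, on_disk: dict) -> set:
    """Return the tables whose cached regions exactly match the regions on disk."""
    return {
        table
        for table, regions in cached.items()
        if set(regions) == set(on_disk.get(table, ()))
    }


# Regions recorded in the cache for each table (hypothetical values).
cached = {
    "good": {"EU", "US"},
    "extra": {"EU", "US", "APAC"},  # one more region than on disk
    "missing": {"EU"},              # one fewer region than on disk
}

# Regions found as partition folders on the filesystem.
on_disk = {
    "good": {"EU", "US"},
    "extra": {"EU", "US"},
    "missing": {"EU", "US"},
}

usable = tables_with_usable_cache(cached, on_disk)
print(usable)  # {'good'}
```

Only the consistent table keeps its cache; the two mismatched tables would be read directly from the Parquet files.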