Change how the dictionary page header is detected while reading column chunks
Description
This PR should resolve the issue mentioned here: https://github.com/flow-php/flow/issues/984#issuecomment-1949891775
It turned out that some files generated by Spark do not set the metadata properly. Because of that, the dictionary page header is not recognized, and without it the whole column cannot be read.
This approach reads the first page header of the column chunk and, if it is a dictionary page header, uses it to read the column dictionary.
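
To illustrate the idea outside of the PHP codebase, here is a minimal Python sketch of the detection logic. It is not the code from this PR: `PageHeader`, `split_dictionary` and `read_column` are hypothetical names, the pages are plain in-memory objects rather than Thrift-encoded bytes, and it assumes every data page is dictionary-encoded whenever a dictionary is present. Only the `PageType` values are taken from the Parquet format definition.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Any, List, Optional, Tuple


class PageType(IntEnum):
    # Values follow the PageType enum in the Parquet format definition.
    DATA_PAGE = 0
    INDEX_PAGE = 1
    DICTIONARY_PAGE = 2
    DATA_PAGE_V2 = 3


@dataclass
class PageHeader:
    # Hypothetical, simplified page: a type plus already-decoded values.
    type: PageType
    payload: List[Any]


def split_dictionary(
    pages: List[PageHeader],
) -> Tuple[Optional[List[Any]], List[PageHeader]]:
    """Decide whether a dictionary exists by looking at the first header only,
    instead of trusting the column chunk metadata."""
    if pages and pages[0].type == PageType.DICTIONARY_PAGE:
        return pages[0].payload, pages[1:]
    return None, pages


def read_column(pages: List[PageHeader]) -> List[Any]:
    dictionary, data_pages = split_dictionary(pages)
    values: List[Any] = []
    for page in data_pages:
        if dictionary is None:
            # No dictionary: the payload already holds the values.
            values.extend(page.payload)
        else:
            # Assumption: the payload holds indexes into the dictionary.
            values.extend(dictionary[i] for i in page.payload)
    return values
```

With this shape, a chunk whose metadata omits the dictionary page offset is still decoded correctly, because the decision depends only on the type of the first page that is actually read.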
Change Log
Added
Fixed
Changed
Removed
Deprecated
Security