Open maozguttman opened 3 years ago
Hi Maoz,
Thanks for the suggestion. TBH, I doubt this will be faster. It would require virtual functions, which imply an extra memory access (to the vtable) and one extra function call. That overhead may be significant for work as trivial as what read_primitive_type does.
Anyway, this requires benchmarking. Unfortunately, I don't have much time at the moment, as there are higher-priority tasks. Maybe later.
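For whoever picks up the benchmarking, a rough starting point could look like the sketch below. It is not parquet_fdw code: `Kind`, `Converter`, `convert_switch` and `compare_dispatch` are all made-up names, and the "conversion" is a trivial cast standing in for what read_primitive_type does. An optimizing compiler may inline or devirtualize parts of it, so a serious measurement would need a proper benchmark harness and real Parquet data.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type tag, standing in for the Parquet type id.
enum class Kind { Int32, Int64 };

// Switch-based conversion, in the spirit of the current C-style code.
inline std::int64_t convert_switch(Kind kind, std::int64_t raw) {
    switch (kind) {
        case Kind::Int32: return static_cast<std::int32_t>(raw);
        case Kind::Int64: return raw;
    }
    return 0;
}

// Virtual-dispatch variant of the same conversion.
struct Converter {
    virtual ~Converter() = default;
    virtual std::int64_t convert(std::int64_t raw) const = 0;
};
struct Int32Converter : Converter {
    std::int64_t convert(std::int64_t raw) const override {
        return static_cast<std::int32_t>(raw);
    }
};
struct Int64Converter : Converter {
    std::int64_t convert(std::int64_t raw) const override { return raw; }
};

struct BenchResult {
    std::int64_t switch_sum = 0;   // checksum of the switch path
    std::int64_t virtual_sum = 0;  // checksum of the virtual path
    double switch_ms = 0.0;
    double virtual_ms = 0.0;
};

// Runs both variants over n values, alternating the two types.
BenchResult compare_dispatch(std::size_t n) {
    static const Int32Converter int32_conv;
    static const Int64Converter int64_conv;

    std::vector<Kind> kinds(n);
    std::vector<const Converter *> convs(n);
    for (std::size_t i = 0; i < n; ++i) {
        const bool even = (i % 2 == 0);
        kinds[i] = even ? Kind::Int32 : Kind::Int64;
        convs[i] = even ? static_cast<const Converter *>(&int32_conv)
                        : static_cast<const Converter *>(&int64_conv);
    }

    using clock = std::chrono::steady_clock;
    BenchResult r;

    const auto t0 = clock::now();
    for (std::size_t i = 0; i < n; ++i)
        r.switch_sum += convert_switch(kinds[i], static_cast<std::int64_t>(i));
    const auto t1 = clock::now();
    for (std::size_t i = 0; i < n; ++i)
        r.virtual_sum += convs[i]->convert(static_cast<std::int64_t>(i));
    const auto t2 = clock::now();

    r.switch_ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    r.virtual_ms = std::chrono::duration<double, std::milli>(t2 - t1).count();
    return r;
}
```

Comparing `switch_sum` with `virtual_sum` is only a sanity check that both paths did the same work; the timings are what matter, and they should be taken over several runs.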
Hi,
I'm using the pguint package (available on GitHub) to support unsigned integers in Postgres. The pguint package adds the following Postgres data types:
I modified the parquet_fdw code to support these new data types.
My proposal is to make TypeInfo behave more like a C++ class than a C structure. This eliminates the C switches on Parquet data types (which may be slightly faster) and makes it easier to add support for new data types. TypeInfo would be a base class, with a derived class for each Parquet data type, for example BoolTypeInfo, Int8TypeInfo, DoubleTypeInfo, TimestampTypeInfo, etc. The TypeInfo base class would declare (virtual) interface methods that the derived classes implement.
Example for the read_primitive_type function: add the following interface:
Implement it in the derived classes:
And modify ParquetReader::read_primitive_type accordingly:
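To make the three steps concrete, here is a minimal, self-contained sketch of the shape this could take. It is only an illustration and not the actual parquet_fdw code: `Datum` is a stand-in for the Postgres type, `RawColumn` is a hypothetical placeholder for an Arrow array, the method name `read_primitive` is an assumption, and the free function `read_primitive_type` stands in for the ParquetReader member.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for PostgreSQL's Datum (the real one comes from postgres.h).
using Datum = std::uintptr_t;

// Hypothetical placeholder for an Arrow array: a column of raw 64-bit slots.
struct RawColumn {
    std::vector<std::int64_t> slots;
};

// Step 1: the base class declares the interface.
class TypeInfo {
public:
    virtual ~TypeInfo() = default;
    // Converts element i of the column into a Datum.
    virtual Datum read_primitive(const RawColumn &col, std::size_t i) const = 0;
};

// Step 2: each Parquet data type implements it in its own derived class.
class BoolTypeInfo : public TypeInfo {
public:
    Datum read_primitive(const RawColumn &col, std::size_t i) const override {
        return static_cast<Datum>(col.slots[i] != 0);
    }
};

class Int32TypeInfo : public TypeInfo {
public:
    Datum read_primitive(const RawColumn &col, std::size_t i) const override {
        // Narrow the raw slot to 32 bits before storing it in the Datum.
        return static_cast<Datum>(static_cast<std::int32_t>(col.slots[i]));
    }
};

// Step 3: the reader no longer switches on the type id; it just delegates.
Datum read_primitive_type(const TypeInfo &typinfo, const RawColumn &col,
                          std::size_t i) {
    return typinfo.read_primitive(col, i);
}
```

With this layout, supporting a new data type (such as the unsigned types from pguint) would mean adding one more derived class rather than touching every switch statement.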
One "special" piece of code that does not seem to fit this proposal is the row_group_matches_filter function, which calls bytes_to_postgres_type: no TypeInfo is involved there (though maybe that can be changed).
Thanks, Maoz