Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
load_file() - Complete Parquet file is loaded in Memory #1059
Open
utkarsharma2 opened 2 years ago
Describe the bug When inferring the schema, load_file() loads the complete Parquet file into memory. https://github.com/astronomer/astro-sdk/blob/e4111397387e43ba7c8d2c0f5645376c466d271a/python-sdk/src/astro/files/types/parquet.py#L45
Version
Expected behavior Research whether there is a better way to read schema information from a Parquet file without loading the whole file into memory.