Add chunking logic in read method

astronomer / apache-airflow-providers-transfers

Apache License 2.0


Describe the bug: I tried to transfer an 11 GB file (a single zip archive) from S3 to GCS on a worker with 500 MB of memory, and the task was killed because it ran out of memory:

[2023-04-05, 21:03:34 UTC] {local_task_job.py:212} INFO - Task exited with return code Negsignal.SIGKILL

Expected behavior: The read method should load only chunks into memory. Currently, if a folder contains multiple files, each file is loaded into memory in full. When a single file is very large, the read method should instead load only one chunk of it at a time.
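To make the expected behavior concrete, here is a minimal sketch of chunked reading using only the Python standard library. It is not the provider's actual read method: the names `iter_chunks`, `copy_in_chunks`, and `DEFAULT_CHUNK_SIZE` are hypothetical, and the source and destination are assumed to be ordinary binary file-like objects rather than real S3 or GCS handles.

```python
import io
from typing import BinaryIO, Iterator

# Hypothetical default; 8 MiB is an arbitrary illustrative value.
DEFAULT_CHUNK_SIZE = 8 * 1024 * 1024


def iter_chunks(stream: BinaryIO, chunk_size: int = DEFAULT_CHUNK_SIZE) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes from stream."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means the end of the stream was reached.
            break
        yield chunk


def copy_in_chunks(
    source: BinaryIO,
    destination: BinaryIO,
    chunk_size: int = DEFAULT_CHUNK_SIZE,
) -> int:
    """Copy source to destination one chunk at a time; return bytes copied."""
    total = 0
    for chunk in iter_chunks(source, chunk_size):
        # Only the current chunk is held in memory at this point.
        destination.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # In-memory buffers stand in for the remote source and destination.
    src = io.BytesIO(b"abcdefghij")
    dst = io.BytesIO()
    copied = copy_in_chunks(src, dst, chunk_size=4)
    print(copied, dst.getvalue())
```

With this approach, peak memory use is bounded by the chunk size rather than the file size, so an 11 GB file could in principle be moved by a worker with far less memory, provided the underlying source and destination both support streaming reads and writes.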


Add chunking logic in read method #56