astronomer / apache-airflow-providers-transfers

https://apache-airflow-provider-transfers.rtfd.io/
Apache License 2.0

Add Dataframe dataset #63

Closed utkarsharma2 closed 1 year ago

utkarsharma2 commented 1 year ago

Description

What is the current behavior?

Currently, the dataframe dataset is not supported.

closes: https://github.com/astronomer/apache-airflow-provider-transfers/issues/18 https://github.com/astronomer/apache-airflow-provider-transfers/issues/21

What is the new behavior?

Added dataframe dataset.

Does this introduce a breaking change?

Nope


utkarsharma2 commented 1 year ago

@sunank200 @phanikumv Following up on this thread: https://astronomer.slack.com/archives/C03868KGF2Q/p1682419076485949. I realized that we should only have an in-memory construct for Dataframe, because the transfers listed below can all be represented as existing transfers.

  1. Dataframe(dataset=Table()) to Dataframe(dataset=File()) -- which is the same as the existing Table() to File()
  2. Dataframe(dataset=File()) to Dataframe(dataset=Table()) -- which is the same as the existing File() to Table()
  3. Dataframe(dataset=File()) to Dataframe(dataset=File()) -- which is the same as the existing File() to File()
  4. Dataframe(dataset=Table()) to Dataframe(dataset=Table()) -- which is the same as the existing Table() to Table()
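
To make the redundancy argument concrete, here is a minimal sketch. The `Table`, `File`, and `Dataframe` classes and the `effective_transfer` helper are hypothetical stand-ins written for this comment, not the provider's actual classes. It only shows that a `Dataframe` wrapping another dataset adds nothing: once unwrapped, each pair is one of the transfers we already support.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a table dataset."""
    name: str = "my_table"


@dataclass(frozen=True)
class File:
    """Hypothetical stand-in for a file dataset."""
    path: str = "data.csv"


@dataclass(frozen=True)
class Dataframe:
    """Hypothetical wrapper that only points at another dataset."""
    dataset: Union[Table, File]


def effective_transfer(source, destination):
    """Unwrap Dataframe wrappers and name the transfer that would actually run."""
    def unwrap(ds):
        return ds.dataset if isinstance(ds, Dataframe) else ds

    return type(unwrap(source)).__name__, type(unwrap(destination)).__name__


# The four wrapped combinations from the list above.
pairs = [
    (Dataframe(Table()), Dataframe(File())),
    (Dataframe(File()), Dataframe(Table())),
    (Dataframe(File()), Dataframe(File())),
    (Dataframe(Table()), Dataframe(Table())),
]
for src, dst in pairs:
    print(effective_transfer(src, dst))
```

Each printed pair names an existing transfer (Table to File, File to Table, File to File, Table to Table), which is why the wrapper form does not need its own code path.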

The only transfers that make sense to me are:

  1. Dataframe(dataset=Table()) to Dataframe(dataset=pandas()) -- which will return an in-memory dataframe.
  2. Dataframe(dataset=pandas()) to Dataframe(dataset=Table()) -- which will transfer an in-memory dataframe to a Table. The same two cases apply to File().
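
A rough illustration of those two directions, using plain pandas and an in-memory SQLite database rather than the provider's own classes. The helper names `table_to_dataframe` and `dataframe_to_table` and the `people` table are assumptions made up for this sketch.

```python
import sqlite3

import pandas as pd


def table_to_dataframe(conn, table_name):
    """Table -> in-memory dataframe (case 1)."""
    # The table name is interpolated directly; acceptable in a sketch,
    # not for untrusted input.
    return pd.read_sql_query(f'SELECT * FROM "{table_name}"', conn)


def dataframe_to_table(df, conn, table_name):
    """In-memory dataframe -> Table (case 2)."""
    df.to_sql(table_name, conn, if_exists="replace", index=False)


conn = sqlite3.connect(":memory:")
source_df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

dataframe_to_table(source_df, conn, "people")    # case 2
round_trip = table_to_dataframe(conn, "people")  # case 1
print(round_trip)
```

The File() variants would look the same with `DataFrame.to_csv` and `pandas.read_csv` in place of the SQL calls.
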

codecov-commenter commented 1 year ago

Codecov Report

Patch coverage: 86.05% and project coverage change: +2.14% :tada:

Comparison is base (2ad5ab6) 60.92% compared to head (d1a5213) 63.07%. Report is 3 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
+ Coverage   60.92%   63.07%   +2.14%
==========================================
  Files          40       41       +1
  Lines        2380     2494     +114
  Branches      228      236       +8
==========================================
+ Hits         1450     1573     +123
+ Misses        882      872      -10
- Partials       48       49       +1
```

| [Flag](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer) | Coverage Δ | |
|---|---|---|
| [UTO](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer) | `63.07% <86.05%> (+2.14%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#carryforward-flags-in-the-pull-request-comment) to find out more.

| [Files Changed](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer) | Coverage Δ | |
|---|---|---|
| [...rsal\_transfer\_operator/datasets/file/types/json.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhc2V0cy9maWxlL3R5cGVzL2pzb24ucHk=) | `66.66% <0.00%> (+5.55%)` | :arrow_up: |
| [...l\_transfer\_operator/datasets/file/types/parquet.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhc2V0cy9maWxlL3R5cGVzL3BhcnF1ZXQucHk=) | `54.54% <0.00%> (+2.54%)` | :arrow_up: |
| [...l\_transfer\_operator/universal\_transfer\_operator.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci91bml2ZXJzYWxfdHJhbnNmZXJfb3BlcmF0b3IucHk=) | `79.48% <50.00%> (ø)` | |
| [...al\_transfer\_operator/datasets/file/types/ndjson.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhc2V0cy9maWxlL3R5cGVzL25kanNvbi5weQ==) | `38.88% <66.66%> (+2.04%)` | :arrow_up: |
| [...erator/data\_providers/dataframe/Pandasdataframe.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhX3Byb3ZpZGVycy9kYXRhZnJhbWUvUGFuZGFzZGF0YWZyYW1lLnB5) | `52.54% <75.00%> (ø)` | |
| [...transfer\_operator/data\_providers/dataframe/base.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhX3Byb3ZpZGVycy9kYXRhZnJhbWUvYmFzZS5weQ==) | `75.00% <75.00%> (ø)` | |
| [...ransfer\_operator/data\_providers/filesystem/base.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhX3Byb3ZpZGVycy9maWxlc3lzdGVtL2Jhc2UucHk=) | `82.75% <75.00%> (-0.30%)` | :arrow_down: |
| [...\_transfer\_operator/data\_providers/database/base.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhX3Byb3ZpZGVycy9kYXRhYmFzZS9iYXNlLnB5) | `79.06% <83.33%> (+0.46%)` | :arrow_up: |
| [...ersal\_transfer\_operator/data\_providers/\_\_init\_\_.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhX3Byb3ZpZGVycy9fX2luaXRfXy5weQ==) | `96.87% <87.50%> (-3.13%)` | :arrow_down: |
| [...perator/data\_providers/database/google/bigquery.py](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer#diff-c3JjL3VuaXZlcnNhbF90cmFuc2Zlcl9vcGVyYXRvci9kYXRhX3Byb3ZpZGVycy9kYXRhYmFzZS9nb29nbGUvYmlncXVlcnkucHk=) | `83.50% <91.05%> (+11.34%)` | :arrow_up: |
| ... and [4 more](https://app.codecov.io/gh/astronomer/apache-airflow-providers-transfers/pull/63?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=astronomer) | | |
