astronomer / astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
https://astro-sdk-python.rtfd.io/
Apache License 2.0

Issue with deserialization of timestamp datatype #1425

Open phanikumv opened 1 year ago

phanikumv commented 1 year ago

Describe the bug: The task below fails with the current version of the code in main. Refer to https://astronomer.slack.com/archives/C03868KGF2Q/p1669809348854679

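from datetime import datetime

from astro import sql as aql


# _load_covid_data() is a helper defined elsewhere in the DAG file (not shown in this report).
@aql.dataframe(columns_names_capitalization="original")
def load_and_group_covid_data():
    """
    Loads data from a COVID data REST API and then groups values based on the months.
    :return: A list of dataframes for each month of the pandemic
    """
    covid_df = _load_covid_data()
    # Parse the date strings into datetime objects so the .dt accessor can be used below.
    covid_df["Date_YMD"] = covid_df["Date_YMD"].apply(lambda d: datetime.strptime(d, "%Y-%m-%d"))
    return [x for _, x in covid_df.groupby(covid_df.Date_YMD.dt.month)]

The exception is raised in find_worst_covid_month:

[2022-12-01, 07:20:39 UTC] {taskinstance.py:1772} ERROR - Task failed with exception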
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/astro/sql/operators/dataframe.py", line 170, in execute
    function_output = self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/dags/example_dataframe_api.py", line 51, in find_worst_covid_month
    covid_month = covid_month_data.Date_YMD.iloc[0].__format__("%Y-%m")
ValueError: Invalid format specifier
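
The traceback is consistent with Date_YMD no longer being a timestamp object by the time it reaches find_worst_covid_month. The sketch below illustrates that failure mode using only the standard library. It is not the Astro SDK's actual serialization code: `roundtrip` is a hypothetical stand-in for any serializer that stores datetimes as ISO strings, and `format_month` is a hypothetical defensive helper, not part of the SDK.

```python
import json
from datetime import datetime


def roundtrip(record: dict) -> dict:
    """Hypothetical serializer stand-in: datetimes are written out as ISO strings."""
    return json.loads(json.dumps(record, default=lambda v: v.isoformat()))


def format_month(value) -> str:
    """Format a value as YYYY-MM, re-parsing it first if it came back as a string."""
    if isinstance(value, str):
        value = datetime.fromisoformat(value)
    return value.strftime("%Y-%m")


original = {"Date_YMD": datetime(2020, 3, 1)}
restored = roundtrip(original)

# A datetime accepts strftime-style format specifiers.
formatted_before = original["Date_YMD"].__format__("%Y-%m")

# After the round trip the value is a plain str, which rejects the same specifier.
try:
    restored["Date_YMD"].__format__("%Y-%m")
    raised_value_error = False
except ValueError:
    raised_value_error = True

# Coercing back to a datetime before formatting avoids the error.
formatted_after = format_month(restored["Date_YMD"])
```

If the root cause is a lossy round trip like this one, the durable fix belongs in the deserializer, which should restore the original dtype. Coercing in the consuming task, as `format_month` does, is only a workaround.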

phanikumv commented 1 year ago

This needs to be tested after #1590 is implemented.

sunank200 commented 1 year ago

This needs to be tested.