earthobservations / wetterdienst

Open weather data for humans.
https://wetterdienst.readthedocs.io/
MIT License
349 stars, 54 forks

Correct typecasting for QN and INDICATOR fields with missing values #174

Status: Closed (amotl closed 3 years ago)

amotl commented 3 years ago

Dear Benjamin,

Just a minor question: can you tell me why coerce_field_types() yields QN and RS_IND_01 as pd.Series types? Wouldn't a basic list type also be fine here?

https://github.com/earthobservations/wetterdienst/blob/624b56ebd6e90f2da14fc3b992048b94e6530cfe/tests/additionals/test_functions.py#L53-L54

With kind regards, Andreas.

gutzbenj commented 3 years ago

I think this was related to pandas, which threw an error on one of the Python versions. The error indicated that pandas had changed the integer type on that Python version.

This part from coerce_field_types()

            df[column] = df[column].astype(int)
        elif column in DATE_FIELDS_REGULAR:
            df[column] = pd.to_datetime(
                df[column],
                format=TIME_RESOLUTION_TO_DATETIME_FORMAT_MAPPING[time_resolution],
            )
        elif column in DATE_FIELDS_IRREGULAR:
            df[column] = pd.to_datetime(
                df[column], format=DatetimeFormat.YMDH_COLUMN_M.value
            )
        elif column in QUALITY_FIELDS or column in INTEGER_FIELDS:
            df.loc[column_value_index, column] = df.loc[
                column_value_index, column
            ].astype(int)

confuses the integer types.
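To illustrate the kind of problem described here, the following is a minimal sketch, not the project's actual fix. It assumes a QN-like column read as strings with one missing value; the helper name `to_nullable_int` is hypothetical. A plain `astype(int)` cannot represent the missing entry, while the pandas nullable `Int64` dtype keeps the column integer-typed.

```python
import pandas as pd

# Hypothetical raw column as it might arrive from a CSV file:
# string values with one missing entry.
raw = pd.Series(["1", None, "3"], name="QN")


def to_nullable_int(series: pd.Series) -> pd.Series:
    """Cast a column to the nullable Int64 dtype (hypothetical helper)."""
    # Parse to numbers first; unparseable or missing entries become NaN.
    numeric = pd.to_numeric(series, errors="coerce")
    # Int64 (capital "I") is the pandas extension dtype that supports <NA>.
    return numeric.astype("Int64")


# A plain integer cast fails as soon as a missing value is present.
try:
    pd.to_numeric(raw, errors="coerce").astype(int)
    plain_cast_failed = False
except ValueError:
    plain_cast_failed = True

qn = to_nullable_int(raw)
print(plain_cast_failed)
print(qn.dtype)
```

Whether a nullable dtype fits the rest of the code base is a design decision for the maintainers; the sketch only shows that the whole column can carry one consistent integer type despite missing values.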

However, the whole thing really gets messed up when creating the temporary DataFrame for the test. Maybe it is more reasonable to test against a dictionary, e.g. via DataFrame.to_dict(), or against JSON via DataFrame.to_json().
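The idea of testing against plain Python data instead of a second DataFrame could look roughly like this. It is only a sketch under assumptions: the helper `frame_to_records` and the sample values are made up for illustration, and the round trip through `DataFrame.to_json()` is one possible way to get dtype-independent records.

```python
import json

import numpy as np
import pandas as pd


def frame_to_records(df: pd.DataFrame) -> list:
    """Serialize a DataFrame to plain Python records (hypothetical helper)."""
    # to_json() writes missing values as null, which json.loads() maps to None.
    return json.loads(df.to_json(orient="records"))


# Made-up sample frame: QN has a missing value, RS_IND_01 does not.
df = pd.DataFrame({"QN": [1.0, np.nan], "RS_IND_01": [0, 1]})

records = frame_to_records(df)

# The expectation is plain lists and dicts, so no integer dtype of a
# temporary DataFrame can interfere with the comparison.
expected = [
    {"QN": 1.0, "RS_IND_01": 0},
    {"QN": None, "RS_IND_01": 1},
]
assert records == expected
```

The trade-off is that such a test no longer checks the column dtypes themselves, so a separate assertion on `df.dtypes` may still be useful.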