Closed by NickCrews 6 months ago
This looks to be caused by pandas==2.2.0, which breaks a bunch of timestamp-related functionality. We have a bot PR (#8056) that I am slowly working through to try to get pandas 2.2.0 working.
It's getting more and more difficult for us to preserve compatibility with pandas 1.x, and apparently even between 2.1 and 2.2 there were some disruptive changes.
It seems like there was a bunch of churn in supported datetime64 units:
datetime64[ns] supported
datetime64[D] supported
2.2.x: datetime64[D] no longer supported

I'm not sure how to create a compatibility layer for that off the top of my head.
Here's the commit SHA for the 210th commit since the last release, which should correspond to the prerelease build you have from PyPI:
~/g/i/ibis $ git rev-list 7.2.0..HEAD --count
210
~/g/i/ibis $ git rev-parse --short HEAD
0f4366743
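For anyone who wants to script the same lookup, here is a minimal sketch. It assumes, as described above, that the `.devN` suffix of a prerelease version is the number of commits since the last release tag. The helper names `dev_commit_count` and `commits_since` are hypothetical and not part of ibis.

```python
import re
import subprocess
from typing import Optional


def dev_commit_count(version: str) -> Optional[int]:
    """Return N from a version like '8.0.0.dev210', or None without a .devN suffix."""
    match = re.search(r"\.dev(\d+)$", version)
    return int(match.group(1)) if match else None


def commits_since(tag: str, repo_dir: str = ".") -> int:
    """Count commits reachable from HEAD but not from `tag`.

    Equivalent to running `git rev-list TAG..HEAD --count` inside `repo_dir`.
    """
    result = subprocess.run(
        ["git", "rev-list", f"{tag}..HEAD", "--count"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


print(dev_commit_count("8.0.0.dev210"))  # 210
```

In a checkout of the repo, a commit should match the prerelease build when `commits_since("7.2.0")` equals the dev number; `git rev-parse --short HEAD` then gives its SHA.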
@gforsyth ahh, thanks for the explanation of how those prerelease numbers work! Now in the future I can find the exact SHA myself. PS, would it be possible to include the SHA in the build, e.g. ibis.__sha__ or something? Not sure if there is some convention around that, or if there is already an extension for build tooling that does this.
@cpcloud I noticed that I had pandas 2.2.x in my environment and you had 2.1.x in ibis, but I dismissed that as a cause because semantic versioning says it shouldn't break. Buttttttt, we all know how much to trust semantic versioning 😉 I ran python -m pip install pandas==2.1.4 in my app's environment, and no error. Thanks for the quick unblock!
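A quick way to check whether an environment is on the pandas side of this break is to compare the installed version against 2.2. This is only a sketch: the 2.2 cutoff comes from this thread, and the function names are hypothetical. Whether the error actually shows up also depends on the ibis version installed.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def major_minor(version_string: str) -> Tuple[int, int]:
    """Parse the leading 'MAJOR.MINOR' of a version string such as '2.1.4'."""
    major, minor = version_string.split(".")[:2]
    return int(major), int(minor)


def rejects_day_unit(pandas_version: str) -> bool:
    """True for pandas >= 2.2, where this thread reports datetime64[D] is rejected."""
    return major_minor(pandas_version) >= (2, 2)


def installed_pandas_version() -> Optional[str]:
    """Return the installed pandas version, or None if pandas is not installed."""
    try:
        return version("pandas")
    except PackageNotFoundError:
        return None


print(rejects_day_unit("2.2.0"), rejects_day_unit("2.1.4"))  # True False
```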
2.2.x: datetime64[D] no longer supported
Do you know if this was an explicit choice, or a mistake where they accidentally left this one out of that conversion/verification logic? I would love to see their reasoning for not supporting it. I would really like them to support it; how else are we supposed to represent dates in pandas?
2.2.x: datetime64[D] no longer supported
FWIW: this affected me too in https://github.com/googleapis/python-bigquery-dataframes/pull/492
___________________ test_remote_function_stringify_with_ibis ___________________
[gw1] linux -- Python 3.11.6 /tmpfs/src/github/python-bigquery-dataframes/.nox/e2e/bin/python

session =
scalars_table_id = 'bigframes-load-testing.bigframes_testing.scalars_269e578a0cb35c2ee0eedfef3d91d3fc'
ibis_client =
dataset_id = 'bigframes-load-testing.bigframes_tests_system_20240322001149_109284_dataset_id'
bq_cf_connection = 'bigframes-rf-conn'

    @pytest.mark.flaky(retries=2, delay=120)
    def test_remote_function_stringify_with_ibis(
        session,
        scalars_table_id,
        ibis_client,
        dataset_id,
        bq_cf_connection,
    ):
        try:

            @session.remote_function(
                [int],
                str,
                dataset_id,
                bq_cf_connection,
                reuse=False,
            )
            def stringify(x):
                return f"I got {x}"

            project_id, dataset_name, table_name = scalars_table_id.split(".")
            if not ibis_client.dataset:
                ibis_client.dataset = dataset_name

            col_name = "int64_col"
            table = ibis_client.tables[table_name]
            table = table.filter(table[col_name].notnull()).order_by("rowindex").head(10)
>           pandas_df_orig = table.execute()

tests/system/large/test_remote_function.py:197:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.nox/e2e/lib/python3.11/site-packages/ibis/expr/types/core.py:324: in execute
    return self._find_backend(use_default=True).execute(
.nox/e2e/lib/python3.11/site-packages/ibis/backends/bigquery/__init__.py:698: in execute
    result = self.fetch_from_cursor(cursor, expr.as_table().schema())
.nox/e2e/lib/python3.11/site-packages/ibis/backends/bigquery/__init__.py:707: in fetch_from_cursor
    return PandasData.convert_table(df, schema)
.nox/e2e/lib/python3.11/site-packages/ibis/formats/pandas.py:118: in convert_table
    df[name] = cls.convert_column(series, dtype)
.nox/e2e/lib/python3.11/site-packages/ibis/formats/pandas.py:135: in convert_column
    result = convert_method(obj, dtype, pandas_type)
.nox/e2e/lib/python3.11/site-packages/ibis/formats/pandas.py:201: in convert_Date
    return s.astype(pandas_type).dt.date
.nox/e2e/lib/python3.11/site-packages/pandas/core/generic.py:6640: in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
.nox/e2e/lib/python3.11/site-packages/pandas/core/internals/managers.py:430: in astype
    return self.apply(
.nox/e2e/lib/python3.11/site-packages/pandas/core/internals/managers.py:363: in apply
    applied = getattr(b, f)(**kwargs)
.nox/e2e/lib/python3.11/site-packages/pandas/core/internals/blocks.py:758: in astype
    new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
.nox/e2e/lib/python3.11/site-packages/pandas/core/dtypes/astype.py:237: in astype_array_safe
    new_values = astype_array(values, dtype, copy=copy)
.nox/e2e/lib/python3.11/site-packages/pandas/core/dtypes/astype.py:182: in astype_array
    values = _astype_nansafe(values, dtype, copy=copy)
.nox/e2e/lib/python3.11/site-packages/pandas/core/dtypes/astype.py:110: in _astype_nansafe
    dta = DatetimeArray._from_sequence(arr, dtype=dtype)
.nox/e2e/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py:327: in _from_sequence
    return cls._from_sequence_not_strict(scalars, dtype=dtype, copy=copy)
.nox/e2e/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py:354: in _from_sequence_not_strict
    dtype = _validate_dt64_dtype(dtype)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

dtype = dtype('

        raise ValueError(
            f"Unexpected value for 'dtype': '{dtype}'. "
            "Must be 'datetime64[s]', 'datetime64[ms]', 'datetime64[us]', "
            "'datetime64[ns]' or DatetimeTZDtype'."
        )
E       ValueError: Unexpected value for 'dtype': 'datetime64[D]'. Must be 'datetime64[s]', 'datetime64[ms]', 'datetime64[us]', 'datetime64[ns]' or DatetimeTZDtype'.

.nox/e2e/lib/python3.11/site-packages/pandas/core/arrays/datetimes.py:2550: ValueError
=============================== warnings summary ===============================
I'm working around it by replacing table.execute() with sql = table.compile() ; pandas_df_orig = bigquery_client.query(sql).to_dataframe(), which does the conversion to pandas in a different way.
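The line that fails in the traceback is `s.astype(pandas_type).dt.date` in `convert_Date`, where `pandas_type` is `datetime64[D]`. Below is a minimal sketch of one way to get `datetime.date` values without asking pandas for the `D` unit. It assumes the column holds values that `pd.to_datetime` can parse. `to_python_dates` is a hypothetical helper, not necessarily what the eventual ibis fix does.

```python
import datetime

import pandas as pd


def to_python_dates(values: pd.Series) -> pd.Series:
    """Convert date-like values to an object Series of datetime.date.

    pd.to_datetime picks a datetime64 unit that pandas supports, so no
    explicit astype("datetime64[D]") is needed before taking .dt.date.
    """
    return pd.to_datetime(values).dt.date


dates = to_python_dates(pd.Series(["2024-01-27", "2024-03-22"]))
print(dates.tolist())
```

The result has object dtype, which is also what `.dt.date` returned on the original code path.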
Fixed by #8758.
What happened?
In my app I am running into this. On ibis main, I can't repro. I'm guessing this has something to do with the combo of other libraries I have installed, eg pandas and duckdb. I figure this is worth pointing out to you because others might also have this incompatible version of a 3rd party lib installed, so it would be great if
Do you have any tips on what you think the cause could be? Which libs should I start bisecting to try to pin down the cause?
What version of ibis are you using?
My app, using 8.0.0.dev210 (released Jan 27), has this bug.
In the ibis repo, I can't repro using commits from Jan 27. Is there a way to see the exact commit SHA that went into 8.0.0.dev210 on PyPI?
Full output of pip freeze in my app:
What backend(s) are you using, if any?
duckdb and pandas
Relevant log output
No response