dask / dask-expr

BSD 3-Clause "New" or "Revised" License

__dask_keys__ and future identifiers are different #952

Open milesgranger opened 3 months ago

milesgranger commented 3 months ago

While trying to fix `test_client.py::test_futures_of_sorted` in dask/distributed, I ran into this failure:

FAILED distributed/tests/test_client.py::test_futures_of_sorted - assert "('make-f63ce8128ed27f9b78bb64e3022a83de', 0)" in "<Future: finished, type: pandas.core.frame.DataFrame, key: ('make-_to_string_dtype-3bf19a79288c09d4910e4b935d098e2f', 0)>"

  • where "('make-f63ce8128ed27f9b78bb64e3022a83de', 0)" = str(('make-f63ce8128ed27f9b78bb64e3022a83de', 0))
  • and "<Future: finished, type: pandas.core.frame.DataFrame, key: ('make-_to_string_dtype-3bf19a79288c09d4910e4b935d098e2f', 0)>" = str(<Future: finished, type: pandas.core.frame.DataFrame, key: ('make-_to_string_dtype-3bf19a79288c09d4910e4b935d098e2f', 0)>)

The keys and the futures appear to be in the same order, but their identifiers are not the same.
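To make the mismatch concrete, here is a minimal sketch in plain Python (no dask import) that uses only the two strings from the assertion above. The helper `key_from_repr` is hypothetical and exists just for this illustration; it is not part of dask or distributed.

```python
import ast

# Strings copied from the failing assertion above.
expected_key = "('make-f63ce8128ed27f9b78bb64e3022a83de', 0)"
future_repr = (
    "<Future: finished, type: pandas.core.frame.DataFrame, "
    "key: ('make-_to_string_dtype-3bf19a79288c09d4910e4b935d098e2f', 0)>"
)


def key_from_repr(text):
    """Hypothetical helper: pull the key tuple out of a Future repr string."""
    start = text.index("key: ") + len("key: ")
    return ast.literal_eval(text[start:].rstrip(">"))


expected_name, expected_index = ast.literal_eval(expected_key)
actual_name, actual_index = key_from_repr(future_repr)

same_index = expected_index == actual_index   # partition index agrees
same_name = expected_name == actual_name      # name part of the key differs
substring_match = expected_key in future_repr  # the check the test performs

print(same_index, same_name, substring_match)  # True False False
```

The partition index matches while the name part of the key does not, so a substring check on the stringified key fails even though the ordering lines up.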

Full comparison:

```
(Pdb) for k, f in zip(df.__dask_keys__(), futures): print(f"{k}: {f}")
('make-4e59f4c2ced9ebff091a04c75ccf9098', 0):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 1):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 2):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 3):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 4):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 5):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 6):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 7):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 8):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 9):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 10):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 11):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 12):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 13):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 14):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 15):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 16):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 17):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 18):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 19):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 20):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 21):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 22):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 23):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 24):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 25):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 26):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 27):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 28):
('make-4e59f4c2ced9ebff091a04c75ccf9098', 29):
```

xref: https://github.com/dask/distributed/pull/8560

phofl commented 3 months ago

This was changed on purpose.

cc @fjetter: IIRC you added this so that we would have unique keys. See #798 for reference, and this comment specifically: https://github.com/dask-contrib/dask-expr/pull/798/files#r1476225716