itamarst / eliot

Eliot: the logging system that tells you *why* it happened
https://eliot.readthedocs.io
Apache License 2.0

Latest dask (2.1.0) breaks _add_logging compute features #423

Closed dmyung closed 5 years ago

dmyung commented 5 years ago

When starting a dask job through eliot.dask.compute_with_trace(), the compute job fails during initialization.

File eliot/dask.py, line 130:
>       assert result.keys() == dsk.keys()
E       AssertionError

In dask 1.2.1, the result and job keys were tuples (as seen in tests/test_dask.py): dict_keys([('from_sequence-ff4946f0379cbe77abc9f4e8013168b7', 0), ('lambda-e084aa13e4280bf986532e8292bfdff4', 0), ...

In dask 2.1.0, they are plain strings: dict_keys(['combine_results-388d2de3-0346-4a09-a5b9-2e14d3a5f782', 'output_results-db913fd4-ce52-4f8b-8cbd-95cbde3915bb',...])

So the equality check no longer works.

A simple fix is to wrap both sides of the == in the assert with set(). It is still curious that the keys now show up as strings instead of tuples; it is not apparent what changed inside dask to alter the graph structure so much.
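For illustration, here is a minimal sketch of the set()-based comparison. It does not use dask or Eliot at all. `ListKeyedGraph` and `same_keys` are hypothetical names made up for this example, and the idea that one side's keys() returns something other than a dict view is only an assumption about how the direct == could fail, not a confirmed diagnosis.

```python
def same_keys(result, dsk):
    """Compare two key collections by membership, ignoring container type and order."""
    return set(result.keys()) == set(dsk.keys())


class ListKeyedGraph:
    """Hypothetical stand-in for a graph object whose keys() returns a list."""

    def __init__(self, tasks):
        self._tasks = dict(tasks)

    def keys(self):
        # A plain list, not a dict view (assumption for this sketch).
        return list(self._tasks)


# Made-up keys in the style of the string keys shown above.
dsk = ListKeyedGraph({
    "combine_results-1": (sum, [1, 2]),
    "output_results-2": (len, "ab"),
})
result = {"output_results-2": None, "combine_results-1": None}

# A dict view compared with a list is never equal, even with identical members.
direct_equal = result.keys() == dsk.keys()

# Converting both sides to sets compares membership only.
set_equal = same_keys(result, dsk)

# Tuple keys, as in the dask 1.2.1 output, are hashable and work the same way.
tuple_equal = same_keys({("from_sequence-x", 0): None}, {("from_sequence-x", 0): None})
```

The set() form only checks that both sides contain the same keys, so it would hide a duplicated key in a list-like keys() result; for an assertion about graph coverage that trade-off seems acceptable.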

itamarst commented 5 years ago

Conda-Forge now has Eliot 1.10.0 which fixes this.