itamarst / eliot

Eliot: the logging system that tells you *why* it happened
https://eliot.readthedocs.io
Apache License 2.0
1.1k stars 66 forks source link

Cannot add logging to Dask collections that contain futures #446

Closed mdwint closed 4 years ago

mdwint commented 4 years ago

I'm trying to add logging to a list of Dask collections, some of which may contain futures (if dask.persist() was previously called on them).

Replacing dask.compute(*args) with eliot.dask.compute_with_trace(*args) results in this error:

File "/code/.venv/lib/python3.7/site-packages/eliot/dask.py", line 74, in compute_with_trace
    optimized = optimize(*args, optimizations=[_add_logging])
File "/code/.venv/lib/python3.7/site-packages/dask/base.py", line 368, in optimize
    dsk = collections_to_dsk(collections, **kwargs)
File "/code/.venv/lib/python3.7/site-packages/dask/base.py", line 214, in collections_to_dsk
    groups = {k: (opt(dsk, keys), keys) for k, (dsk, keys) in groups.items()}
File "/code/.venv/lib/python3.7/site-packages/dask/base.py", line 214, in <dictcomp>
    groups = {k: (opt(dsk, keys), keys) for k, (dsk, keys) in groups.items()}
File "/code/.venv/lib/python3.7/site-packages/eliot/dask.py", line 115, in _add_logging
    func = dsk[key][0]
TypeError: 'Future' object is not subscriptable

It makes sense that this fails, since the calculation is already running in the background. Surely, logging would need to be added before the call to dask.persist().

Does anyone have any advice on how to achieve this?

itamarst commented 4 years ago

Hi,

Sorry it's not working for you. I've never really thought about supporting dask.persist() before. I will see about writing some tests and seeing what I can do.

My guess is that:

  1. Need to add clause to just skip Future instances.
  2. Would need to implement persist_with_trace which would implement the same code structure as eliot.dask.compute_with_trace does—pre-optimize with logging injected, then call dask.persist() on that.
itamarst commented 4 years ago

Please try the PR above and let me know if it works.