dask / dask

Parallel computing with task scheduling
https://dask.org
BSD 3-Clause "New" or "Revised" License

DOC: from_array's use of tokenize #2930

Open jakirkham opened 6 years ago

jakirkham commented 6 years ago

It would be good to document from_array's use of tokenize and what that means in terms of calls to __dask_tokenize__ and normalize_token. In particular, how do these influence the name used in graph construction?

ncclementi commented 3 years ago

@jakirkham checking in here: would you say this issue was addressed by the note added in https://github.com/dask/dask/pull/6040/commits/0793e125a3cc7608a8096bfa1e41093ed7eaf635, or is this still not documented well enough?

jakirkham commented 3 years ago

Not really, since that PR just shows how the name can be changed. The issue is really asking to document what happens when the name is unspecified or left at its default value.

Edit: Put another way, if I have some array-like object that I would like to pass to from_array so I can use it with Dask, how does Dask decide to construct a name for that object, and how might implementing __dask_tokenize__ or similar influence that process?
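
For illustration, here is a minimal sketch of the behavior being asked about. It is not taken from the Dask docs, and it assumes a reasonably recent Dask release: when `name` is left unspecified, `from_array` appears to build the name as `"array-"` plus a token from `tokenize(x, chunks, ...)`, and the exact set of extra arguments mixed into that token may differ between versions. `WrappedArray`, `ThirdParty`, and the `label`/`key` attributes are hypothetical names made up for this example.

```python
import numpy as np
import dask.array as da
from dask.base import normalize_token, tokenize


class WrappedArray:
    """Hypothetical array-like exposing just enough for from_array."""

    def __init__(self, data, label):
        self._data = np.asarray(data)
        self.label = label  # hypothetical identifier, e.g. a file path
        self.shape = self._data.shape
        self.dtype = self._data.dtype
        self.ndim = self._data.ndim

    def __getitem__(self, index):
        return self._data[index]

    def __dask_tokenize__(self):
        # tokenize() uses this return value instead of hashing the object itself.
        return (type(self).__name__, self.label, self.shape, str(self.dtype))


class ThirdParty:
    """Hypothetical class we cannot modify, so no __dask_tokenize__ method."""

    def __init__(self, key):
        self.key = key


@normalize_token.register(ThirdParty)
def _normalize_third_party(obj):
    # Alternative hook: register a normalizer for a type from the outside.
    return ("ThirdParty", obj.key)


x = np.arange(4)

# Plain NumPy input: the data is hashed, so equal content gives equal names.
a = da.from_array(x, chunks=2)
b = da.from_array(x.copy(), chunks=2)
print(a.name, a.name == b.name)

# Custom array-like: the name follows whatever __dask_tokenize__ returns.
w1 = da.from_array(WrappedArray(x, "run-1"), chunks=2)
w2 = da.from_array(WrappedArray(x, "run-1"), chunks=2)
w3 = da.from_array(WrappedArray(x, "run-2"), chunks=2)
print(w1.name == w2.name, w1.name == w3.name)

# name=False skips hashing and generates a random name on every call.
r1 = da.from_array(x, chunks=2, name=False)
r2 = da.from_array(x, chunks=2, name=False)
print(r1.name == r2.name)

# The registered normalizer makes tokens depend only on `key`.
print(tokenize(ThirdParty("a")) == tokenize(ThirdParty("a")))
```

In this sketch the token for `WrappedArray` deliberately ignores the underlying data, so two wrappers with the same `label` are treated as the same graph input even if their contents differ. Whether that trade-off is appropriate depends on whether the label really identifies the data.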