dask / distributed

A distributed task scheduler for Dask
https://distributed.dask.org
BSD 3-Clause "New" or "Revised" License

Visual suggestion for task graph #2711

Open birdsarah opened 5 years ago

birdsarah commented 5 years ago

The task status has a natural order: waiting -> processing -> memory -> released

The task graph could visually prompt this better:

1) Legend should be presented in that order (top to bottom, waiting -> released)
2) Color scheme could better reflect this. Some options:
   a) A continuous hue with no particular meaning, e.g. Viridis4 or Plasma4
   b) waiting: gray + traffic lights (processing: red, memory: amber, released: green) (RdYlGn3 / Spectral3)
   c) waiting: gray + other three-color combos that end in green (https://bokeh.pydata.org/en/latest/docs/reference/palettes.html)

2b is probably my favorite
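As a rough sketch of how items 1 and 2b could fit together, the snippet below keeps the state order in one list and derives the legend from it. The names `STATE_ORDER`, `OPTION_2B`, and `legend_entries` are hypothetical, and the colors are plain CSS color names chosen for illustration, not values taken from a Bokeh palette or from the dashboard code.

```python
# Natural order of task states, as described in this issue.
STATE_ORDER = ["waiting", "processing", "memory", "released"]

# Option 2b: gray for waiting, then traffic-light colors.
# Illustrative CSS color names only ("orange" stands in for amber).
OPTION_2B = {
    "released": "green",
    "memory": "orange",
    "waiting": "gray",
    "processing": "red",
}


def legend_entries(colors, order=STATE_ORDER):
    """Return (state, color) pairs in pipeline order, top to bottom."""
    return [(state, colors[state]) for state in order]


for state, color in legend_entries(OPTION_2B):
    print(f"{state:<10} {color}")
```

Deriving the legend from a single ordered list means the legend order does not depend on how the color mapping happens to be written.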

I think b and c are probably better than a, but the problem with them is that they break the meaning of the current green, changing it from processing to released. I think it might be worth it, though, because green is the color we are programmed to read as "done", not "in the middle of things". Depending on opinions on APIs, a change like that may need to wait until a big version bump.

TomAugspurger commented 5 years ago

cc @mathdugre, if you're interested in this one too.

I think always having the legend present with all the possible states makes sense.

I also like the proposed progression gray -> red -> amber -> green; it is easier to remember. I don't recall what this page does on error (black, maybe?)

mathdugre commented 5 years ago

I can take care of it over the weekend

TomAugspurger commented 5 years ago

Great, thanks! May want to solicit a bit more feedback about whether we are all OK with changing the meaning of the colors before diving into it.


mrocklin commented 5 years ago

+1 from me. I generally trust @birdsarah on visual design :)


mathdugre commented 5 years ago

I like the color we have right now for processing (green). I think red might be a bit counterintuitive, since it could be interpreted as "stop". What do you think, @birdsarah? Maybe gray -> green -> amber -> red would be more intuitive.

birdsarah commented 5 years ago

I agree. Red is often a stop indicator. (also danger, hazard).

I think this means it's too ambiguous for "processing" (my proposal) or "released" (@mathdugre's proposal).

Green is often associated with "success."

I'm happy to accept that "success" and "released" (my proposal) are not quite the same. But I would argue strongly that "success" is very different from "processing" which is what I don't like about the current schema and what motivated this issue.

As I think about it more both "in memory" and "released" are types of success which adds to the confusion.

This all makes me want to revise my original favorite option and steer away from the overloaded colors of red and green.

Here's my new proposal. White with gray border for waiting, followed by increasing saturation of the dask orange as tasks progress along the pipeline. This will be color blind friendly too. The reason to not just do all grayscale is that with thousands of tasks the gray border will dominate the "waiting" state.

[image: rect3742-7]

(note this picture isn't exactly the dask orange, but that would be the idea)
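A minimal sketch of the saturation idea, assuming a placeholder orange hue rather than the actual Dask orange: the first state gets a white fill (to be drawn with a gray border), and each later state gets the same hue at a higher saturation. The names `BASE_HUE`, `BORDER_GRAY`, and `saturation_ramp` are made up for illustration.

```python
import colorsys

STATES = ["waiting", "processing", "memory", "released"]

# Assumed placeholder hue for "an orange" (HSV hue in the range 0..1);
# this is not the official Dask orange.
BASE_HUE = 0.1
# Assumed border color for the white "waiting" fill.
BORDER_GRAY = "#808080"


def to_hex(r, g, b):
    """Convert RGB floats in 0..1 to a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )


def saturation_ramp(states, hue=BASE_HUE):
    """Map each state to a fill: white first, then increasing saturation."""
    last = max(len(states) - 1, 1)  # avoid dividing by zero for one state
    fills = {}
    for i, state in enumerate(states):
        saturation = i / last  # 0.0 for the first state, 1.0 for the last
        fills[state] = to_hex(*colorsys.hsv_to_rgb(hue, saturation, 1.0))
    return fills


fills = saturation_ramp(STATES)
for state in STATES:
    print(f"{state:<10} fill={fills[state]}")
```

Because only saturation changes, the states stay distinguishable by lightness alone, which is the color-blind-friendly property mentioned above.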

birdsarah commented 5 years ago

@TomAugspurger thank you for promoting some discussion.

mathdugre commented 5 years ago

Sorry for the late reply. I really like this idea :)

mathdugre commented 5 years ago

Does anyone have feedback on the latest idea from @birdsarah? @mrocklin @TomAugspurger

I could work on it this weekend.

mrocklin commented 5 years ago

> As I think about it more both "in memory" and "released" are types of success which adds to the confusion.

They mean starkly different things in other senses though. For me I like to see the frontier of computation and the frontier of storage. To me this smooth gradient approach makes these frontiers less distinct.

birdsarah commented 5 years ago

@mrocklin can you make a concrete alternative suggestion or provide additional clarity? I could work on this if we can unblock it.

mrocklin commented 5 years ago

I don't have a concrete suggestion. Let's chat about this next time we have a moment though. I think that we should treat this value more as a categorical than a continuous quantity. My guess is that we won't want a linear color ramp.

birdsarah commented 5 years ago

Is there any other order in which a task can go through the pipeline than Waiting -> Processing -> In memory -> Released?

mrocklin commented 5 years ago

There are a variety of situations that can cause transitions from any of those states to any others, but they're rare.

But mostly, each state has very different ramifications for performance. For example, only memory tasks take up a bunch of memory. That's qualitatively quite different from all the others. We often don't care as much about the progression as we care about "what are all the tasks in state X".
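To make the "what are all the tasks in state X" view concrete, here is a small sketch that treats state as a category and groups tasks by it. The task keys and the `tasks_by_state` helper are hypothetical and only stand in for whatever the scheduler actually reports.

```python
from collections import defaultdict

# Hypothetical snapshot of task keys and their states;
# this is not real scheduler output.
tasks = {
    "load-1": "released",
    "load-2": "memory",
    "sum-1": "processing",
    "sum-2": "waiting",
    "mean-1": "memory",
}


def tasks_by_state(task_states):
    """Group task keys by state, treating state as a categorical value."""
    groups = defaultdict(list)
    for key, state in task_states.items():
        groups[state].append(key)
    return dict(groups)


groups = tasks_by_state(tasks)
print(groups["memory"])  # the tasks currently holding results in memory
```

In this view, each state is a separate bucket with no implied distance between buckets, which is the argument for distinct categorical colors over a linear ramp.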
