Updates to Pathmind webapp to support PBT

PathmindAI / pathmind-webapp

The Pathmind Webapp

https://dev.devpathmind.com

Apache License 2.0


Updates to Pathmind webapp to support PBT #744

Closed ejunprung closed 4 years ago

ejunprung commented 4 years ago

Mean Reward Graph

PBT executes one continuous run that evolves automatically over time.

Therefore, we don't need the grid below the mean reward graph.

However, we may want to consider explaining what PBT is learning in some way. An example could be a user mousing over the mean reward chart to trace perturbation history.
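To make "perturbation history" concrete, here is a minimal sketch of a generic PBT loop that records one event each time a weak trial clones a strong one and perturbs a hyperparameter. It is not the Pathmind training code: `Trial`, `train_step`, `exploit_and_explore`, `run_pbt`, the single `lr` hyperparameter, and the 0.8/1.2 perturbation factors are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Trial:
    """One member of the PBT population (hypothetical structure)."""
    name: str
    lr: float                                     # hyperparameter being tuned
    mean_reward: float = 0.0
    history: list = field(default_factory=list)   # perturbation events


def train_step(trial, rng):
    # Stand-in for real training: the reward grows by a random amount.
    trial.mean_reward += rng.uniform(0.0, 1.0)


def exploit_and_explore(population, iteration, rng):
    """The worst trial copies the best trial, then perturbs its hyperparameter."""
    ranked = sorted(population, key=lambda t: t.mean_reward, reverse=True)
    best, worst = ranked[0], ranked[-1]
    old_lr = worst.lr
    worst.mean_reward = best.mean_reward          # exploit: inherit progress
    worst.lr = best.lr * rng.choice([0.8, 1.2])   # explore: perturb
    worst.history.append({
        "iteration": iteration,
        "cloned_from": best.name,
        "old_lr": old_lr,
        "new_lr": worst.lr,
    })


def run_pbt(num_trials=4, iterations=20, perturb_every=5, seed=0):
    rng = random.Random(seed)
    population = [
        Trial(f"trial-{i}", lr=10 ** rng.uniform(-5, -3))
        for i in range(num_trials)
    ]
    for it in range(1, iterations + 1):
        for trial in population:
            train_step(trial, rng)
        if it % perturb_every == 0:
            exploit_and_explore(population, it, rng)
    return population


if __name__ == "__main__":
    for trial in run_pbt():
        for event in trial.history:
            print(trial.name, event)
```

Each `history` entry holds the kind of data a mouseover on the mean reward chart could show: the iteration at which the perturbation happened, which trial was cloned, and how the hyperparameter changed.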

Policy Export

The output is now a single policy. We need to remove the logic that exports one policy for each grid search trial.

Update the right column on the experiment page

The items in the right column aren't really necessary anymore. I'd only care to see the reward function and time elapsed.

As for reward scores, I prefer to see them clearly on the graph because their change over time (i.e. the shape of the curve) is more important than the static number that we provide. Perhaps we can reintroduce this using simulation metrics later.

slinlee commented 4 years ago

This is great info, @ejunprung. It matches the direction we're going with #639. I'll break this down into to-do items.

ejunprung commented 4 years ago

Okay, we can probably delete the "reward" and "algorithm" boxes as well. They don't really provide much value.

thetwotravelers commented 4 years ago

I agree with getting rid of the algorithm box. Until we have algorithms other than PPO, this information might only prompt users to ask about trying other algorithms, and we would not have a great answer for them.

The reward score box is redundant with the graph. It should be visible from the graph if we choose to include the graph.

thetwotravelers commented 4 years ago

Things I think users would care to be assured of in the dashboard:

Using latest AI technology (deep reinforcement learning)
Many different trials performed (as shown by PBT graph)
Many episodes considered during learning (we should consider making this visible; it's more meaningful to users than "training iteration")
Generally positive slope and convergence of separate trials (as shown by the graph)
Simulation metrics (this we are lacking)

slinlee commented 4 years ago

@kepricon is integrating PBT with the training and updating parts.

Here is a list of UI related changes, split out from here: https://github.com/SkymindIO/pathmind-webapp/issues?q=is%3Aopen+is%3Aissue+label%3Apbt