fossasia / visdom

A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.
Apache License 2.0

Add `store_history` feature for video panes #438

Open alexsax opened 6 years ago

alexsax commented 6 years ago

The store_history feature added in #393 is great. Logging RL training often requires storing episode recordings too, and these videos can clutter up the dashboard. Would it be possible to add store_history for videos as well?

JackUrb commented 6 years ago

Storing video history could also work. I'd be concerned about the associated memory cost, but it wouldn't be difficult to implement the support. #393 will land after #395 is resolved, so adding it for videos would come sometime after that.

alexsax commented 6 years ago

Right now I find that I often just log the videos separately. I'm not very familiar with the visdom implementation--would store_history be more expensive than what I do right now?

Another solution for this use case would be to tile the videos programmatically from Python, like in the example here. This might already be possible, I'm not sure! You might prefer this solution because it makes the memory cost a little more explicit.

JackUrb commented 6 years ago

It wouldn't be more expensive, but it introduces design decisions that could be hard to get right (for example: what happens when you scroll through the history while a video is playing? Is it expected that all of the videos line up?)

Tiling from the visdom Python client should already be possible in theory if you provide the height and width options; the layout would fill in the rest.
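A minimal sketch of the client-side tiling idea, assuming all episode videos have the same shape (T x H x W x C). The `tile_videos` helper is hypothetical, not part of the visdom API; it just builds one grid video with NumPy that a single pane could then display.

```python
import numpy as np

def tile_videos(videos, ncols):
    """Arrange a list of equal-shape T x H x W x C videos into an
    ncols-wide grid, padding empty slots with black."""
    t, h, w, c = videos[0].shape
    nrows = -(-len(videos) // ncols)  # ceil division
    pad = nrows * ncols - len(videos)
    videos = list(videos) + [np.zeros_like(videos[0])] * pad
    # Concatenate each row along the width axis, then stack rows
    # along the height axis.
    rows = [np.concatenate(videos[r * ncols:(r + 1) * ncols], axis=2)
            for r in range(nrows)]
    return np.concatenate(rows, axis=1)  # T x (nrows*H) x (ncols*W) x C

# Example: four 10-frame 32x32 RGB clips become one 10-frame 64x64 video.
clips = [np.random.randint(0, 255, (10, 32, 32, 3), dtype=np.uint8)
         for _ in range(4)]
grid = tile_videos(clips, ncols=2)
assert grid.shape == (10, 64, 64, 3)

# With a running visdom server, the grid could go into a single pane,
# passing width/height opts as suggested above (sketch only):
# vis.video(tensor=grid, opts=dict(width=64, height=64))
```

This keeps the memory cost explicit, as noted above: the client decides how many clips to hold and tile before anything is sent to the server.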

The simplest implementation of storing video history would essentially compile all of the tabs into one pane, where a slider determines which video you're looking at, but no playback time is shared between them.
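The simplest implementation described above can be sketched as a plain history buffer: the pane keeps every submitted video and a slider index selects one, with no playback position carried across entries. The class and its names are illustrative, not visdom internals.

```python
class VideoHistory:
    """Hypothetical per-window buffer for the 'slider over stored
    videos' design: each entry is an independent video, and only the
    slider position is tracked, not playback time."""

    def __init__(self):
        self.entries = []  # submitted videos, oldest first
        self.cursor = -1   # slider position; -1 means the newest entry

    def append(self, video):
        self.entries.append(video)

    def selected(self):
        return self.entries[self.cursor]

history = VideoHistory()
history.append("episode-0")  # stand-in for a video tensor
history.append("episode-1")
assert history.selected() == "episode-1"  # slider defaults to the newest
```

Because each entry is independent, switching the slider mid-playback simply starts the selected video from wherever its own player left off, which sidesteps the "do the videos line up" question entirely.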

alexsax commented 6 years ago

I think I see what you mean: do the videos line up, both temporally and in terms of spatial size on screen? In terms of spatial size, you face the same design decision with images.

Tiling is possible, as you suggest, but right now it requires a dedicated environment. So to tile videos during a hyperparameter search, I'd need two environments per setting: one for videos and one for other results. It's doable but a little inconvenient.

The simplest implementation is what I had in mind, and I think it's sufficient for people doing RL. Storing playback time might be important if you're working with video itself (e.g. visual grounding or action recognition), but the simple case should work for 95% of people, since right now more people are doing RL than working with video.