jart opened this issue 7 years ago
We now attempt to reload the backend 5s after the previous reload finishes: https://github.com/tensorflow/tensorboard/blob/2d7d62a13c30fe59967e583c696aae55f1e823e4/tensorboard/main.py#L71
I don't see how we can do significantly better than that, so I think this can be closed barring other compelling reasons.
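The "reload 5s after the previous reload finishes" behavior can be sketched roughly like this (an illustrative standalone loop, not TensorBoard's actual code; `reload_fn`, `interval`, and `max_reloads` are made-up names for this sketch):

```python
import time


def reload_loop(reload_fn, interval=5, max_reloads=None):
    """Call reload_fn, then wait `interval` seconds after it FINISHES
    before the next call, so a slow reload never overlaps the next one.
    `max_reloads` bounds the loop for testing; the real loop runs forever."""
    count = 0
    while max_reloads is None or count < max_reloads:
        reload_fn()
        count += 1
        if max_reloads is not None and count >= max_reloads:
            break
        time.sleep(interval)
    return count
```

The key point is that the delay is measured from the *end* of the previous reload, so total latency is reload time plus the fixed interval.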
Chatted offline. I think we can actually get real-time synchronization between the backend and frontend using long polling. Better yet, we could keep a WebSocket open to the backend and show actual progress bars as it loads event files. That would be sweet.
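The long-polling idea can be sketched with a condition the reload thread signals and a request handler that blocks on it (a minimal illustration under assumed names like `LoadProgress`; this is not TensorBoard's API):

```python
import threading


class LoadProgress:
    """Sketch of long polling: the backend bumps a version counter when new
    data is loaded; a frontend request blocks until the version advances
    (or a timeout expires) instead of re-polling every few seconds."""

    def __init__(self):
        self._event = threading.Event()
        self._version = 0

    def notify_data_loaded(self):
        # Called by the reload thread after each reload finishes.
        self._version += 1
        self._event.set()

    def wait_for_update(self, known_version, timeout=30.0):
        # Return immediately if the client is already behind; otherwise
        # block until new data arrives or the long-poll times out.
        if self._version > known_version:
            return self._version
        self._event.clear()
        self._event.wait(timeout)
        return self._version
```

A WebSocket version would push the same version/progress events to the client instead of having the client hold a blocked request open.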
I'm trying to gauge what to expect from TensorBoard's speed. I'm finding that reading in the data is incredibly slow: I'm running TensorFlow in Google Cloud Machine Learning Engine and pointing TensorBoard at the logdir from within the Google Cloud Shell. It takes about 10 minutes to load the event data, and the model has already finished running. Is this the same issue? Should I be manually refreshing to force the frontend to replot?
How big are your event logs and are they stored on GCS?
I have identified that the problem is specific to running from Google Cloud Shell. If I download the whole directory and visualize locally, it loads in seconds. It was natural to use TensorBoard within that environment rather than copying the whole model directory (200 MB) locally.
I am guessing we are hitting an issue where, on GCS, we load the events without readahead chunks, so we make round-trip requests extremely frequently, perhaps as often as one per individual event. That makes for horrible performance. Filing a new issue for this here: https://github.com/tensorflow/tensorboard/issues/158 (since this thread is more about streaming / synchronization)
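The round-trip cost described above can be demonstrated with a toy model (all names here are invented for the illustration; `CountingFile` stands in for a remote GCS object where each `read()` models one network round trip):

```python
class CountingFile:
    """Fake remote file that counts read() calls; each call models one
    network round trip to object storage."""

    def __init__(self, data):
        self.data, self.pos, self.reads = data, 0, 0

    def read(self, n):
        self.reads += 1
        chunk = self.data[self.pos:self.pos + n]
        self.pos += len(chunk)
        return chunk


def read_events_unbuffered(f, event_size, count):
    # One round trip per event: the pathological behavior.
    return [f.read(event_size) for _ in range(count)]


def read_events_buffered(f, event_size, count, chunk=1 << 20):
    # Readahead: fetch a large chunk once, then slice events out of it.
    buf, out = b"", []
    while len(out) < count:
        if len(buf) < event_size:
            buf += f.read(chunk)
        out.append(buf[:event_size])
        buf = buf[event_size:]
    return out
```

With, say, 100 events of 10 bytes each, the unbuffered reader makes 100 round trips while the buffered one makes a single large fetch, which is the difference between seconds and minutes over a high-latency link.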
It would be ideal if the graph window could give instantaneous refresh.
For PyTorch I built a tool that outputs the Torch graph to a PDF, which I watch in the Skim PDF viewer. That way I can insert a graph(variable_a) call for whichever partial graph I want to inspect closely, and the feedback is instantaneous (under 200 ms).
When I first saw TensorBoard I thought this was the key feature; there must be a lot of people with the same expectation. The magic happens when it is so fast that you can use it to interactively program your model.
Technically, the challenge is to build the event handling so that it is both responsive and cheap to trigger repeatedly. Could we do this via something like @throttle(rising=true, delay=3000)?
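The @throttle idea above could be sketched as a leading-edge ("rising") throttle decorator (a hypothetical helper, not an existing TensorBoard or TensorFlow API; the name and parameters are assumptions based on the comment):

```python
import functools
import time


def throttle(delay_ms):
    """Leading-edge throttle: the first call in a burst runs immediately;
    further calls within `delay_ms` milliseconds are dropped (return None).
    This keeps refresh handlers responsive without firing on every event."""

    def decorator(fn):
        last_fired = [float("-inf")]  # mutable closure cell

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if (now - last_fired[0]) * 1000.0 >= delay_ms:
                last_fired[0] = now
                return fn(*args, **kwargs)
            return None  # call suppressed by the throttle window

        return wrapper

    return decorator
```

A trailing-edge variant (fire once at the end of a burst) would additionally schedule a deferred call; the leading-edge form above matches the `rising=true` intent of firing on the first event.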
@s-gv said in https://github.com/tensorflow/tensorflow/issues/2050 a year ago with 👍x6: