alan-turing-institute / grace

Graph Representation Analysis for Connected Embeddings

[BUG]: Tensorboard struggles to save all the logging data per run #144

Closed KristinaUlicna closed 1 year ago

KristinaUlicna commented 1 year ago

Describe the bug

When GCN is trained for more than, say, 100 epochs, Tensorboard appears to save all the info but fails to render it correctly:

Screenshot 2023-08-22 at 13 55 38

⬆️ There is no figure to visualise the loss curves, for example...

Python version

No response

What OS are you using?

No response

Comments

Possibly linked to #97

KristinaUlicna commented 1 year ago

Intriguingly, this is the same run but its tf.events file was opened from:

| VSCode launch option | Tensorboard command line |
| --- | --- |
| Screenshot 2023-09-07 at 17 32 44 | Screenshot 2023-09-07 at 17 32 55 |

As can be seen, the command-line launch displays the entire run (all 500 epochs), so it seems better to use that option for monitoring future runs. Instructions below 👇

https://github.com/alan-turing-institute/grace/blob/7964502b1fe929ba3831305b3f85aea876c6f4ac/grace/training/README.md?plain=1#L101-L106
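For anyone who prefers to script the command-line launch, here is a minimal sketch. It is not the README's exact instructions: it only assumes the `tensorboard` executable is on your `PATH` and uses its standard `--logdir` and `--port` options. The helper names and the example log directory are hypothetical.

```python
import shlex
import subprocess


def build_tensorboard_command(log_dir, port=6006):
    """Build the argument list for a command-line TensorBoard launch.

    `log_dir` is the folder holding the run's tf.events files
    (a hypothetical path here, not one taken from the repository).
    """
    return ["tensorboard", "--logdir", str(log_dir), "--port", str(port)]


def launch_tensorboard(log_dir, port=6006):
    """Start TensorBoard as a child process and return the process handle."""
    cmd = build_tensorboard_command(log_dir, port)
    # Print the equivalent shell command so it can be copied into a terminal.
    print("Running:", shlex.join(cmd))
    return subprocess.Popen(cmd)
```

Calling `launch_tensorboard("path/to/run_logs")` should start the server in the background; the dashboard is then available in a browser at `http://localhost:6006` unless another port is passed.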