Closed jordaneliastam closed 1 year ago
It's related to this https://stackoverflow.com/questions/38821132/bokeh-valueerror-out-of-range-float-values-are-not-json-compliant
A quick workaround is to clean up the data so that it contains no not-a-number (NaN) or Infinity values.
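For anyone who wants to apply that workaround before logging, here is a minimal sketch in plain Python. The helper name drop_non_finite and the sample metrics dict are made up for illustration and are not part of DVC, DVCLive or Studio. It assumes that replacing the bad values with None (JSON null) is acceptable for your plots.

```python
import json
import math


def drop_non_finite(value):
    """Recursively replace NaN / Infinity floats with None (hypothetical helper)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: drop_non_finite(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [drop_non_finite(item) for item in value]
    return value


# Made-up example data containing the values that trigger the error.
metrics = {"loss": 0.25, "grad_norm": float("inf"), "history": [1.0, float("nan")]}
clean = drop_non_finite(metrics)

# allow_nan=False makes json.dumps strict, so this only succeeds on cleaned data.
payload = json.dumps(clean, allow_nan=False)
```

Whether to drop, null out, or clip such values depends on what the plot should show, so treat the None replacement as just one option.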
In VS Code extension we utilize the JSON5 lib to parse JS-compatible JSONs with some additional stuff like NaN, etc.
I wonder if we could change the payload type to text and use something like JSON5 on the backend to parse it, to make it more lax. It might have implications down the road, e.g. the way we serialize things into the DB (does it support NaNs, etc.?) and the way we return the results to the FE. @amritghimire do you know if we already support this more lax format? E.g. what happens if there is a repo with a NaN in plots data?
Even if we decide to keep it strict, we should probably then detect early and have a proper message for this.
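To make the "detect early" idea concrete, here is a rough sketch of a check that reports where the offending values are, so the error message can point at them. It is plain Python and says nothing about how Studio or dvc_studio_client are actually structured. The function name find_non_finite and the path notation are assumptions for illustration only.

```python
import math


def find_non_finite(value, path="$"):
    """Return the paths of all NaN / Infinity floats in a nested structure.

    Hypothetical helper; the "$.key[index]" path format is only illustrative.
    """
    found = []
    if isinstance(value, float) and not math.isfinite(value):
        found.append(path)
    elif isinstance(value, dict):
        for key, item in value.items():
            found.extend(find_non_finite(item, f"{path}.{key}"))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            found.extend(find_non_finite(item, f"{path}[{index}]"))
    return found


# Made-up example payload.
plots = {"a": 1.0, "b": [0.0, float("nan")], "c": {"d": float("-inf")}}
bad_paths = find_non_finite(plots)
if bad_paths:
    message = "Non-finite values are not JSON compliant: " + ", ".join(bad_paths)
```

A message built this way would tell the user which metric to fix, instead of only the generic "Out of range float values" error.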
Also, @daavoo for visibility.
As a quick solution, I am going to patch it on the DVCLive side by casting the invalid values to string
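Roughly, casting the invalid values to strings could look like the sketch below. This is not the actual DVCLive patch, only an illustration of the idea on a flat dict of metrics. The helper name stringify_non_finite and the sample values are assumptions.

```python
import json
import math


def stringify_non_finite(metrics):
    """Cast NaN / Infinity floats in a flat dict to strings (hypothetical helper)."""
    return {
        key: str(value)
        if isinstance(value, float) and not math.isfinite(value)
        else value
        for key, value in metrics.items()
    }


# Made-up example metrics.
raw = {"grad_norm": float("inf"), "loss": float("nan"), "ok": 1.5}
body = json.dumps(stringify_non_finite(raw), allow_nan=False)
```

The trade-off is that the receiving side then sees "inf" and "nan" as strings rather than numbers, so anything that plots the series has to handle mixed types.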
I was logging my gradient norms and hadn't seen any issues in the dvclive plots, but after adding a torch.isfinite() guard I saw they were occasionally returning inf. So, just confirming your assessment of the issue!
thanks!
Every time I run dvc exp run, eventually I start to see the following warning:
WARNING:dvc_studio_client.post_live_metrics:Failed to post to Studio: Out of range float values are not JSON compliant
dvclive/studio never seems to recover, and the data on Studio stops updating. I am able to view the data in the DVC VS Code extension, however.
I am using a slurm cluster.