-
As we discussed in this PR, we should update and move [the Prometheus monitoring](https://github.com/kubeflow/training-operator/tree/master/docs/monitoring) docs to the [Kubeflow](https://www.kubeflow…
-
Hi! Is it possible to monitor the training progress of a given algorithm (neural nets in particular) epoch by epoch?
Thanks a lot in advance.
-
Hello!
Is there a plan to implement monitoring of the head training phase as well?
Currently, it seems that logging (Wandb, Neptune, etc.) stops after the body training, but it can be useful to see th…
-
Hi,
Thank you for creating this really interesting project. I'm eager to use it in my work. One thing that would be really useful is if there was some way to easily examine training history to actu…
-
Monitor the following:
- Preactivation and activation (logits) of a module.
- Gradients of some parameters.
- Check the mean and standard deviation.
- Check the histogram
- Percent parameter update.
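
The statistics in the list above could be computed roughly as follows. This is a minimal framework-free sketch, not a reference implementation: the function names are hypothetical, inputs are assumed to be flat lists of floats (for example flattened activations or gradients), and "percent parameter update" is assumed to mean the L2 norm of the update relative to the L2 norm of the parameters before the step.

```python
import math
from typing import List, Sequence, Tuple


def mean_std(values: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and population standard deviation of values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


def histogram(values: Sequence[float], bins: int = 10) -> List[int]:
    """Count values into equal-width bins spanning [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # All values are identical, so they all land in the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum value is clamped into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def percent_update(before: Sequence[float], after: Sequence[float]) -> float:
    """Assumed definition: 100 * ||after - before|| / ||before|| (L2 norms)."""
    update_norm = math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))
    param_norm = math.sqrt(sum(b ** 2 for b in before))
    return 100.0 * update_norm / param_norm if param_norm else float("inf")
```

In a real training loop, the inputs would come from whatever hooks the framework provides for capturing activations and gradients; that wiring is left out here because it depends on the framework.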
-
A/C
- [ ] Proposal for how to split edx-platform pipeline ownership to be distributed to engineers and engineering managers
- [ ] Distribute said proposal
We want other squads to be able to hel…
-
There's already a tensorboard-marian connector. We can either plug into that or write our own version of it. We have the added benefit of having direct access to marian's stdout and stderr so we can j…
-
Add functionality to:
* monitor GPU utilisation
* report stats about power consumption
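
One possible starting point, assuming NVIDIA GPUs with `nvidia-smi` on the `PATH`: query the `utilization.gpu` and `power.draw` fields and parse the CSV output. The function names and the returned dictionary keys are hypothetical, and the sketch is not tied to this project's code.

```python
import subprocess
from typing import Dict, List, Optional

# Fields passed to nvidia-smi's --query-gpu option.
QUERY_FIELDS = ["utilization.gpu", "power.draw"]


def _to_float(text: str) -> Optional[float]:
    """Convert a field to float; return None for values such as '[N/A]'."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_gpu_stats(csv_text: str) -> List[Dict[str, Optional[float]]]:
    """Parse 'utilization, power' lines (one per GPU) into dictionaries."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, power = (field.strip() for field in line.split(","))
        stats.append({
            "utilization_pct": _to_float(util),
            "power_watts": _to_float(power),
        })
    return stats


def read_gpu_stats() -> List[Dict[str, Optional[float]]]:
    """Run nvidia-smi once and return one entry per GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling `read_gpu_stats()` periodically (for example once per logging interval) would give a time series of both values; some GPUs do not report power draw, in which case `power_watts` is `None`.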
-
Hi,
I want to monitor the validation loss along with the training loss and plot both to check whether the model is overfitting on the KITTI dataset.
Could you please help me with this?
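
For illustration, here is a minimal sketch of the kind of tracking being asked about. It is not taken from this repository: `LossHistory` and its methods are hypothetical, and the overfitting check is an assumed heuristic (validation loss has not improved for `patience` epochs while training loss is still lower than it was at the best validation epoch).

```python
from typing import List, Optional


class LossHistory:
    """Hypothetical helper storing one training and one validation loss per epoch."""

    def __init__(self) -> None:
        self.train: List[float] = []
        self.val: List[float] = []

    def record(self, train_loss: float, val_loss: float) -> None:
        """Append the losses measured at the end of an epoch."""
        self.train.append(train_loss)
        self.val.append(val_loss)

    def overfitting_since(self, patience: int = 3) -> Optional[int]:
        """Return the 0-based epoch with the best validation loss if the
        heuristic flags overfitting, otherwise None."""
        if not self.val:
            return None
        best_epoch = min(range(len(self.val)), key=self.val.__getitem__)
        epochs_without_improvement = len(self.val) - 1 - best_epoch
        train_still_improving = self.train[-1] < self.train[best_epoch]
        if epochs_without_improvement >= patience and train_still_improving:
            return best_epoch
        return None
```

The two lists `history.train` and `history.val` can be passed to any plotting library to draw both curves on the same axes; a widening gap between them after the returned epoch is the usual visual sign of overfitting.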
-
### Organization Name
Eyes4IT
### Main office location
Paris
### What regions of the world do you serve?
Europe
### Business description
We help companies (small to big) to scale their business…