Open chavinlo opened 1 year ago
Hi @chavinlo!
Thanks for the feature request! We're busy with another project built on top of hivemind at the moment, but will consider adding this feature when we get back to working on the hivemind core.
Currently, we just use external utilities like nvtop to monitor the traffic (input/output in Mbps). It doesn't display the progress of the gradient sync directly, but one can usually get a sense of it by watching the spikes in traffic (e.g., if previous gradient syncs took 30 sec and the current spike has lasted for 15 sec, you know that the sync progress is about 50%).
I agree though that an explicit way to watch the progress would be much more convenient.
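The spike-based estimate described above can be written down as a small sketch. This is not part of hivemind; the function names `throughput_mbps` and `estimate_sync_progress` are hypothetical, and the estimate assumes that the current sync moves roughly the same amount of data at roughly the same speed as the previous one.

```python
def throughput_mbps(bytes_before: int, bytes_after: int, seconds: float) -> float:
    """Convert the difference between two byte-counter readings into megabits per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (bytes_after - bytes_before) * 8 / 1e6 / seconds


def estimate_sync_progress(spike_elapsed_s: float, previous_sync_s: float) -> float:
    """Estimate sync progress as a fraction in [0, 1].

    Assumption (hypothetical heuristic): the current sync will take about
    as long as the previous one, so progress is elapsed / previous duration.
    """
    if previous_sync_s <= 0:
        raise ValueError("previous_sync_s must be positive")
    return min(max(spike_elapsed_s / previous_sync_s, 0.0), 1.0)


# Example from the comment above: previous syncs took 30 sec, current spike has lasted 15 sec.
progress = estimate_sync_progress(15, 30)
print(f"estimated sync progress: {progress:.0%}")
```

The byte counters can come from any source that exposes cumulative totals, for example `psutil.net_io_counters()` (fields `bytes_sent` and `bytes_recv`) sampled at two points in time.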
Would the desire be to monitor at the p2pd layer, or at the Python layer?
libp2p has some metrics built in: https://github.com/libp2p/go-libp2p/tree/master/core/metrics
But it might be easier to just do it inside the Python DHT class?
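If the counting were done at the Python layer, one possible shape is a small thread-safe counter that the send/receive code paths update. The sketch below is purely illustrative: `TrafficCounter` and its methods are hypothetical names, not an existing hivemind or libp2p API, and where exactly to call `record_sent` / `record_received` inside the DHT or averaging code is left open.

```python
import threading
import time


class TrafficCounter:
    """Hypothetical helper: accumulates byte counts and reports rates in Mbps."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._sent = 0
        self._received = 0
        # (timestamp, sent, received) at the time of the last rate query
        self._last = (clock(), 0, 0)

    def record_sent(self, num_bytes: int) -> None:
        with self._lock:
            self._sent += num_bytes

    def record_received(self, num_bytes: int) -> None:
        with self._lock:
            self._received += num_bytes

    def rates_mbps(self) -> tuple[float, float]:
        """Return (output_mbps, input_mbps) averaged since the previous call."""
        with self._lock:
            now = self._clock()
            last_time, last_sent, last_received = self._last
            self._last = (now, self._sent, self._received)
            elapsed = now - last_time
            if elapsed <= 0:
                return 0.0, 0.0
            out_mbps = (self._sent - last_sent) * 8 / 1e6 / elapsed
            in_mbps = (self._received - last_received) * 8 / 1e6 / elapsed
            return out_mbps, in_mbps
```

A caller would invoke `rates_mbps()` periodically (e.g., once per second from a logging thread) and print or export the two values. The injectable `clock` argument only exists to make the class easy to test.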
We are building this feature into our platform. Would you like us to send it to you when it is ready?
Hi @Bandcompute01, sure, that would be awesome! If you are building this on top of hivemind, it would be even better to integrate this into the repository with a pull request. If you're interested, I am happy to assist with that.
Is your feature request related to a problem? Please describe. No.
Describe the solution you'd like Basically, display two stats: (1) network input/output in Mbps, and (2) a live count of the gradient sync progress in %.
Describe alternatives you've considered At the moment, I plan on just using an external tracker, but it would be nice to have this embedded.
Additional context I'm using PyTorch Lightning, as I had an issue with the native version of hivemind (#519).