Open anirudh2290 opened 5 years ago
Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Feature
Thanks for this feature idea. I'm interested in working on it and will post a PR once I get it working.
Thanks @ChaiBapchya. I think it requires some additional testing, and we also need to run some sanity performance tests. We need to add a switch to toggle it in the profiler API. It would also be nice to test it in a distributed training setting, though that is not strictly required.
The system team from UofT developed https://github.com/tbd-ai/tbd-tools, which profiles the memory footprint of MXNet (see http://www.sysml.cc/doc/2019/demo_24.pdf). @SerailHydra @olympian94 @izaakniksan @ArmageddonKnight If possible, we can reuse it and avoid duplicating work.
Thanks @eric-haibin-lin for the pointer! Will take a look
Hi all,
Thanks for your interest in our memory tools! I started building this tool for benchmarking purposes, and it only targets version 0.11.0. Later, my colleagues used the same techniques to build it on newer versions of MXNet for optimization purposes.
The open-sourced one targets a fairly old version, so I am not sure how helpful it is, since the codebase has changed a lot. I think @ArmageddonKnight has the memory profiling tool for a newer version. If you need some input from us in person, we would be happy to help integrate the tool into the main branch.
I have created a prototype for visualizing the memory pools on the GPU. I have added a doc to the cwiki explaining the feature and how to use the prototype: https://cwiki.apache.org/confluence/display/MXNET/MXNet+Memory+Profiling+Enhancements
I would need some help making this prototype ready to be PR'ed.
There are more improvements that can be done as mentioned in the cwiki. Listing some of them here:
Let me know if interested.