materialsvirtuallab / matgl

Graph deep learning library for materials
BSD 3-Clause "New" or "Revised" License

[Bug]: GPU utilization low and multi-core CPU utilization only around 100% during training #254

Closed hchenglab closed 1 month ago

hchenglab commented 2 months ago

Email (Optional)

hcheng.research@gmail.com

Version

v1.0.0

Which OS(es) are you using?

What happened?

I'm training an M3GNet model on a GPU, and the GPU utilization is very low and fluctuates a lot. The bottleneck seems to be on the CPU side: CPU utilization stays at only about 100% during training, i.e. roughly one core, even though the CPU has 64 cores (so the maximum utilization should be 6400%). Is this normal for M3GNet training?

Code snippet

No response

Log output

No response


kenko911 commented 1 month ago

Hi @hchenglab, I am not aware of any low GPU usage during training. I also don't think that CPU usage matters much for GPU training.
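
One way to check whether CPU-side batch preparation is really what keeps the GPU waiting is to time the two phases of the training loop separately. The snippet below is only a minimal sketch using the Python standard library; it is not part of matgl, and `profile_loop`, `slow_loader`, and `fast_step` are hypothetical names made up for illustration. The sleeps stand in for real graph collation and a real training step.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def profile_loop(batches: Iterable, step: Callable[[object], None]) -> Tuple[float, float]:
    """Return (seconds spent waiting for batches, seconds spent in step)."""
    load_time = 0.0
    step_time = 0.0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading / collation
        except StopIteration:
            break
        t1 = time.perf_counter()
        # For a real GPU step, the callable should wait for the device to
        # finish (GPU kernels are launched asynchronously), otherwise the
        # compute time is undercounted.
        step(batch)
        t2 = time.perf_counter()
        load_time += t1 - t0
        step_time += t2 - t1
    return load_time, step_time


def slow_loader(n_batches: int) -> Iterator[int]:
    """Hypothetical stand-in for a single-process data loader."""
    for i in range(n_batches):
        time.sleep(0.02)  # pretend CPU-side batch preparation
        yield i


def fast_step(batch: object) -> None:
    """Hypothetical stand-in for one forward/backward pass."""
    time.sleep(0.002)


load_s, step_s = profile_loop(slow_loader(5), fast_step)
print(f"waiting for data: {load_s:.3f} s, training step: {step_s:.3f} s")
print(f"fraction of time waiting for data: {load_s / (load_s + step_s):.0%}")
```

If most of the wall time turns out to be spent waiting for batches, the loader is the likely bottleneck, and preparing batches in several worker processes (if the data loader in use supports it) is the usual thing to try. If most of the time is in the step itself, CPU usage is probably not what limits the GPU.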