When I start running the code, CPU utilization is about 2000% and the training speed is acceptable. Over time, however, CPU utilization falls to about 200% to 300%. GPU utilization is always low, never above 30%. So I think the training runs mainly on the CPU rather than the GPU, and the CPU is what limits the training speed. How can I fix this and speed up training?
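One way to test the guess that the CPU side is the bottleneck is to time the two halves of each iteration separately: how long the loop waits for the next batch, and how long the training step itself takes. The following is only a minimal sketch, not tied to any framework. `profile_loop`, `slow_batches`, and the no-op step are hypothetical stand-ins for the real data loader and training step, which the question does not show.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def profile_loop(
    batches: Iterable[T],
    step: Callable[[T], object],
    max_steps: int = 50,
) -> Tuple[float, float, int]:
    """Return (seconds spent waiting for data, seconds spent in step, steps run)."""
    data_seconds = 0.0
    step_seconds = 0.0
    steps = 0
    iterator = iter(batches)
    while steps < max_steps:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent here is input-pipeline time
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)  # time spent here is the training step
        t2 = time.perf_counter()
        data_seconds += t1 - t0
        step_seconds += t2 - t1
        steps += 1
    return data_seconds, step_seconds, steps


def slow_batches(n: int, delay: float = 0.01) -> Iterator[int]:
    """Hypothetical loader that simulates slow CPU-side preprocessing."""
    for i in range(n):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    data_s, step_s, n = profile_loop(slow_batches(20), lambda batch: None)
    print(f"steps={n} data={data_s:.3f}s step={step_s:.3f}s")
```

If most of the time is spent waiting for data, the GPU is probably idle because the input pipeline cannot keep up, and that is the part worth optimizing. One caveat: if the framework launches GPU work asynchronously, the step function has to block until the GPU has finished, otherwise the step time is under-counted.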