nkolot / GraphCMR

Repository for the paper "Convolutional Mesh Regression for Single-Image Human Shape Reconstruction"
BSD 3-Clause "New" or "Revised" License
425 stars · 67 forks

CPU occupancy is very high #20

Closed Maqingyang closed 5 years ago

Maqingyang commented 5 years ago

When I run either the train or eval code, CPU occupancy is very high, which makes it hard to run multiple tasks on a multi-GPU machine due to the CPU bottleneck. Did you encounter this problem in your training? Could you give me some hints about which part of your implementation may be CPU-intensive? Thank you very much! My CPU is an Intel i9-9900K. For simplicity, I ran the evaluation code, i.e. eval.py. [screenshots: CPU usage for one task, and for multiple tasks on a two-GPU machine]

Maqingyang commented 5 years ago

It seems that the high CPU occupancy comes from the dataloader: if I use a constant input_batch instead of the dataloader, it doesn't use nearly as much CPU.

[screenshot]

input_batch.json contains one pre-stored batch from the dataloader.

nkolot commented 5 years ago

Maxing out the CPU is generally a good thing, because it means the input pipeline is efficient. You can reduce the num_workers option to use fewer worker processes if this is an issue on your system, but that will also reduce training speed.
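For anyone unfamiliar with the option, here is a minimal sketch of the trade-off (the dataset here is a stand-in, not the repo's actual dataset class):

```python
# Sketch: trading CPU usage for input-pipeline throughput via num_workers.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(256, 3, 224, 224))

# num_workers=0 loads batches in the main process (lowest CPU footprint);
# higher values spawn that many loader processes and consume more CPU.
loader = DataLoader(dataset, batch_size=16, num_workers=0, pin_memory=True)

for (batch,) in loader:
    pass  # forward pass would go here
```

Setting num_workers=0 is also a quick way to confirm whether the dataloader is the source of the load.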

Maqingyang commented 5 years ago

Thanks for your advice. I was looking carefully for the CPU-intensive operation, and I found that crop() in imutils.py is very CPU-intensive. In fact, if I don't use crop() in base_dataset.py, CPU occupancy drops to about 1/4 of what it was! That's very strange; I am investigating further. I don't know how it behaves on your machine.

Maqingyang commented 5 years ago

I located the problem in imutils.py, in the function transform():

    if invert:
        t = np.linalg.inv(t)

This inverse operation is very CPU-intensive. Could this operation be avoided, or moved to the GPU?
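One way to avoid the general-purpose solver entirely (a sketch, assuming t is a 3x3 homogeneous 2D transform whose last row is [0, 0, 1], which I believe is what the code builds) is to invert it in closed form:

```python
import numpy as np

def invert_affine(t):
    """Closed-form inverse of a 3x3 homogeneous 2D affine matrix
    (last row assumed [0, 0, 1]); avoids the general LAPACK solver."""
    a = t[:2, :2]  # 2x2 linear part (scale/rotation)
    b = t[:2, 2]   # translation
    det = a[0, 0] * a[1, 1] - a[0, 1] * a[1, 0]
    a_inv = np.array([[ a[1, 1], -a[0, 1]],
                      [-a[1, 0],  a[0, 0]]]) / det
    t_inv = np.eye(3)
    t_inv[:2, :2] = a_inv
    t_inv[:2, 2] = -a_inv @ b
    return t_inv

t = np.array([[2.0, 0.0,  5.0],
              [0.0, 2.0, -3.0],
              [0.0, 0.0,  1.0]])
print(np.allclose(invert_affine(t), np.linalg.inv(t)))  # True
```

This is just a hand-rolled 2x2 inverse plus a translation flip, so it never touches BLAS/LAPACK.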

Maqingyang commented 5 years ago

I found a simple solution, which is surprisingly effective:

    if invert:
        # t = np.linalg.inv(t)  # CPU-intensive on my machine
        # requires `import torch` at the top of the file
        t_torch = torch.from_numpy(t)
        t_torch = torch.inverse(t_torch)
        t = t_torch.numpy()

If anyone has the same problem, this may be worth a try.
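A quick way to check whether the swap actually helps on a given machine is a micro-benchmark (a sketch; the 3x3 matrix here is illustrative, not taken from the repo):

```python
import timeit

import numpy as np
import torch

t = np.eye(3) + 0.1 * np.random.rand(3, 3)

# Time 10k small inverses through each backend.
np_time = timeit.timeit(lambda: np.linalg.inv(t), number=10_000)
torch_time = timeit.timeit(
    lambda: torch.inverse(torch.from_numpy(t)).numpy(), number=10_000)

print(f"np.linalg.inv : {np_time:.3f}s for 10k calls")
print(f"torch.inverse : {torch_time:.3f}s for 10k calls")
```

The relative timings vary a lot with the BLAS backend NumPy is linked against, so results on one machine won't necessarily transfer.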

nkolot commented 5 years ago

I will look into it. I'm surprised that numpy is so slow at the simple task of inverting a 4x4 matrix.
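One possible explanation (an assumption, not confirmed in this thread) is that np.linalg.inv dispatches to a multithreaded BLAS/LAPACK backend, so each tiny inverse pays thread-pool overhead, and many dataloader workers multiply that cost. Capping the BLAS thread count before NumPy is imported is sometimes enough to restore sane CPU usage:

```python
# Sketch: cap BLAS/LAPACK threads before importing numpy, so small
# linear-algebra calls don't fan out across a thread pool.
import os
os.environ["OMP_NUM_THREADS"] = "1"       # OpenMP-based backends
os.environ["OPENBLAS_NUM_THREADS"] = "1"  # OpenBLAS
os.environ["MKL_NUM_THREADS"] = "1"       # Intel MKL

import numpy as np

t = np.eye(3)
t_inv = np.linalg.inv(t)  # result is unchanged, just single-threaded
```

These environment variables must be set before the first `import numpy` to take effect.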