facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
MIT License

MusicGen on A100/A10G/3090 is Single Core CPU Bound #192

Open zaptrem opened 1 year ago

zaptrem commented 1 year ago

Even with a batch size of one I'm getting results like this across the board, and identical inference times on an A100, A10G, and 3090 for the large and medium models at batch sizes 1-4. [screenshot]

Is this something that can be fixed on my end? If not, what's the cause?

iAlborz commented 1 year ago

Same issue on an M1 Max: only one core is being used.

carlthome commented 1 year ago

Interesting find! I too was confused by not seeing clear speedups when switching from a T4 to a V100 with the demo notebook. I just assumed the autoregressive nature of the model means there's a loop around the forward pass, unamenable to GPU parallelism.
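To illustrate that point, here is a toy sketch (not MusicGen's actual decoder) of why an autoregressive loop can be CPU bound: each step's input depends on the previous step's output, so the steps run strictly sequentially, and per-step Python and kernel-launch overhead can dominate the small per-token forward pass.

```python
import time

import torch

# Toy autoregressive loop (illustrative only; not MusicGen's code).
# Each step depends on the previous output, so steps cannot run in parallel.
device = "cuda" if torch.cuda.is_available() else "cpu"
layer = torch.nn.Linear(512, 512).to(device)

def generate(steps: int) -> float:
    """Run `steps` sequential forward passes; return wall-clock seconds."""
    x = torch.randn(1, 512, device=device)
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(steps):  # GPU idles while Python sets up each step
            x = torch.tanh(layer(x))
    if device == "cuda":
        torch.cuda.synchronize()  # include queued GPU work in the timing
    return time.perf_counter() - start

print(f"{generate(200):.4f}s for 200 sequential steps on {device}")
```

Because each step here is tiny, wall-clock time is dominated by the loop overhead rather than raw FLOPs, which is consistent with a faster GPU showing little speedup.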

Pozaza commented 1 year ago

You need to reinstall torch.

Redskull-127 commented 1 year ago

Hey, this might be helpful: https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-for-python.html

I guess this is a very odd Intel driver problem. You may find several custom libraries that you need to swap out in the code.

zaptrem commented 1 year ago

@Pozaza @Redskull-127 All of that is managed by our cloud provider (with the exception of the 3090 from which the screenshot originates). However, seeing as our other models do not encounter this issue I think the cause is more likely related to the specifics of MusicGen/AudioCraft. Is there something special about MusicGen that relies on Intel's Python distro, for example?

niatro commented 1 year ago

Same issue here: my 4090 GPU is not being used, only the CPU. [screenshot]

zaptrem commented 1 year ago

> Same issue here: my 4090 GPU is not being used, only the CPU. [screenshot]

This doesn't show much. You need to expose the logical processors and use Afterburner or a similar tool to track actual GPU usage.
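On NVIDIA cards, a quick alternative to Afterburner is querying the driver directly (a sketch; requires `nvidia-smi` from the NVIDIA driver package):

```shell
# One-shot GPU utilization snapshot; add "-l 1" to refresh every second
# while generation is running.
nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv \
  || echo "nvidia-smi not available on this machine"
```

If `utilization.gpu` sits near zero while one CPU core is pegged, that matches the single-core-bound behavior reported above.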

niatro commented 1 year ago

Yep, you are right. Anyway, I uninstalled audiocraft and installed it again, making sure to create a proper conda environment. Unfortunately the guide in the repository is not straightforward, but after I made an environment with Python 3.9, PyTorch 2.0.0, and ffmpeg and cloned the repository again, the project worked fine. This issue is closed for me. Thanks.
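For anyone wanting to reproduce that setup, a sketch of the described environment (the env name is arbitrary and the exact commands are assumed from the standard conda/pip workflow, not taken from the repo's guide):

```shell
# Fresh environment with the versions mentioned above
conda create -n audiocraft python=3.9 -y
conda activate audiocraft
conda install -c conda-forge ffmpeg -y
pip install torch==2.0.0

# Install audiocraft from a fresh clone
git clone https://github.com/facebookresearch/audiocraft.git
cd audiocraft
pip install -e .
```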

Redskull-127 commented 1 year ago

@zaptrem strange :(

zaptrem commented 1 year ago

> Yep, you are right. Anyway, I uninstalled audiocraft and installed it again, making sure to create a proper conda environment. Unfortunately the guide in the repository is not straightforward, but after I made an environment with Python 3.9, PyTorch 2.0.0, and ffmpeg and cloned the repository again, the project worked fine. This issue is closed for me. Thanks.

Can you post a screenshot of your logical processor (e.g., individual cores/hyperthreads) and GPU utilization graphs during inference?

mepc36 commented 1 year ago

I'm seconding @zaptrem's request to @niatro; please post the following, it'd be a huge help!

> Can you post a screenshot of your logical processor (e.g., individual cores/hyperthreads) and GPU utilization graphs during inference?

niatro commented 1 year ago

Do you mean this graph? [screenshot]

And this graph? [screenshot]

Both were captured during inference.

Redskull-127 commented 1 year ago

> Yep, you are right. Anyway, I uninstalled audiocraft and installed it again, making sure to create a proper conda environment. Unfortunately the guide in the repository is not straightforward, but after I made an environment with Python 3.9, PyTorch 2.0.0, and ffmpeg and cloned the repository again, the project worked fine. This issue is closed for me. Thanks.

Looks like you're right, man! Thanks for the support.

zaptrem commented 1 year ago

@niatro Close, but can you right-click the CPU graph and select Change graph to > Logical processors? I'm trying to figure out what your single-core utilization looks like. Also, are you running this directly on Windows or through WSL? And are you sure you were using Torch 2.0.0 and not 2.0.1? I reinstalled those versions and it made no difference.
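A quick way to rule out the most common culprits in this thread (wrong torch version, CPU-only fallback, or an intra-op thread cap) is a short sanity check:

```python
import torch

# Environment sanity check for the symptoms discussed in this issue.
print("torch version:", torch.__version__)          # 2.0.0 vs 2.0.1 etc.
print("CUDA available:", torch.cuda.is_available()) # False => CPU-only inference
print("CPU threads PyTorch may use:", torch.get_num_threads())

# If the thread count is stuck at 1, raising it is one experiment to try
# (a hypothetical tweak, not a confirmed fix for this issue):
# torch.set_num_threads(8)
```

If `torch.cuda.is_available()` is False despite a CUDA build being installed, that alone would explain CPU-only inference regardless of the GPU.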

zeke-john commented 7 months ago

@carlthome Any update? Currently facing the same issue.