hpcaitech / ColossalAI-Examples

Examples of training models with hybrid parallelism using ColossalAI
Apache License 2.0
334 stars 102 forks

ModuleNotFoundError: No module named 'colossal_layer_norm_cuda' #94

Open 200987299 opened 2 years ago

200987299 commented 2 years ago

Dear developers,

I am trying to run the BERT example, but I get this error. Any hints on how to fix it?

Thanks

FrankLeeeee commented 2 years ago

Hi, this is because your CUDA extension has not been built. Can you make sure that your PyTorch build uses the same CUDA version as your CUDA runtime? One way to check is to add the -v flag to your pip install command. If the CUDA extension is not built, the log will report the version mismatch.
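
In case it helps, here is a rough sketch of how the two versions could be compared from Python. It is not part of ColossalAI. The helper names are made up for illustration, and it assumes that the CUDA toolkit used to build the extension is the `nvcc` found on your PATH, which may not hold on every machine. It also checks whether the `colossal_layer_norm_cuda` module from the error message can be found at all.

```python
import importlib.util
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output):
    """Extract the 'major.minor' release from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def torch_cuda_version():
    """CUDA version PyTorch was built with, or None if it cannot be determined."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda  # None for CPU-only builds


def toolkit_cuda_version():
    """CUDA version of the nvcc on PATH (assumption: this is the build toolkit)."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def versions_match(torch_version, toolkit_version):
    """True only if both versions are known and agree on major.minor."""
    if torch_version is None or toolkit_version is None:
        return False
    return torch_version.split(".")[:2] == toolkit_version.split(".")[:2]


def extension_is_built(name="colossal_layer_norm_cuda"):
    """True if a top-level module with this name can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    torch_version = torch_cuda_version()
    toolkit_version = toolkit_cuda_version()
    print("PyTorch built with CUDA:", torch_version)
    print("nvcc on PATH reports:", toolkit_version)
    print("Versions match:", versions_match(torch_version, toolkit_version))
    print("Extension found:", extension_is_built())
```

If the two versions differ, or one of them prints as `None`, that would be consistent with the extension not being built, but treat the output as a hint rather than a definitive diagnosis.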

200987299 commented 2 years ago

Hi, could you be more specific? I tried reinstalling everything a few times, but it failed every time. The output of pip install -v showed that everything was fine.

Thanks

binmakeswell commented 2 years ago

Hi @200987299, we have updated and simplified the installation process, and the example code has been updated as well. Please try reinstalling. Thanks.