Linaqruf / kohya-trainer

Adapted from https://note.com/kohya_ss/n/nbf7ce8d80f29 for easier cloning
Apache License 2.0
1.84k stars, 304 forks

BLIP stopped working & solution. #173

Closed: teebarjunk closed this issue 1 year ago

teebarjunk commented 1 year ago

Got this error:

/usr/local/lib/python3.9/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/usr/local/lib/python3.9/dist-packages/torchvision/image.so: undefined symbol: _ZN3c104cuda20CUDACachingAllocator9allocatorE'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
src/tcmalloc.cc:283] Attempt to free invalid pointer 0x7fc1394c6558 

I asked ChatGPT, which suggested this:

!apt-get install libjpeg-dev libpng-dev
!pip install --upgrade --force-reinstall torchvision

Which led to this error:

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 1.13.1+cu117 with CUDA 1107 (you have 2.0.0+cu117)
    Python  3.9.16 (you have 3.9.16)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details

So ChatGPT then suggested:

pip install xformers

BLIP worked again after running that.
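
For anyone hitting the same warning, here is a minimal sketch of how the mismatch could be read straight out of the xFormers message. The helper names `parse_xformers_mismatch` and `same_release` are hypothetical, and the regular expression assumes the warning keeps the wording shown above.

```python
import re

# Matches the line xFormers printed in the warning above, e.g.
# "PyTorch 1.13.1+cu117 with CUDA 1107 (you have 2.0.0+cu117)".
# Assumption: the message format stays the same across xFormers releases.
_PATTERN = re.compile(
    r"PyTorch\s+(\S+)\s+with CUDA\s+\S+\s+\(you have\s+(\S+)\)"
)


def parse_xformers_mismatch(warning_text):
    """Return (built_for, installed) PyTorch versions, or None if not found."""
    match = _PATTERN.search(warning_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def same_release(built_for, installed):
    """Compare two versions, ignoring the local build tag such as '+cu117'."""
    return built_for.split("+")[0] == installed.split("+")[0]


warning = "    PyTorch 1.13.1+cu117 with CUDA 1107 (you have 2.0.0+cu117)"
built_for, installed = parse_xformers_mismatch(warning)
print(built_for, installed, same_release(built_for, installed))
```

When `same_release` reports a mismatch, as it does for the warning quoted here, xformers has to be reinstalled against the PyTorch version that is actually present.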

Linaqruf commented 1 year ago

Thanks for reporting. I pushed a fix a few minutes ago. Colab updated PyTorch to 2.0, so we can no longer use xformers 0.0.16 and below.

I fixed it by updating the xformers version to 0.0.18. This is still a temporary fix, though, because I haven't tested the notebook since the torch 2.0.0 update. I'll do that tomorrow.
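
The version choice described in this comment could be sketched roughly as follows. This is not the notebook's actual code: `torch_major` and `pick_xformers_pin` are hypothetical helpers, and pairing torch 1.x with xformers 0.0.16 is an assumption inferred from this thread rather than something stated in it.

```python
def torch_major(version):
    """Extract the major version number from a string like '2.0.0+cu117'."""
    return int(version.split("+")[0].split(".")[0])


def pick_xformers_pin(torch_version):
    """Return a pip requirement string for xformers given a torch version."""
    # torch 2.x -> xformers 0.0.18 is the fix described in this comment.
    # torch 1.x -> xformers 0.0.16 is an assumption inferred from the thread.
    if torch_major(torch_version) >= 2:
        return "xformers==0.0.18"
    return "xformers==0.0.16"


print(pick_xformers_pin("2.0.0+cu117"))
```

In a Colab cell the argument would normally be `torch.__version__`, and the returned string could be passed to `pip install`.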

teebarjunk commented 1 year ago

Thank you for everything you've done. This tool has been great.