isl-org / lang-seg

Language-Driven Semantic Segmentation
MIT License

Error with Pytorch Encoding #41

Open zixinglin07 opened 1 year ago

zixinglin07 commented 1 year ago

I am running Windows and am having trouble installing this project, specifically the torch encoding package.

The main errors include: error: ninja: error: loading 'build.ninja': The system cannot find the file specified. and Error building extension 'enclib_cpu'

These errors usually come with a long list of other errors, presumably from dependencies. When I tried to build the package via Docker, similar issues arose as well.

Things I have tried:

  1. I have ensured that Visual Studio and the C++ compilers are properly installed and that the environment variables are set
  2. I have installed cudatoolkit along with various PyTorch libraries with CUDA support (torch.cuda.is_available() returns True)
  3. I tried to build a Docker image by following the PyTorch Encoding installation guide, but the error still occurs

I am running on a Windows 10 machine.

Are there any fixes or guides to get lang-seg to work under these circumstances?
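
For anyone reporting the same failure, it may help to attach the environment details that the checklist above touches on. The following is a minimal sketch, not part of lang-seg or PyTorch-Encoding: the function name `collect_build_environment` and the list of tools it looks up are assumptions chosen for illustration. It only reads version strings and searches PATH, so it should be safe to run on a broken setup.

```python
import platform
import shutil


def collect_build_environment():
    """Gather facts that often matter when a PyTorch C++/CUDA extension fails to build."""
    info = {
        "os": platform.system(),
        "python": platform.python_version(),
        "torch": None,
        "torch_cuda": None,
        "cuda_available": None,
    }
    try:
        import torch  # optional: the report still works if torch is missing

        info["torch"] = torch.__version__
        # CUDA version torch was built against (None for CPU-only builds).
        info["torch_cuda"] = torch.version.cuda
        info["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        pass
    # Build tools looked up on PATH; None means "not found".
    # The tool list is an assumption, adjust it for your platform.
    for tool in ("ninja", "nvcc", "cl", "g++"):
        info[tool] = shutil.which(tool)
    return info


report = collect_build_environment()
for key, value in report.items():
    print(f"{key}: {value}")
```

A `None` next to `ninja` or `cl` on Windows would be a first thing to rule out before digging into the longer error list.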

HuadongTang commented 1 year ago

"3. Tried to build a docker image by following the Pytorch Encoding installation guide, but error still occurs"

Same problem here.

qiuzhen8484 commented 1 year ago

Ditto. Error message: error: class template "ScalarConvert" has already been defined

HarryCookson commented 1 year ago

Ditto with Error building extension 'enclib_cpu'

Possibly something that was deprecated in an old version of Pytorch? Have seen this issue on another repo about three and a half years ago. No idea if it has any relevance, but just in case I've linked it.

XiShuFan commented 1 year ago

@qiuzhen8484 Have you solved this problem? Thank you!

qiuzhen8484 commented 1 year ago

@XiShuFan Not yet. I have no idea how to solve it.

geyanqi commented 1 year ago

same issue.

robin-karlsson0 commented 1 year ago

After encountering the same 'class template "ScalarConvert" has already been defined' error, I finally managed to install PyTorch-Encoding. It worked when I built from source with a recent PyTorch whose CUDA version matches my installed system CUDA version.

python 3.9.17
cuda 11.8
pytorch 2.0.0+cu118

# Installing PyTorch-Encoding
git clone https://github.com/zhanghang1989/PyTorch-Encoding && cd PyTorch-Encoding
python setup.py install

Ref: https://github.com/zhanghang1989/PyTorch-Encoding/pull/418
Ref: https://hangzhang.org/PyTorch-Encoding/notes/compile.html

Have not yet tried running the LSeg code with this setup. Will do soon.
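
The "matching CUDA version" condition above can be checked before starting a long build. This is a rough sketch under stated assumptions: the helper names `parse_nvcc_release` and `cuda_versions_match` are made up for illustration, and the sample string is only an illustrative line in the style that `nvcc --version` prints, not output captured from this setup.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release (e.g. "11.8") found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(torch_cuda, system_cuda):
    """Compare major.minor of the two version strings; a missing value never matches."""
    if torch_cuda is None or system_cuda is None:
        return False
    return torch_cuda.split(".")[:2] == system_cuda.split(".")[:2]


# Illustrative nvcc output line (assumption, not taken from the thread).
sample = "Cuda compilation tools, release 11.8, V11.8.89"
system_cuda = parse_nvcc_release(sample)
print(system_cuda, cuda_versions_match("11.8", system_cuda))
```

In practice the first argument would be `torch.version.cuda` and the text would come from running `nvcc --version`, for example via `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`.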

robin-karlsson0 commented 1 year ago

Could not run the code as I got an undefined symbol error when trying to import any modules from Pytorch-Encoding. Tried to install the library on an old node with cuda 10.2 but that didn't work either because of a gcc version incompatibility.

RuntimeError: The current installed version of g++ (9.4.0) is greater than the maximum required version by CUDA 10.2 (8.0.0). Please make sure to use an adequate version of g++ (>=5.0.0, <=8.0.0).
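
The range check behind that message can be reproduced with plain version-tuple comparison. This is only a sketch of the comparison, not PyTorch's actual implementation; the function names are hypothetical and the bounds are copied from the error text above.

```python
def parse_version(text):
    """Turn "9.4.0" into (9, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def gxx_supported(gxx, minimum, maximum):
    """True when minimum <= gxx <= maximum, mirroring the range in the error message."""
    return parse_version(minimum) <= parse_version(gxx) <= parse_version(maximum)


# Bounds taken from the RuntimeError for CUDA 10.2.
print(gxx_supported("9.4.0", "5.0.0", "8.0.0"))  # False: 9.4.0 is above the maximum
print(gxx_supported("7.5.0", "5.0.0", "8.0.0"))  # True: inside the allowed range
```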

TianhangXiang commented 1 year ago

same issue.

TianhangXiang commented 1 year ago

Hi guys, I think I have solved the issue! I installed the packages listed in requirements.txt with PyTorch 1.9 and CUDA 11.1. The latest PyTorch Encoding is not compatible with PyTorch 1.9, so I rolled back to an earlier commit and installed the package successfully.

DDPYZ commented 7 months ago

I have the same issue. Has anyone solved it? I think it's a version incompatibility issue.

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-10-bb83344f9863> in <module>
     11 from torch.nn.parallel.scatter_gather import gather
     12 
---> 13 import encoding.utils as utils
     14 from encoding.nn import SegmentationLosses, SyncBatchNorm
     15 from encoding.parallel import DataParallelModel, DataParallelCriterion

c:\Users\Frain\.conda\envs\Lseg\lib\site-packages\encoding\__init__.py in <module>
     11 """An optimized PyTorch package with CUDA backend."""
     12 from .version import __version__
---> 13 from . import nn, functions, parallel, utils, models, datasets, transforms

c:\Users\Frain\.conda\envs\Lseg\lib\site-packages\encoding\nn\__init__.py in <module>
     10 
     11 """Encoding NN Modules"""
---> 12 from .encoding import *
     13 from .syncbn import *
     14 from .customize import *

c:\Users\Frain\.conda\envs\Lseg\lib\site-packages\encoding\nn\encoding.py in <module>
     16 from torch.nn.modules.utils import _pair
     17 
---> 18 from ..functions import scaled_l2, aggregate, pairwise_cosine
...
--> 297 
    298     encoding = None
    299     if 'b' not in mode:

ImportError: No module named 'enclib_cpu'

DDPYZ commented 7 months ago

@TianhangXiang Hello, can you tell me which version of PyTorch you rolled back to?

TianhangXiang commented 6 months ago

@DDPYZ It was a long time ago... I think I just installed PyTorch Encoding from https://github.com/zhanghang1989/PyTorch-Encoding/tree/331ecdd5306104614cb414b16fbcd9d1a8d40e1e, which is not the latest version, and the problem was solved. The PyTorch version is still 1.9.
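
The rollback described above could be scripted roughly as follows. This is a sketch that combines the commit hash from this comment with the clone and `python setup.py install` commands posted earlier in the thread; the helper names `pinned_install_steps` and `run_steps` are hypothetical, and it has not been verified that this sequence builds cleanly on every setup.

```python
import subprocess
import sys

REPO_URL = "https://github.com/zhanghang1989/PyTorch-Encoding"
COMMIT = "331ecdd5306104614cb414b16fbcd9d1a8d40e1e"


def pinned_install_steps(repo_url=REPO_URL, commit=COMMIT, workdir="PyTorch-Encoding"):
    """Return (argv, cwd) pairs for cloning, pinning the commit, and installing."""
    return [
        (["git", "clone", repo_url, workdir], None),
        (["git", "checkout", commit], workdir),
        ([sys.executable, "setup.py", "install"], workdir),
    ]


def run_steps(steps):
    """Execute each step in order, stopping at the first failure."""
    for argv, cwd in steps:
        subprocess.run(argv, cwd=cwd, check=True)


# Only print the plan here; call run_steps(pinned_install_steps()) to execute it.
for argv, cwd in pinned_install_steps():
    print(cwd or ".", "$", " ".join(argv))
```

Using `sys.executable` keeps the install inside whichever environment runs the script, which matters when PyTorch 1.9 lives in a dedicated conda environment.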