libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

Installation issues #347

Open gustrain opened 1 year ago

gustrain commented 1 year ago

We're trying to replicate the results from the FFCV paper, and are having difficulty setting up a working environment. The suggested conda install command appears to hang (no progress after 2 hours, 100% CPU usage) using a fresh conda installation (as suggested by #85). The suggested troubleshooting tips for a conda install made no apparent change.

We were able to build and run the provided conda-less dockerfile, however we're still unable to use FFCV, as seen below.

root@ac2fad055eeb:/workspace# python
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ffcv
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/ffcv/__init__.py", line 1, in <module>
    from .loader import Loader
  File "/usr/local/lib/python3.8/dist-packages/ffcv/loader/__init__.py", line 1, in <module>
    from .loader import Loader, OrderOption
  File "/usr/local/lib/python3.8/dist-packages/ffcv/loader/loader.py", line 14, in <module>
    from ffcv.fields.base import Field
  File "/usr/local/lib/python3.8/dist-packages/ffcv/fields/__init__.py", line 3, in <module>
    from .rgb_image import RGBImageField
  File "/usr/local/lib/python3.8/dist-packages/ffcv/fields/rgb_image.py", line 5, in <module>
    import cv2
  File "/usr/local/lib/python3.8/dist-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/usr/local/lib/python3.8/dist-packages/cv2/__init__.py", line 175, in bootstrap
    if __load_extra_py_code_for_module("cv2", submodule, DEBUG):
  File "/usr/local/lib/python3.8/dist-packages/cv2/__init__.py", line 28, in __load_extra_py_code_for_module
    py_module = importlib.import_module(module_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.8/dist-packages/cv2/typing/__init__.py", line 169, in <module>
    LayerId = cv2.dnn.DictValue
AttributeError: module 'cv2.dnn' has no attribute 'DictValue'
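An AttributeError raised inside cv2's own bootstrap code, like the one above, is often reported when more than one OpenCV wheel (for example opencv-python alongside opencv-python-headless or a contrib build) is installed into the same site-packages. That cause is an assumption here and is not confirmed in this thread. A minimal sketch for checking it, using only the standard library (the helper names are hypothetical):

```python
from importlib import metadata


def filter_opencv(pairs):
    """Keep (name, version) pairs whose distribution name starts with 'opencv'."""
    return sorted(
        (name, version)
        for name, version in set(pairs)
        if name and name.lower().startswith("opencv")
    )


def find_opencv_distributions():
    """List every installed OpenCV distribution visible to this interpreter."""
    pairs = []
    for dist in metadata.distributions():
        # .get() returns None instead of raising if the Name field is missing.
        pairs.append((dist.metadata.get("Name"), dist.version))
    return filter_opencv(pairs)


if __name__ == "__main__":
    found = find_opencv_distributions()
    for name, version in found:
        print(f"{name}=={version}")
    if len(found) > 1:
        # Assumption: several OpenCV wheels share the cv2 package directory
        # and can overwrite each other's files.
        print("More than one OpenCV distribution is installed.")
```

If this prints more than one entry, uninstalling all of them and reinstalling a single OpenCV package is a reasonable thing to try before digging further.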

System details:

Any suggestions?

andrewilyas commented 1 year ago

Hi @gustrain! Which conda command are you using to install (the one that hangs)?

gustrain commented 1 year ago

Hi @andrewilyas -- thanks for the quick reply!

I'm running the following command, as suggested in the FFCV README:

conda create -y -n ffcv python=3.9 cupy pkg-config libjpeg-turbo opencv pytorch torchvision cudatoolkit=11.3 numba -c pytorch -c conda-forge

andrewilyas commented 1 year ago

Interesting, that command seems to work for me, with the difference being that I am on CUDA 11.6. I'm not 100% sure but there might be a compatibility issue between PyTorch 2.0 and CUDA 11.2 - can you try updating CUDA to 11.6 and see if the issue persists?
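When comparing setups like this, it can help to post the exact versions each environment resolves to. The following is a rough sketch, not part of FFCV: the function name and the default module list are assumptions, and the only library-specific calls used are torch.version.cuda and torch.cuda.is_available().

```python
import importlib
import platform


def describe_environment(modules=("torch", "torchvision", "cv2", "numba", "cupy")):
    """Collect version strings for the interpreter and the given modules."""
    report = {"python": platform.python_version()}
    for name in modules:
        try:
            mod = importlib.import_module(name)
        except Exception as exc:  # a broken install may raise more than ImportError
            report[name] = f"import failed: {type(exc).__name__}"
            continue
        report[name] = str(getattr(mod, "__version__", "unknown"))
        if name == "torch":
            # CUDA version PyTorch was built against (None for CPU-only builds).
            report["torch.version.cuda"] = str(mod.version.cuda)
            report["torch.cuda.is_available"] = str(mod.cuda.is_available())
    return report
```

Calling describe_environment() and printing the resulting dictionary in each environment gives a compact block of text to paste into the issue.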

gustrain commented 1 year ago

I updated CUDA, but this unfortunately did not seem to make any difference. I'll see if it just needs a bit more time, but as of right now it just seems to be spinning on "Solving environment," as it was doing before.

How long should the installation take when successful?

andrewilyas commented 1 year ago

It should eventually terminate within a day or so, but when things are working properly it usually takes about 30 minutes. The very long installation is something we ran into a few versions ago, but it should have been fixed a while back. What version of CUDA are you on now? If it's not too much trouble, can you try separating out the steps? First install PyTorch using the instructions from pytorch.org, and then run:

conda install cupy pkg-config libjpeg-turbo opencv numba -c conda-forge
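To tell a slow solve from a stuck one without watching the terminal, the second step can be wrapped in a timeout. This is only a sketch: run_with_timeout and the 45-minute limit are hypothetical choices (picked to sit a little above the roughly 30 minutes mentioned above), and it uses conda's --dry-run flag so that nothing is actually installed.

```python
import subprocess

# The second step from the comment above, as a dry run (solve only, no install).
SOLVE_CHECK = [
    "conda", "install", "--dry-run",
    "cupy", "pkg-config", "libjpeg-turbo", "opencv", "numba",
    "-c", "conda-forge",
]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, returncode).

    finished is False, and returncode is None, if the process was still
    running after timeout_s seconds; subprocess.run kills it in that case.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, None
    return True, completed.returncode


# Example usage (run inside the activated environment; raises
# FileNotFoundError if conda is not on PATH):
#   finished, code = run_with_timeout(SOLVE_CHECK, timeout_s=45 * 60)
#   print("solver finished" if finished else "solver still running after 45 min")
```

Running the dry run once with the full package list and once per package can also show which package makes the solver slow.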

aegonwolf commented 11 months ago

@andrewilyas thanks for the package! I was wondering whether FFCV is now compatible with newer versions of Python (3.11) and with PyTorch 2.0?