ApluUalberta commented 3 years ago

Set up issue

Hi there,

I tried to create appropriate labels for the specific issue that I'm experiencing, but I couldn't find a way to add the proper tags, so I apologize in advance.

I’m currently trying to set up Pytorch Lightning on Jetson Nano/Jetson Xavier NX by building from source. So far, I have tried following this thread here: #695

The requirements.txt has been changed and no longer lists torchvision and scikit-learn among its requirements. However, it still seems to seek torch>=1.4, apparently as a result of torchmetrics>=0.2.0 (within requirements.txt). My Jetson already has torch 1.8.0 and torchvision installed through pip3. I was wondering if anyone has successfully set up Pytorch Lightning on ARM64 with the new requirements layout.

Is there something I’m missing? I have also tried running setup.py to no avail. Thanks!
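One quick check for this kind of mismatch is to ask each interpreter directly whether it can import torch and which version it sees. A minimal sketch, assuming `python` and `python3` are the interpreter names on the device (the helper name `torch_version_for` is made up for illustration):

```shell
# Report the torch version visible to a given Python interpreter,
# or a fallback message if the interpreter or the package is missing.
torch_version_for() {
    "$1" -c 'import torch; print(torch.__version__)' 2>/dev/null \
        || echo "torch not importable"
}

# Assumption: these are the two interpreter names present on the device.
for py in python python3; do
    printf '%s: %s\n' "$py" "$(torch_version_for "$py")"
done
```

If the two lines differ, the pip front end being used may be installing into an interpreter that does not see the torch 1.8.0 wheel.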

Expected behavior

My pip install attempt: Selection_001

my pip install requirements attempt: Selection_004

my pip list: Selection_005

Environment

This is not a script bug as I'm only having trouble with setup using the pip package manager on Jetson/ARM64. Once again, I'm sorry for the tags!

PyTorch Version (e.g., 1.0): 1.8.0
OS (e.g., Linux): Linux
How you installed PyTorch (conda, pip, source): pip
Build command you used (if compiling from source): pip3 install pytorch-lighting/pip3 install -r requirements.txt
Python version: 3.6.9
CUDA/cuDNN version: CUDA 10.2, cuDNN 8.0.0
GPU models and configuration:
Any other relevant information:

Additional context

loic-beheshti commented 3 years ago

Hi,

I actually never tried to set up Pytorch Lightning directly on a Jetson but people are usually more interested in the deployment side with such devices. If this is your end goal, I suggest you look into this project https://github.com/neo-ai/neo-ai-dlr.

I had a few issues compiling on SageMaker Neo as a PyTorch model, so I suggest you convert your model to .onnx first. Depending on the Pytorch Lightning version you're using, you may also face issues with DLR. In the worst case, I suggest you convert your Pytorch Lightning model into a classic PyTorch one once trained (you mainly need to slightly modify the state dictionary).

This is how I personally deploy on Jetson devices from Pytorch Lightning modules. If this is what you're looking for, I would be happy to share more details.

ApluUalberta commented 3 years ago

"I actually never tried to set up Pytorch Lightning directly on a Jetson but people are usually more interested in the deployment side with such devices."

Thank you for the response! But I have already successfully deployed with ONNX and Libtorch C++ on the Jetson Xavier. Now, I specifically need to have the option of training on it. Thank you for the offer and project reference though, I'll definitely take a look!

ApluUalberta commented 3 years ago

I have fixed the issue but there are still some potential questions for future ARM64 developers that may be important. I may explore this question further.

The problem seemed to be either that the torch version I had was not 1.4.0, or that I needed to use pip instead of pip3. For some reason, pip installed the package for both the Python 2 pip and pip3. I would invite other people to evaluate this further. This is a rundown of my documentation of the process; it is subject to change as this library and the Jetson hardware are updated:

Pytorch Lightning Setup on Jetson Xavier NX

The general theme seems to be that we need to install Pytorch Lightning with the pip package manager instead of pip3

and/or that we need to have torch 1.4.0 with its corresponding torchvision in order to pass the torch>=1.4 requirement.

Not sure which, but it seems like having both also works.
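Since the open question is which interpreter each front end installs into, it may help to print that before going further. The sketch below relies on `pip --version`, which reports the pip location and the Python version it serves; the function name `show_pip_targets` is only illustrative:

```shell
# Print which Python installation each pip front end is bound to.
show_pip_targets() {
    for cmd in pip pip3; do
        if command -v "$cmd" >/dev/null 2>&1; then
            # e.g. "pip X.Y from /path/to/site-packages/pip (python 3.6)"
            printf '%s -> %s\n' "$cmd" "$("$cmd" --version)"
        else
            printf '%s -> not found\n' "$cmd"
        fi
    done
}

show_pip_targets
```

If both lines name the same Python 3 installation, that would explain why a plain pip install also shows up under pip3. Running `python3 -m pip install ...` is another way to make the target interpreter explicit.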

1. We need to retrieve PyTorch 1.4.0 and torchvision 0.5.0 (torch needs to be >=1.4, but it seems 1.4.0 specifically may be necessary). Prior to this, we already had pip3 and a matching torch and torchvision installation.

$ wget https://nvidia.box.com/shared/static/1v2cc4ro6zvsbu0p8h6qcuaqco1qcsif.whl -O torch-1.4.0-cp27-cp27mu-linux_aarch64.whl

$ git clone --branch v0.5.0 https://github.com/pytorch/vision torchvision   # see below for version of torchvision to download

2. We will need both pip and pip3 to install pytorch lightning

$ sudo apt-get update
$ sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
$ sudo apt-get install python-pip
$ sudo pip3 install -U pip testresources setuptools==49.6.0 # This step wasn't necessary but was done during the installation process for python 2 pip

3. Once we have the pip package manager, we need to install torch 1.4.0 in our Python 2 environment. We also need to edit the requirements.txt within the pytorch-lightning source tree

$ cd torchvision
$ export BUILD_VERSION=0.5.0 
$ python3 setup.py install --user
$ cd path/to/pytorch-lightning

Open requirements.txt and comment out the torch>=1.4 constraint as follows:
numpy>=1.17.2
#torch>=1.4 <---------------------------------------------------------------
future>=0.17.1  # required for builtins in setup.py
tqdm>=4.41.0
PyYAML>=5.1,<=5.4.1
fsspec[http]>=2021.4.0
tensorboard>=2.2.0, !=2.5.0  # 2.5.0 GPU CI error: 'Couldn't build proto file into descriptor pool!'
torchmetrics>=0.2.0
pyDeprecate==0.3.0
packaging
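The same edit can be scripted instead of done by hand. This is a minimal sketch, assuming the constraint sits on its own line starting with `torch>=` as in the listing above; `comment_out_torch` is a hypothetical helper name, and it writes a `.bak` copy first so the change is easy to undo:

```shell
# Comment out the line that begins with "torch>=" in a requirements file,
# leaving torchmetrics and every other entry untouched.
comment_out_torch() {
    req_file="$1"
    cp "$req_file" "$req_file.bak"    # keep a backup of the original
    sed 's/^torch>=/#torch>=/' "$req_file.bak" > "$req_file"
}
```

From inside the pytorch-lightning directory it would be called as `comment_out_torch requirements.txt`. Note that this file is only read when installing from the cloned source (for example with `pip install -r requirements.txt`), not when pip downloads the published package.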

4. Install (the pip install seems to also install for the pip3 manager)

$ pip install pytorch-lightning # Valid for both pip and pip3

5. We can continue to verify with pip list

$ pip3 list
$ pip list
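To compare the two lists without reading them in full, the output can be narrowed to the packages that matter here. A small sketch, assuming the default column layout of `pip list` (package name, whitespace, version); `filter_packages` is an illustrative name:

```shell
# Keep only the rows of a `pip list` table for the packages this setup cares about.
filter_packages() {
    grep -i -E '^(torch|torchvision|torchmetrics|pytorch-lightning)[[:space:]]'
}
```

It reads from standard input, so it would be used as `pip list | filter_packages` and `pip3 list | filter_packages`; the two outputs should then match line for line.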

It's difficult to screencap these results using shutter with the similarities highlighted, but your pip and pip3 lists should show the same packages at the same versions: Selection_013 Selection_014

Sources

Pytorch and Torchvision installation for Python 2 and 3: https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-8-0-now-available/72048

Python 2 pip installation: https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html

Pytorch Lightning forum post (seems correct to a degree but non-working for us): https://forums.developer.nvidia.com/t/pytorch-lightning-set-up-on-jetson-nano-xavier-nx/177329

Pytorch Lightning installation on Jetson Xavier NX - USAEng, Note.com: fullsize-en

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!

Lightning-AI / pytorch-lightning

Set up on Jetson Nano/Xavier NX #7408
