Closed ApluUalberta closed 3 years ago
Hi,
I actually never tried to set up Pytorch Lightning directly on a Jetson but people are usually more interested in the deployment side with such devices. If this is your end goal, I suggest you look into this project https://github.com/neo-ai/neo-ai-dlr.
I had a few issues compiling a Pytorch model with SageMaker Neo, so I suggest you convert your model to .onnx first. Depending on the Pytorch Lightning version you're using, you may also face issues with DLR. In the worst case, I suggest you convert your Pytorch Lightning model into a classic Pytorch one once trained (you mainly need to slightly modify the state dictionary).
This is how I personally deploy on Jetson devices from Pytorch Lightning modules. If this is what you're looking for, I would be happy to share more details.
Thank you for the response! But I have already successfully deployed with ONNX and Libtorch C++ on the Jetson Xavier. Now, I specifically need to have the option of training on it. Thank you for the offer and project reference though, I'll definitely take a look!
I have fixed the issue but there are still some potential questions for future ARM64 developers that may be important. I may explore this question further.
The problem seemed to be either that my installed torch version was not 1.4.0, or that I needed to use pip instead of pip3. For some reason, pip installed the package for both Python 2 (pip) and Python 3 (pip3). I would invite other people to evaluate this further. This is a rundown of my documentation on the process; it is subject to change as this library and the Jetson hardware are updated:
and/or that we need to have torch 1.4.0 with its corresponding torchvision in order to pass the torch>=1.4 requirement
Not sure which, but it seems like having both also works.
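To narrow down which of the two explanations applies, it may help to check which Python each pip is bound to and whether the torch seen by that interpreter satisfies the constraint. This is only a minimal sketch: the helper names `pip_python_version` and `version_ge` are made up for illustration, and `version_ge` assumes GNU `sort -V`, which is not a full PEP 440 comparison.

```shell
# Hypothetical helper: read `pip --version` output on stdin and print
# the Python version that pip is bound to (the part in parentheses).
pip_python_version() {
    sed -n 's/.*(python \([0-9.]*\)).*/\1/p'
}

# Hypothetical helper: succeed if version $1 >= version $2.
# Assumes GNU coreutils `sort -V`; local build tags are not handled specially.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example usage on the Jetson (not run here):
#   pip  --version | pip_python_version
#   pip3 --version | pip_python_version
#   version_ge "$(python3 -c 'import torch; print(torch.__version__)')" 1.4 && echo "torch OK"
```

If `pip` and `pip3` report different Python versions, a package installed through one will not be visible to the other, which may explain why the choice of pip mattered here.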
$ wget https://nvidia.box.com/shared/static/1v2cc4ro6zvsbu0p8h6qcuaqco1qcsif.whl -O torch-1.4.0-cp27-cp27mu-linux_aarch64.whl
$ git clone --branch v0.5.0 https://github.com/pytorch/vision torchvision # see below for version of torchvision to download
$ sudo apt-get update
$ sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
$ sudo apt-get install python-pip
$ sudo pip3 install -U pip testresources setuptools==49.6.0 # This step wasn't necessary but was done during the installation process for python 2 pip
$ cd torchvision
$ export BUILD_VERSION=0.5.0
$ python3 setup.py install --user
$ cd path/to/pytorch-lightning
Open requirements.txt and comment out the torch>=1.4 constraint as follows:
numpy>=1.17.2
#torch>=1.4 <---------------------------------------------------------------
future>=0.17.1 # required for builtins in setup.py
tqdm>=4.41.0
PyYAML>=5.1,<=5.4.1
fsspec[http]>=2021.4.0
tensorboard>=2.2.0, !=2.5.0 # 2.5.0 GPU CI error: 'Couldn't build proto file into descriptor pool!'
torchmetrics>=0.2.0
pyDeprecate==0.3.0
packaging
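The same edit to requirements.txt could probably be scripted instead of done by hand. A minimal sketch, assuming the constraint sits on its own line starting with `torch`; the function name `comment_out_torch` is hypothetical. The pattern requires a version operator or space right after `torch`, so `torchmetrics` is left alone, and a `.bak` copy of the original file is kept.

```shell
# Hypothetical helper: comment out the torch constraint in a requirements file.
# $1: path to requirements.txt. A backup is written to "$1.bak".
comment_out_torch() {
    sed -i.bak -E 's/^(torch[>=<!~ ].*)$/#\1/' "$1"
}

# Example usage (not run here):
#   comment_out_torch path/to/pytorch-lightning/requirements.txt
```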
$ pip install pytorch-lightning # Valid for both pip and pip3
$ pip3 list
$ pip list
It's difficult to screencap these results using Shutter with the similarities highlighted, but your pip and pip3 lists should both contain the following, at the same versions:
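As an alternative to screenshots, the two lists can be compared per package on the command line. A minimal sketch, assuming the default column output of `pip list` (package name in the first column, version in the second); the helper name `pkg_version` is made up for illustration.

```shell
# Hypothetical helper: read `pip list` output on stdin and print the
# version of the package named in $1 (case-insensitive match).
pkg_version() {
    awk -v pkg="$1" 'tolower($1) == tolower(pkg) { print $2 }'
}

# Example usage (not run here):
#   pip  list | pkg_version torch
#   pip3 list | pkg_version torch
#   [ "$(pip list | pkg_version torch)" = "$(pip3 list | pkg_version torch)" ] && echo "torch versions match"
```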
Pytorch and Torchvision installation for Python 2 and 3: https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-8-0-now-available/72048
Python 2 pip installation: https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html
Pytorch Lightning forum post (seems correct to a degree but non-working for us): https://forums.developer.nvidia.com/t/pytorch-lightning-set-up-on-jetson-nano-xavier-nx/177329
Pytorch Lightning installation on Jetson Xavier NX - USAEng, Note.com:
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!
Set up issue
Hi there,
I tried to create appropriate labels for the specific issue that I'm experiencing, but I couldn't find a way to add the proper tags, so I apologize in advance.
I’m currently trying to set up Pytorch Lightning on Jetson Nano/Jetson Xavier NX by building from source. So far, I have tried following this thread here: #695
The requirements.txt has been changed and no longer lists torchvision and scikit-learn as requirements. However, it still seems to require torch>=1.4 as a result of torchmetrics>=0.2.0 (within requirements.txt), even though my Jetson has torch 1.8.0 and torchvision installed under pip3. I was wondering if anyone has successfully set Pytorch Lightning up on ARM64 with the new requirements layout. Thanks!
Is there something I’m missing? I have also tried running setup.py to no avail. Thanks!
Expected behavior
My pip install attempt:
my pip install requirements attempt:
my pip list:
Environment
This is not a script bug as I'm only having trouble with setup using the pip package manager on Jetson/ARM64. Once again, I'm sorry for the tags!
(conda, pip, source): pip
Additional context