Closed dbuscombe-usgs closed 1 year ago
I believe this is my recipe for installing on Windows:
Set libmamba as the default environment solver in the base environment, following this.
Use python=3.9 instead of 3.8. It may work with 3.10, but I have not tested it:
conda create -n gym python=3.9
conda activate gym
conda install -c conda-forge scipy "numpy>=1.16.5, <=1.23.0" scikit-image cython ipython joblib tqdm pandas pip plotly natsort pydensecrf matplotlib
pip install doodleverse_utils transformers
Follow the tensorflow install instructions for Windows Native:
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0
# Anything above 2.10 is not supported on the GPU on Windows Native
python -m pip install "tensorflow<2.11"
# Verify install:
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
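The version pin and the verify step above can also be combined into one small script. This is only a sketch: the helper names `supports_windows_native_gpu` and `report_tensorflow_gpu` are made up for illustration, and the 2.11 cutoff is taken from the comment in the recipe, not checked against any particular machine.

```python
def supports_windows_native_gpu(tf_version: str) -> bool:
    """Return True if the TensorFlow version is below 2.11.

    The cutoff follows the recipe's note that anything above 2.10
    is not supported on the GPU on Windows Native.
    """
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return (major, minor) < (2, 11)


def report_tensorflow_gpu() -> None:
    """Print the installed TensorFlow version and the GPUs it can see."""
    try:
        import tensorflow as tf
    except ImportError:
        print("tensorflow is not installed in this environment")
        return
    print("TF version:", tf.__version__)
    print("Below 2.11:", supports_windows_native_gpu(tf.__version__))
    # Same call as the one-line verify command in the recipe
    print("GPUs:", tf.config.list_physical_devices("GPU"))
```

Calling `report_tensorflow_gpu()` inside the activated gym environment prints the version next to the GPU list, which makes it easier to spot a version that slipped past the pin.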
Hope this works for you!
Thanks, I will give this a try!
This reminds me I need to nuke the 'pydensecrf' requirement from the docs
I successfully installed the conda env, but it doesn't work. I get the same error:
2023-02-24 11:51:04.512581: W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2
/usr/local/cuda
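To see which of the directories listed in the warning actually contains an nvvm/libdevice folder, a check along these lines might help. It is a rough sketch: `find_libdevice` is a hypothetical helper, and the candidate list is copied from the message above, not from TensorFlow itself.

```python
from pathlib import Path
from typing import Iterable, Optional

# Directories copied from the warning message above
CANDIDATES = [
    "./cuda_sdk_lib",
    "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2",
    "/usr/local/cuda",
]


def find_libdevice(candidates: Iterable[str]) -> Optional[Path]:
    """Return the first candidate directory that contains nvvm/libdevice."""
    for candidate in candidates:
        root = Path(candidate)
        if (root / "nvvm" / "libdevice").is_dir():
            return root
    return None


print("libdevice found under:", find_libdevice(CANDIDATES))
```

If this prints `None`, none of the searched locations has the folder, which would be consistent with the warning.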
I didn't first make sure the NVIDIA drivers were up to date. I don't know how to do this, and don't remember ever having to do it before
Try finding the driver here: https://www.nvidia.com/Download/index.aspx?lang=en-us
FYI:
(gym) PS E:\Python\segmentation_gym> nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.1, V10.1.243
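The nvcc on the PATH reports release 10.1, while the recipe installs cudatoolkit=11.2. A small sketch for pulling the release number out of that output and comparing it is below. `nvcc_release` is a hypothetical helper, and the nvcc found on the PATH may be a system-wide install, not the one the conda env uses.

```python
import re
from typing import Optional, Tuple


def nvcc_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from the 'release X.Y' part of nvcc -V output."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Last line of the nvcc -V output posted above
SAMPLE = "Cuda compilation tools, release 10.1, V10.1.243"
# cudatoolkit=11.2 from the install recipe
EXPECTED = (11, 2)

release = nvcc_release(SAMPLE)
print(release, "matches recipe:", release == EXPECTED)
# prints: (10, 1) matches recipe: False
```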
Ugh, Windows. Running
conda install -n base conda-libmamba-solver
fails too
Sucky. Maybe fresh miniconda install?
Hmmm. I shouldn't need to update my drivers, or reinstall conda. That would be too disruptive for me. I'm going to see if I can figure out a conda solution
FYI:
(gym) PS E:\Python\segmentation_gym> conda info
active environment : gym
active env location : C:\Users\csb67\AppData\Local\miniconda3\envs\gym
shell level : 2
user config file : C:\Users\csb67\.condarc
populated config files : C:\Users\csb67\.condarc
conda version : 23.1.0
conda-build version : not installed
python version : 3.10.9.final.0
virtual packages : __archspec=1=x86_64
__cuda=12.0=0
__win=0=0
base environment : C:\Users\csb67\AppData\Local\miniconda3 (writable)
conda av data dir : C:\Users\csb67\AppData\Local\miniconda3\etc\conda
conda av metadata url : None
channel URLs : https://repo.anaconda.com/pkgs/main/win-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/win-64
https://repo.anaconda.com/pkgs/r/noarch
https://repo.anaconda.com/pkgs/msys2/win-64
https://repo.anaconda.com/pkgs/msys2/noarch
https://conda.anaconda.org/conda-forge/win-64
https://conda.anaconda.org/conda-forge/noarch
package cache : C:\Users\csb67\AppData\Local\miniconda3\pkgs
C:\Users\csb67\.conda\pkgs
C:\Users\csb67\AppData\Local\conda\conda\pkgs
envs directories : C:\Users\csb67\AppData\Local\miniconda3\envs
C:\Users\csb67\.conda\envs
C:\Users\csb67\AppData\Local\conda\conda\envs
platform : win-64
user-agent : conda/23.1.0 requests/2.28.1 CPython/3.10.9 Windows/10 Windows/10.0.19044 solver/libmamba conda-libmamba-solver/22.8.1 libmambapy/1.3.1
administrator : False
netrc file : None
offline mode : False
I do believe the change to python 3.10 was a significant one. I have had to reinstall miniconda on both of my Windows computers recently. For what it's worth!
I think I've now exhausted all options except updating nvidia or conda, which I'm not currently prepared to do. I suppose I will not make segformer models on windows
I was able to install miniconda and use Cam's mamba recipe to install a gym environment. It works with the Unets, but not the segformer model. I'll keep troubleshooting
I did notice in my PINGMapper.yml that I list installing transformers after tensorflow. Perhaps an order of operations thing??
name: ping
channels:
  - conda-forge
  - defaults
dependencies:
  - python
  - pandas
  - rasterio
  - pyproj
  - scikit-image
  - joblib
  - gdal
  - matplotlib
  - pip
  - pip:
    - psutil
    - tensorflow
    - transformers
With miniconda and mamba, I can now get a working environment for training any unet models, in python 3.8, 3.9, and 3.10. I can do this using either the pip or conda way of installing TF, or the conda-forge way.
The only issue is using SegFormer models. It errors out with the same message every time. It doesn't matter if I install transformers using pip or conda, before or after TF
I have not been able to update my NVIDIA drivers. I simply can't find a link that will allow me to install something to "C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2", which is where transformers expects the CUDA drivers to be (what am I missing?)
I thought one major advantage with installing TF the "conda-forge" route was not having to update nvidia drivers on windows.
If I attempt to install the 11.2 cuda toolkit, from here: https://developer.nvidia.com/cuda-11.2.0-download-archive?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exelocal
I get warning messages saying that I'm about to downgrade NVIDIA versions, which doesn't seem right
On neither of my Windows computers, both of which have gym environments working well for Unets, do I have access to the nvcc program. I'm very reluctant to go this route right now, for fear that I will break my working conda envs
Eureka!! Add this to the conda env to make it use segformers
conda install cuda -c nvidia
I will update the README
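After adding the package, a quick sanity check that the env still has both libraries might look like the sketch below. `importable` is a hypothetical helper. It only confirms that the modules can be found, not that SegFormer training runs on the GPU.

```python
from importlib.util import find_spec
from typing import Dict, Iterable


def importable(modules: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: find_spec(name) is not None for name in modules}


# Module names taken from the recipe in this thread
print(importable(["tensorflow", "transformers"]))
```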
Ok, I posted new conda recipes for Gym that allow for use of segformers on Windows and Ubuntu
Thanks @CameronBodine for helping troubleshoot and test!
https://github.com/Doodleverse/segmentation_gym#%EF%B8%8F-installation
The segformer model is now fully integrated, but there remain some issues with the conda environment. In https://github.com/Doodleverse/segmentation_gym/issues/115 @CameronBodine noted:
I didn't get the error on my Ubuntu box but did on my Windows box. I'm looking for a conda env workaround