Closed alexkrohn closed 1 year ago
Digging a bit deeper, it seems likely that locator only works on CUDA-enabled GPUs, not that it is merely recommended. Is that true?
I was able to get the install a bit further on another Linux machine with an XFX R7800 GPU, but alas, because it is not NVIDIA, it cannot be CUDA enabled. TensorFlow therefore reports that it cannot find CUDA drivers and that the GPU will not be used, and then locator.py fails at line 109.
It sounds like without purchasing a cloud instance of a GPU, I'm out of luck in being able to run locator. Am I mistaken here?
Hi Alex,
Locator does not require a CUDA GPU. I used CPU on an Intel Mac a bunch during development. I'll look into your issue this week.
CJ
Yes I can confirm we can run locator on CPUs.
Can you give us a few more details on your Python environment? Did you use conda to set up a virtual environment? What Python version are you trying to use?
Hi all. Thanks for the quick replies! Good to know that a GPU is not a requirement.
All three installs were made in a new conda environment. Here are the packages and versions in each environment:
Mac:
# packages in environment at /Users/alexkrohn/anaconda3/envs/locator:
#
# Name Version Build Channel
_tflow_select 2.2.0 eigen
abseil-cpp 20211102.0 he9d5cce_0
absl-py 1.4.0 py38hecd8cb5_0
aiohttp 3.8.3 py38h6c40b1e_0
aiosignal 1.2.0 pyhd3eb1b0_0
appdirs 1.4.4 pyhd3eb1b0_0
astunparse 1.6.3 py_0
async-timeout 4.0.2 py38hecd8cb5_0
attrs 22.1.0 py38hecd8cb5_0
blas 1.0 mkl
blinker 1.4 py38hecd8cb5_0
brotlipy 0.7.0 py38h9ed2024_1003
c-ares 1.19.0 h6c40b1e_0
ca-certificates 2023.05.30 hecd8cb5_0
cachetools 4.2.2 pyhd3eb1b0_0
certifi 2023.5.7 py38hecd8cb5_0
cffi 1.15.1 py38hc55c11b_0
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.0.4 py38hecd8cb5_0
cryptography 39.0.1 py38hf6deb26_0
flatbuffers 2.0.0 h23ab428_0
frozenlist 1.3.3 py38h6c40b1e_0
gast 0.4.0 pyhd3eb1b0_0
giflib 5.2.1 h6c40b1e_3
google-auth 2.6.0 pyhd3eb1b0_0
google-auth-oauthlib 0.5.2 py38hecd8cb5_0
google-pasta 0.2.0 pyhd3eb1b0_0
grpc-cpp 1.48.2 h3afe56f_0
grpcio 1.48.2 py38h3afe56f_0
h5py 3.7.0 py38h4a1dd59_0
hdf5 1.10.6 h10fe05b_1
icu 68.1 h23ab428_0
idna 3.4 py38hecd8cb5_0
importlib-metadata 6.0.0 py38hecd8cb5_0
intel-openmp 2023.1.0 ha357a0b_43547
jpeg 9e h6c40b1e_1
keras 2.12.0 py38_0
keras-preprocessing 1.1.2 pyhd3eb1b0_0
krb5 1.20.1 hdba6334_1
libcurl 8.1.1 ha585b31_1
libcxx 14.0.6 h9765a3e_0
libedit 3.1.20221030 h6c40b1e_0
libev 4.33 h9ed2024_1
libffi 3.3 hb1e8313_2
libgfortran 5.0.0 11_3_0_hecd8cb5_28
libgfortran5 11.3.0 h9dfd629_28
libnghttp2 1.52.0 h1c88b7d_1
libpng 1.6.39 h6c40b1e_0
libprotobuf 3.20.3 hfff2838_0
libssh2 1.10.0 hdb2fb19_2
llvm-openmp 14.0.6 h0dcd299_0
markdown 3.4.1 py38hecd8cb5_0
markupsafe 2.1.1 py38hca72f7f_0
mkl 2023.1.0 h59209a4_43558
mkl-service 2.4.0 py38h6c40b1e_1
mkl_fft 1.3.6 py38h07fba90_1
mkl_random 1.2.2 py38h07fba90_1
multidict 6.0.2 py38hca72f7f_0
ncurses 6.4 hcec6c5f_0
numpy 1.23.5 py38h47b59a4_1
numpy-base 1.23.5 py38hcfaf2c3_1
oauthlib 3.2.2 py38hecd8cb5_0
openssl 1.1.1u hca72f7f_0
opt_einsum 3.3.0 pyhd3eb1b0_1
packaging 23.0 py38hecd8cb5_0
pip 23.1.2 py38hecd8cb5_0
pooch 1.4.0 pyhd3eb1b0_0
protobuf 3.20.3 py38hcec6c5f_0
pyasn1 0.4.8 pyhd3eb1b0_0
pyasn1-modules 0.2.8 py_0
pycparser 2.21 pyhd3eb1b0_0
pyjwt 2.4.0 py38hecd8cb5_0
pyopenssl 23.0.0 py38hecd8cb5_0
pysocks 1.7.1 py38_1
python 3.8.5 h26836e1_1
python-flatbuffers 2.0 pyhd3eb1b0_0
re2 2022.04.01 he9d5cce_0
readline 8.2 hca72f7f_0
requests 2.29.0 py38hecd8cb5_0
requests-oauthlib 1.3.0 py_0
rsa 4.7.2 pyhd3eb1b0_1
scipy 1.10.1 py38hf241641_1
setuptools 67.8.0 py38hecd8cb5_0
six 1.16.0 pyhd3eb1b0_1
snappy 1.1.9 he9d5cce_0
sqlite 3.41.2 h6c40b1e_0
tbb 2021.8.0 ha357a0b_0
tensorboard 2.12.1 py38_0
tensorboard-data-server 0.7.0 py38h7242b5c_0
tensorboard-plugin-wit 1.6.0 py_0
tensorflow 2.12.0 eigen_py38h08ec666_0
tensorflow-base 2.12.0 eigen_py38hbf87084_0
tensorflow-estimator 2.12.0 py38_0
termcolor 2.1.0 py38hecd8cb5_0
tk 8.6.12 h5d9f67b_0
typing_extensions 4.6.3 py38hecd8cb5_0
urllib3 1.26.16 py38hecd8cb5_0
werkzeug 2.2.3 py38hecd8cb5_0
wheel 0.35.1 pyhd3eb1b0_0
wrapt 1.14.1 py38hca72f7f_0
xz 5.4.2 h6c40b1e_0
yarl 1.8.1 py38hca72f7f_0
zipp 3.11.0 py38hecd8cb5_0
zlib 1.2.13 h4dc903c_0
Ubuntu 20.04:
# packages in environment at /home/tangled/miniconda3/envs/locator:
#
# Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
absl-py 1.4.0 pypi_0 pypi
asciitree 0.3.3 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2023.5.7 hbcca054_0 conda-forge
cachetools 5.3.1 pypi_0 pypi
certifi 2023.5.7 pypi_0 pypi
charset-normalizer 3.1.0 pypi_0 pypi
click 8.1.3 pypi_0 pypi
cloudpickle 2.2.1 pypi_0 pypi
contourpy 1.1.0 pypi_0 pypi
cycler 0.11.0 pypi_0 pypi
dask 2023.6.0 pypi_0 pypi
entrypoints 0.4 pypi_0 pypi
fasteners 0.18 pypi_0 pypi
flatbuffers 23.5.26 pypi_0 pypi
fonttools 4.40.0 pypi_0 pypi
fsspec 2023.6.0 pypi_0 pypi
gast 0.4.0 pypi_0 pypi
google-auth 2.20.0 pypi_0 pypi
google-auth-oauthlib 1.0.0 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.54.2 pypi_0 pypi
h5py 3.9.0 pypi_0 pypi
idna 3.4 pypi_0 pypi
importlib-metadata 6.7.0 pypi_0 pypi
jax 0.4.12 pypi_0 pypi
keras 2.12.0 pypi_0 pypi
kiwisolver 1.4.4 pypi_0 pypi
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
libblas 3.9.0 17_linux64_openblas conda-forge
libcblas 3.9.0 17_linux64_openblas conda-forge
libclang 16.0.0 pypi_0 pypi
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 13.1.0 he5830b7_0 conda-forge
libgfortran-ng 13.1.0 h69a702a_0 conda-forge
libgfortran5 13.1.0 h15d22d2_0 conda-forge
libgomp 13.1.0 he5830b7_0 conda-forge
liblapack 3.9.0 17_linux64_openblas conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libopenblas 0.3.23 pthreads_h80387f5_0 conda-forge
libsqlite 3.42.0 h2797004_0 conda-forge
libstdcxx-ng 13.1.0 hfd8a6a1_0 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
locator 1.1 pypi_0 pypi
locket 1.0.0 pypi_0 pypi
markdown 3.4.3 pypi_0 pypi
markupsafe 2.1.3 pypi_0 pypi
matplotlib 3.7.1 pypi_0 pypi
ml-dtypes 0.2.0 pypi_0 pypi
ncurses 6.4 hcb278e6_0 conda-forge
numcodecs 0.11.0 pypi_0 pypi
numpy 1.22.3 py310h4ef5377_2 conda-forge
oauthlib 3.2.2 pypi_0 pypi
openssl 3.1.1 hd590300_1 conda-forge
opt-einsum 3.3.0 pypi_0 pypi
packaging 23.1 pypi_0 pypi
pandas 2.0.2 pypi_0 pypi
partd 1.4.0 pypi_0 pypi
pillow 9.5.0 pypi_0 pypi
pip 23.1.2 pyhd8ed1ab_0 conda-forge
protobuf 4.23.3 pypi_0 pypi
pyasn1 0.5.0 pypi_0 pypi
pyasn1-modules 0.3.0 pypi_0 pypi
pyparsing 3.1.0 pypi_0 pypi
python 3.10.11 he550d4f_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.10 3_cp310 conda-forge
pytz 2023.3 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
readline 8.2 h8228510_1 conda-forge
requests 2.31.0 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.9 pypi_0 pypi
scikit-allel 1.3.6 pypi_0 pypi
scipy 1.10.1 pypi_0 pypi
seaborn 0.12.2 pypi_0 pypi
setuptools 67.7.2 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
tensorboard 2.12.3 pypi_0 pypi
tensorboard-data-server 0.7.1 pypi_0 pypi
tensorflow 2.12.0 pypi_0 pypi
tensorflow-estimator 2.12.0 pypi_0 pypi
tensorflow-io-gcs-filesystem 0.32.0 pypi_0 pypi
termcolor 2.3.0 pypi_0 pypi
tk 8.6.12 h27826a3_0 conda-forge
toolz 0.12.0 pypi_0 pypi
tqdm 4.65.0 pypi_0 pypi
typing-extensions 4.6.3 pypi_0 pypi
tzdata 2023.3 pypi_0 pypi
urllib3 1.26.16 pypi_0 pypi
werkzeug 2.3.6 pypi_0 pypi
wheel 0.40.0 pyhd8ed1ab_0 conda-forge
wrapt 1.14.1 pypi_0 pypi
xz 5.2.6 h166bdaf_0 conda-forge
zarr 2.15.0 pypi_0 pypi
zipp 3.15.0 pypi_0 pypi
On the Mac I changed the Python version to 3.8.5 based on some recommendations to get M2 chips running with TensorFlow. No luck there or with the default 3.10 version.
I don't have an M2 device to help debug, but the Linux setup should be simple enough. I'm guessing this is a tf install issue and not locator per se. Will tensorflow import without issue? e.g. python -c 'import tensorflow; print(tensorflow.__version__)'
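If the import itself can crash the interpreter, it may help to run the check in a child process so the crash is reported rather than taking the shell session down with it. The following is only a sketch: the helper name `probe` and the 120-second timeout are made up for illustration and are not part of locator or TensorFlow. It relies on the POSIX convention that a negative return code from `subprocess.run` means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def probe(code, timeout=120):
    """Run `code` in a fresh interpreter and report how it ended.

    `probe` is a hypothetical helper name used for this example only.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal
        # (for example SIGILL for "Illegal instruction").
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = f"signal {-proc.returncode}"
        return {"ok": False, "signal": name, "stdout": proc.stdout.strip()}
    return {"ok": proc.returncode == 0, "signal": None, "stdout": proc.stdout.strip()}


if __name__ == "__main__":
    # Same check as the one-liner above, but isolated in a child process.
    print(probe("import tensorflow; print(tensorflow.__version__)"))
```

A result with `"signal": "SIGILL"` would point at the binary rather than at locator; a plain non-zero exit with no signal usually means an ordinary Python error such as a missing package.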
I agree that tf is the likely culprit. Running python -c 'import tensorflow; print(tensorflow.__version__)' gives an Illegal instruction (core dumped) error.
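One possible explanation, which nothing in this thread confirms, is that the CPU lacks an instruction set the installed binary was compiled for. The official TensorFlow pip wheels for x86-64 have been built with AVX for a long time, and older server CPUs without AVX typically fail at import with exactly this message. Below is a minimal sketch for checking the flags on Linux; the function names are invented for this example, and the default `("avx",)` is an assumption about what the wheel needs.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # All cores normally report the same flags, so the first entry is enough.
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, wanted=("avx",)):
    """List the wanted features that the CPU does not advertise."""
    flags = cpu_flags(cpuinfo_text)
    return [feature for feature in wanted if feature not in flags]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux on x86 only; other platforms differ
    if path.exists():
        print("missing:", missing_features(path.read_text()))
```

If `avx` shows up as missing, a build compiled for older instruction sets (for example a conda package) would be one thing to try, though that is a guess rather than a verified fix.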
Yep, there you go. How did you install cuda, cudatoolkit, etc.? I'm guessing there is a mismatch somewhere... What does the output of nvidia-smi look like?
nvidia-smi is not installed, I assume because I don't have a GPU.
I think my problems arise in how pip installs the requirements and resolves package conflicts.
I deleted the original locator environment (conda env remove --name locator), then recreated one from scratch.
After running pip install -r req.txt I see that all required packages are already installed. pip list lists the packages as I noted them in my first post, including tensorflow 2.12. (I thought that pip install would be limited to the conda environment, now deleted, but that doesn't appear to be true.) Somehow, python is not installed in this new conda environment, so I figured I'd start over from scratch using conda instead of pip for the installs.
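A quick way to see whether pip and python belong to the active conda environment is to ask the interpreter directly. This is a sketch under assumptions: `environment_report` is a hypothetical helper, and it relies on conda setting the `CONDA_PREFIX` variable when an environment is activated. `importlib.util.find_spec` locates a top-level package without importing it, so it should be safe to run even when the import itself crashes.

```python
import importlib.util
import os
import sys


def environment_report(packages=("tensorflow",)):
    """Describe which interpreter is running and where packages would load from.

    `environment_report` is a hypothetical helper name used for this example only.
    """
    conda_prefix = os.environ.get("CONDA_PREFIX")
    report = {
        "python": sys.executable,
        "prefix": sys.prefix,
        "conda_prefix": conda_prefix,
        # True only when the running interpreter lives inside the active env.
        "python_in_active_env": bool(conda_prefix)
        and os.path.realpath(sys.prefix) == os.path.realpath(conda_prefix),
        "packages": {},
    }
    for name in packages:
        spec = importlib.util.find_spec(name)  # locates without importing
        report["packages"][name] = spec.origin if spec else None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `python_in_active_env` is false, a bare `pip install` is likely writing to some other Python. Running `python -m pip install -r req.txt` ties pip to whichever interpreter `python` resolves to.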
I ran for i in $(cat req.txt) ; do conda install "$i" -y ; done. There was one conflict (numpy installs python 3.11, but scikit-allel requires python<3.11, so I installed python=3.10, then re-ran the install for scikit-allel). Now tensorflow for CPUs works.
$ python -c 'import tensorflow; print(tensorflow.__version__)'
2023-06-21 10:36:29.228848: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2.11.0
$ python -V
Python 3.10.11
locator.py now works with the sample data too.
NB: conda install for all these packages without version numbers is very slow. mamba install works much quicker.
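Installing the packages one at a time makes the solver run once per package and can let an early install pull in a Python version that a later package rejects. A possible alternative is to hand the solver everything in one command, with the Python version pinned. The sketch below only builds and prints such a command. The helper names are made up, the `python=3.10` pin comes from the conflict described above, and it assumes every name in req.txt exists under the same name in the conda channels, which may not hold.

```python
import shlex
from pathlib import Path


def read_requirements(text):
    """Return package specs from requirements-style text, skipping blanks and comments."""
    specs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            specs.append(line)
    return specs


def install_command(specs, tool="mamba", python="3.10"):
    """Build a single install command so the solver sees all constraints at once."""
    return [tool, "install", "-y", f"python={python}", *specs]


if __name__ == "__main__":
    req = Path("req.txt")  # the requirements file mentioned in this thread
    if req.exists():
        # Print the command instead of running it, so it can be reviewed first.
        print(shlex.join(install_command(read_requirements(req.read_text()))))
```

Passing `tool="conda"` produces the equivalent conda command.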
great! closing this as resolved
I've tried and failed to properly install Locator on both a new Mac and my lab's old Linux server. I assume this is a problem with my GPU hardware. Or is locator able to use tensorflow-cpu?
The Mac is running OS 13.3.1, with M2 Pro Chip and standard 16 core GPU.
The Linux machine is running Ubuntu 20.04, has 80 x86_64 Intel CPUs, but no GPU.
On both machines following the default instructions and installing with pip in a new conda environment ends without any errors. Running python scripts/locator.py -h on the Linux machine yields the error: Illegal instruction (core dumped).
Running python scripts/locator.py -h on the Mac yields:
pip list for the Linux:
pip list for the Mac:
Any idea why the install fails?