maickrau / GraphAligner

CondaEnvironment_osx.yml hardcodes x86_64-specific versions of dependencies #64

Open asl opened 2 years ago

asl commented 2 years ago

These are obviously not available on M1 Macs:

Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - libffi==3.4.2=h0d85af4_5
  - openssl==3.0.2=h6c3fc93_1
  - cctools_osx-64==973.0.1=h2b735b3_10
  - clang==13.0.1=h694c41f_0
  - xz==5.2.5=haf1e3a3_1
  - zstd==1.4.9=h582d3a0_0
  - setuptools==60.10.0=py37hf985489_0
  - ld64_osx-64==609=hd77a64a_10
  - libcblas==3.9.0=13_osx64_openblas
  - clang_osx-64==13.0.1=h71a8856_0
  - zlib==1.2.11=h9173be1_1013
  - llvm-openmp==13.0.1=hcb1a161_1
  - sdsl-lite==2.1.1=h940c156_1002
  - libcxx==13.0.1=hc203e6f_0
  - libgfortran==5.0.0=9_3_0_h6c81a4c_23
  - libzlib==1.2.11=h9173be1_1013
  - libgfortran5==9.3.0=h6c81a4c_23
  - libiconv==1.16=haf1e3a3_0
  - libclang-cpp13==13.0.1=default_he082bbe_0
  - boost-cpp==1.68.0=h6f8c590_1000
  - lz4-c==1.9.3=he49afe7_1
  - pkg-config==0.29.2=h31203cd_1008
  - icu==58.2=h0a44026_1000
  - bzip2==1.0.8=h0d85af4_4
  - clangxx==13.0.1=default_he082bbe_0
  - sparsehash==2.0.4=he49afe7_0
  - jemalloc==5.2.1=he49afe7_6
  - ncurses==6.3=he49afe7_0
  - ca-certificates==2021.10.8=h033912b_0
  - libllvm13==13.0.1=h64f94b2_2
  - tk==8.6.12=h5dbffcc_0
  - compiler-rt==13.0.1=he01351e_0
  - readline==8.1=h05e3726_0
  - libprotobuf==3.19.4=hcf210ce_0
  - libjemalloc==5.2.1=he49afe7_6
  - numpy==1.21.5=py37h3c8089f_0
  - protobuf==3.19.4=py37hd8d24ac_0
  - clangxx_osx-64==13.0.1=heae0f87_0
  - llvm-tools==13.0.1=h64f94b2_2
  - libblas==3.9.0=13_osx64_openblas
  - libprotobuf-static==3.19.4=hcf210ce_0
  - libopenblas==0.3.18=openmp_h3351f45_0
  - python==3.7.12=hf3644f1_100_cpython
  - liblapack==3.9.0=13_osx64_openblas
  - sigtool==0.1.3=h88f4db0_0
  - boost==1.68.0=py37h9888f84_1001
  - libboost==1.73.0=hd4c2dcd_11
  - sqlite==3.37.1=hb516253_0
  - tapi==1100.0.11=h9ce4665_0
  - clang-13==13.0.1=default_he082bbe_0

I don't think this was intentional, since things like libblas are clearly transitive dependencies.

maickrau commented 2 years ago

The environment file was generated by conda export, which is apparently machine-specific. You can try letting conda resolve the dependencies for your environment with:

conda install make clangxx_osx-64 jemalloc=5.2.0 zlib boost=1.67.0 libboost=1.67.0 sparsehash pkg-config libdivsufsort protobuf=3.14.0 libprotobuf-static=3.14.0 sdsl-lite

The versions of jemalloc and boost have to be those specific ones because newer versions have issues with GraphAligner. The protobuf version is not important as long as protobuf and libprotobuf-static have exactly the same version. Alternatively, you can try installing the dependencies manually, which is unfortunately a pain for protobuf.
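
To illustrate why the exported pins are machine-specific: each failing spec has the form name==version=build, and the trailing build string (for example h0d85af4_5) identifies a binary built for one platform. The following is a minimal sketch, not part of GraphAligner, that drops the build string from such specs. The helper name strip_build and the sample list are hypothetical and chosen only for illustration.

```python
import re

# Matches "name==version=build" (as printed in the solver error) or
# "name=version=build" (as written in an environment YAML file).
# The optional last field is the platform-specific build string.
_SPEC = re.compile(
    r"^(?P<name>[^=\s]+)={1,2}(?P<version>[^=\s]+)(?:=(?P<build>[^=\s]+))?$"
)


def strip_build(spec: str) -> str:
    """Return 'name=version', dropping the build string if one is present."""
    m = _SPEC.match(spec.strip())
    if m is None:
        # Leave anything we do not recognise (e.g. a bare package name) untouched.
        return spec.strip()
    return f"{m.group('name')}={m.group('version')}"


# A few specs copied from the error output above.
specs = [
    "libffi==3.4.2=h0d85af4_5",
    "jemalloc==5.2.1=he49afe7_6",
    "libcblas==3.9.0=13_osx64_openblas",
]
portable = [strip_build(s) for s in specs]
print(portable)
```

Removing build strings alone is probably not enough here: packages whose names embed the platform, such as cctools_osx-64 or clang_osx-64, would still need to be dropped or replaced by hand. For future exports, conda env export also accepts --no-builds and --from-history, which avoid recording build strings in the first place.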

asl commented 2 years ago

Checking the deps:

  1. It seems that jemalloc is optional. It could simply be dropped, or replaced by e.g. mimalloc / tcmalloc.
  2. Boost is only used to parse program options. Maybe worth replacing it with some other library? :)
  3. Why do you need clang from conda? Why is the system one not enough?
  4. Is sparsehash really used for anything? Same for divsufsort (I was unable to find any usage for divsufsort / divbwt).

maickrau commented 2 years ago

  1. Yes, this should be possible. jemalloc was the fastest option based on evaluations a couple of years ago but this might have changed since then.
  2. Could be.
  3. So that clang finds the dependencies installed by conda. Same applies for pkg-config. If you can configure your system clang to search your conda environment then I think you could use that one.
  4. These might be leftovers from previous code which used them but was removed since. Even though the osx environment was added recently it's still based on the linux environment which has been there for a long time.
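
As a rough sketch of point 3, a system compiler and pkg-config can be pointed at an active conda environment through environment variables. CONDA_PREFIX is set by conda when an environment is activated, and lib/pkgconfig is the usual location of .pc files inside it. The function name conda_build_env is hypothetical, and whether GraphAligner's makefile honours CPPFLAGS and LDFLAGS is an assumption that would need to be checked.

```python
import os
from typing import Dict, Optional


def conda_build_env(prefix: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with search paths for a conda prefix.

    Assumption: the build honours CPPFLAGS, LDFLAGS and PKG_CONFIG_PATH.
    """
    env = dict(os.environ if base is None else base)
    root = prefix.rstrip("/")
    include_dir = f"{root}/include"
    lib_dir = f"{root}/lib"

    def prepend(var: str, value: str, sep: str = " ") -> None:
        # Put the conda paths first so they win over any existing entries.
        old = env.get(var, "")
        env[var] = f"{value}{sep}{old}" if old else value

    prepend("CPPFLAGS", f"-I{include_dir}")
    # The rpath entry lets the resulting binary find conda's shared libraries at run time.
    prepend("LDFLAGS", f"-L{lib_dir} -Wl,-rpath,{lib_dir}")
    prepend("PKG_CONFIG_PATH", f"{lib_dir}/pkgconfig", sep=os.pathsep)
    return env


if __name__ == "__main__":
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if conda_prefix is None:
        print("No active conda environment (CONDA_PREFIX is not set).")
    else:
        build_env = conda_build_env(conda_prefix)
        for name in ("CPPFLAGS", "LDFLAGS", "PKG_CONFIG_PATH"):
            print(f"{name}={build_env[name]}")
```

The returned dictionary could then be passed as env= to subprocess.run when invoking the build with the system clang.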

asl commented 2 years ago

> Yes, this should be possible. jemalloc was the fastest option based on evaluations a couple of years ago but this might have changed since then.

Our (= SPAdes) experience shows that jemalloc is no longer the fastest. We recently switched to mimalloc and are pretty happy with it :)

> So that clang finds the dependencies installed by conda. Same applies for pkg-config. If you can configure your system clang to search your conda environment then I think you could use that one.

OK, I see. Would you be interested in a pull request that switches the build system to CMake? It would also take care of dependency handling, etc.