thegenemyers / FASTK

A fast K-mer counter for high-fidelity shotgun datasets

installing via conda #10

Open KamilSJaron opened 3 years ago

KamilSJaron commented 3 years ago

Dear Gene, thanks for making FASTK!

Would you consider also making FASTK available via conda?

I tried to compile it on our cluster and ran into some errors, and I bet I am not the only one. Having it on conda would make life easier for a lot of people.

thegenemyers commented 3 years ago

Could you tell me what errors? There are no dependencies that I know of. The make should be very simple. -- Gene

KamilSJaron commented 3 years ago

I think most of the problems come from the non-canonical locations of libraries installed via conda, because when I tried on a personal Linux computer it worked without any issue. This is what I tried in order to compile it on the cluster:

  1. Deleted the CC=gcc line from the Makefile (that's not the name of the gcc binary installed via conda).
  2. Then, during compilation of HTSLIB, the compiler did not see the compression library. So I added a custom path to the environment variables CPPFLAGS and LDFLAGS, regenerated the makefile using ./configure, and finally compiled HTSLIB.
  3. Added -I and -L parameters to the CFLAGS variable in the FASTK Makefile, but then I got an error I did not really understand:
/ceph/users/kjaron/.conda/envs/smudgeplot/bin/x86_64-conda_cos6-linux-gnu-cc -I/ceph/users/kjaron/.conda/envs/smudgeplot/include -O3 -Wall -Wextra -Wno-unused-result -fno-strict-aliasing -o FastK -I./HTSLIB -L/ceph/users/kjaron/.conda/envs/smudgeplot/lib -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now FastK.c io.c split.c count.c table.c merge.c MSDsort.c LSDsort.c libfastk.c LIBDEFLATE/libdeflate.a HTSLIB/libhts.a -lpthread -lpthread -lz -lm -lbz2 -llzma -lcurl -lcrypto
split.c: In function 'Distribute_Block':
split.c:1181:64: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
                           if (fwrite(&nlst,sizeof(int),1,nstr) < 0)
                                                                ^
/ceph/users/kjaron/.conda/envs/smudgeplot/bin/../lib/gcc/x86_64-conda_cos6-linux-gnu/7.2.0/../../../../x86_64-conda_cos6-linux-gnu/bin/ld: /tmp/ccNx6O8q.o: undefined reference to symbol 'clock_gettime@@GLIBC_2.2.5'
/ceph/users/kjaron/.conda/envs/smudgeplot/bin/../x86_64-conda_cos6-linux-gnu/sysroot/lib/librt.so.1: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make: *** [Makefile:21: FastK] Error 1

I also tried to link the program statically on the personal Linux machine, but that produced hundreds of linker errors that I again had trouble understanding.

jmarshall commented 3 years ago

The fwrite warning is fixed by #11.
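For readers wondering why the compiler flagged that line: fwrite() returns a size_t count of the items written, which is unsigned, so a `< 0` test can never fire. The usual idiom is to compare the return value against the number of items requested. The sketch below illustrates that idiom only; it is not the change made in #11 (which this thread does not show), and the helper name `write_int` is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper, not taken from the FastK sources.
 * Writes one int to `out`; returns 0 on success, -1 on failure.
 * fwrite() returns the number of items written as a size_t, so it is
 * never negative. A count smaller than the one requested (here, 1)
 * is what signals an error. */
int write_int(FILE *out, int value)
{
    if (fwrite(&value, sizeof(int), 1, out) != 1)
        return -1;
    return 0;
}
```

A caller would then test `write_int(stream, n) != 0` rather than checking for a negative result.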

The clock_gettime errors are because your cluster has an old version of glibc for which you need to link with -lrt to use clock_gettime(). This function is used by FastK.c, so you would need to add -lrt to the link command for FastK.

There are various operating systems where -lrt is needed for this and probably none where it doesn't exist, so Gene might want to add this to the Makefile. Or not, and just add a comment nearby.
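As a minimal illustration of the point above, here is a small C function that calls clock_gettime(). On glibc versions before 2.17 the symbol lives in librt, so any program using it has to be linked with -lrt placed after the object files (for example `cc -o timer timer.c -lrt`, where `timer.c` is a hypothetical file name). The choice of CLOCK_MONOTONIC and the function name `monotonic_seconds` are assumptions made for this sketch; the thread does not say which clock FastK.c requests.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime() in <time.h> */
#endif
#include <time.h>

/* Hypothetical helper, not taken from FastK.c.
 * Returns the current monotonic clock reading in seconds,
 * or -1.0 if clock_gettime() fails. */
double monotonic_seconds(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1.0;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
```

With `-Wl,--as-needed` on the link line, as in the log above, the position of -lrt matters: the linker only keeps libraries that satisfy symbols from objects it has already seen, so -lrt belongs at the end of the command alongside the other -l options.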

KamilSJaron commented 3 years ago

Thanks @jmarshall that was the last step I needed to get it to work!

Doesn't the effort I had to put into installing it prove my original point about conda?

jmarshall commented 3 years ago

io.c uses internal HTSlib CRAM functions, so it would be a significant amount of work to make that part of the code work with a conda-supplied debundled HTSlib. I expect the (bio)conda maintainers' preference for a fastk conda package would be for it not to bundle either HTSlib or libdeflate.

davebx commented 3 years ago

I was able, with some changes, to use the following conda recipe and script to build a fastk package:

{% set name = "FASTK" %}
{% set version = "1.0" %}
{% set sha256 = "fac24a40ac91487a15356257dd6cad35eccb932d60bbb4c40bc9a6ea394fefc2" %}

package:
  name: {{ name|lower }}
  version: {{ version }}

source:
  url: https://github.com/davebx/FASTK/archive/refs/tags/v{{ version }}.tar.gz
  sha256: {{ sha256 }}

build:
  number: 0
  skip: true  # [osx]
  detect_binary_files_with_prefix: true

requirements:
  host:
    - bzip2
    - zlib
    - libcurl==7.61.0
    - make
    - {{ compiler('cxx') }}
    - {{ compiler('c') }}

test:
  files:
    - test.fa
  commands:
    - FastK -k5 test.fa

about:
  license: Proprietary
  summary: A tool.
  home: https://github.com/thegenemyers/FASTK

export LDFLAGS="-L$SRC_DIR/HTSLIB -L$PREFIX/lib"
export CFLAGS="-I$SRC_DIR -I$SRC_DIR/HTSLIB -I$SRC_DIR/LIBDEFLATE -I$SRC_DIR/LIBDEFLATE/common -I$PREFIX/include -L$SRC_DIR/HTSLIB -L$PREFIX/lib"
export CPPFLAGS="-I$SRC_DIR -I$SRC_DIR/HTSLIB -I$SRC_DIR/LIBDEFLATE -I$SRC_DIR/LIBDEFLATE/common -I$PREFIX/include -L$SRC_DIR/HTSLIB -L$PREFIX/lib"

cd HTSLIB ; make lib-static ; cd -
cd LIBDEFLATE ; make ; cd -
make ; make install

muffato commented 2 years ago

Hi @KamilSJaron and @thegenemyers. FYI, FastK is now in bioconda, https://anaconda.org/bioconda/fastk, built from an unofficial source.