kfuku52 / amalgkit

RNA-seq data amalgamation for large-scale evolutionary transcriptomics
BSD 3-Clause "New" or "Revised" License

HDF5 dependency in kallisto #83

Closed kfuku52 closed 2 years ago

kfuku52 commented 2 years ago

Also, unrelated: newer versions of kallisto don't produce .h5 output files anymore. Should I make sanity kallisto-version-specific, or just stop looking for .h5 files altogether?

Originally posted by @Hego-CCTB in https://github.com/kfuku52/amalgkit/issues/80#issuecomment-946723973

kfuku52 commented 2 years ago

amalgkit should check whether HDF5 is enabled in kallisto, as long as it needs .h5 as an intermediate file.

Hego-CCTB commented 2 years ago

We don't use .h5 for anything as far as I know. Kallisto usually produces 3 files: abundance.tsv, run_info.json and abundance.h5

I believe we need the .json for updating the metadata with mapping rate and the .tsv for the quantification, of course. But .h5 is unused by amalgkit.
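For illustration, reading the mapping rate from the JSON could look like the sketch below. It assumes the JSON file is kallisto's run_info.json and that it carries a p_pseudoaligned field (percent of reads pseudoaligned). The function name read_mapping_rate is hypothetical and is not taken from amalgkit.

```python
import json
from pathlib import Path


def read_mapping_rate(quant_dir, json_name="run_info.json"):
    """Return the pseudoalignment rate (percent) from a kallisto output directory.

    Hypothetical helper: returns None if the JSON file or the field is missing.
    The file name and the "p_pseudoaligned" key are assumptions about kallisto's output.
    """
    json_path = Path(quant_dir) / json_name
    if not json_path.is_file():
        return None
    with json_path.open() as handle:
        run_info = json.load(handle)
    return run_info.get("p_pseudoaligned")
```

Note that nothing here touches abundance.h5, so the metadata update would work the same whether or not kallisto was built with HDF5.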

kfuku52 commented 2 years ago

Does it mean that sanity only checks the presence of h5 and doesn't do anything about its content?

Hego-CCTB commented 2 years ago

That is correct.

kfuku52 commented 2 years ago

Alright, we don't have to check h5 then.

Hego-CCTB commented 2 years ago

sanity will not check h5 files anymore. d6464e04d31d27e52b5a1ce11d4c6f527b050d68
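A minimal sketch of a presence check that ignores .h5 is shown below. This is not the code from the commit above: the function name missing_quant_outputs and the list of required file names are assumptions made for illustration.

```python
from pathlib import Path

# Assumed names of the kallisto outputs that are actually used downstream.
# abundance.h5 is deliberately left out, since its content is never read.
REQUIRED_OUTPUTS = ("abundance.tsv", "run_info.json")


def missing_quant_outputs(quant_dir, required=REQUIRED_OUTPUTS):
    """Return the required file names that are absent from quant_dir (hypothetical helper)."""
    quant_dir = Path(quant_dir)
    return [name for name in required if not (quant_dir / name).is_file()]
```

A sample would pass the check when the returned list is empty, regardless of whether the installed kallisto writes an .h5 file.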