artic-network / artic-ncov2019

ARTIC nanopore protocol for nCoV2019 novel coronavirus
Creative Commons Attribution 4.0 International

A fix for new artic's failure, medaka and h5py version not compatible #62

Open Shaokang123 opened 3 years ago

Shaokang123 commented 3 years ago

Hi @will-rowe ,

We have been seeing errors like the following in the newest artic pipeline:

WARNING: Potential variant VCF contains contig b'MN908947.3' not found in BAM contigs.
error: Error reading potential variants VCF file.
caused by: Error accessing tid from chrom2tid data structure

We also found another unresolved report of the same problem in the artic community: https://github.com/artic-network/artic-ncov2019/issues/53

After looking into it, we found that the problem is caused by an h5py version that is incompatible with medaka. The installed h5py (3.1.0 in our case) does not work with medaka 1.0.3, and it needs to be downgraded to 2.7.1 to fix the issue. Otherwise, the CHROM column in the VCF that medaka writes contains b'MN908947.3' instead of MN908947.3, which makes the pipeline fail.

pip install h5py==2.7.1

Hopefully this helps. The best long-term fix might be to pin the h5py version in artic's environment file.
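To check whether a VCF shows the symptom described above, a small script can scan the CHROM column for values shaped like b'...'. This is only a rough sketch: the function name `find_bytes_contigs` and the sample lines are made up for illustration, and it assumes a plain-text, tab-separated VCF rather than a compressed one.

```python
import re

# A CHROM value that is the text form of a Python bytes object, e.g. b'MN908947.3'
BYTES_REPR = re.compile(r"^b'.*'$")


def find_bytes_contigs(vcf_lines):
    """Return the set of CHROM values in VCF data lines that look like b'...'."""
    bad = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom = line.split("\t", 1)[0]  # CHROM is the first tab-separated column
        if BYTES_REPR.match(chrom):
            bad.add(chrom)
    return bad


if __name__ == "__main__":
    # Hypothetical sample records, not taken from a real run
    sample = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT",
        "b'MN908947.3'\t241\t.\tC\tT",
    ]
    print(find_bytes_contigs(sample))
```

If the returned set is non-empty, the VCF was probably written with the incompatible h5py, and downgrading as shown above should be the first thing to try.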

Regards, Shaokang

Psy-Fer commented 3 years ago

omg THANK YOU. this has been driving me insane for the past day.

Testing this now.

Psy-Fer commented 3 years ago

I think we found the issue. We were loading in h5py libs from ~/.local/ rather than our environment.

We solved this by running with PYTHONNOUSERSITE=1 python3. This stops ~/.local from being used, so the 2.7.1 that is actually installed in the environment gets picked up, as it should.

or just export PYTHONNOUSERSITE=1

We were getting really confused, because it was working on some machines, and not others. Mostly mine, because I have lots of other things that use h5py.
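For anyone who wants to confirm which copy of a module Python would pick up, something like the sketch below might help. It only uses the standard library (`importlib.util.find_spec`, `site.getusersitepackages`, `site.ENABLE_USER_SITE`); the helper name `describe_module_origin` is made up for illustration.

```python
import importlib.util
import os
import site


def describe_module_origin(name):
    """Report where `name` would be imported from and whether that is the user site."""
    spec = importlib.util.find_spec(name)  # None if the module cannot be found
    origin = spec.origin if spec is not None else None
    user_site = site.getusersitepackages()  # typically somewhere under ~/.local
    prefix = os.path.abspath(user_site) + os.sep
    from_user_site = bool(origin) and os.path.abspath(origin).startswith(prefix)
    return {
        "module": name,
        "origin": origin,
        "user_site": user_site,
        # False when PYTHONNOUSERSITE is set or python is run with -s
        "user_site_enabled": bool(site.ENABLE_USER_SITE),
        "from_user_site": from_user_site,
    }


if __name__ == "__main__":
    for key, value in describe_module_origin("h5py").items():
        print(f"{key}: {value}")
```

Running it once normally and once with PYTHONNOUSERSITE=1 should show whether h5py resolves to ~/.local or to the environment.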

Shaokang123 commented 3 years ago

@Psy-Fer cool, I'm glad that it helped you to fix your issue :)