yoshihikosuzuki / ant-asm-workflow

Genome assembly workflow with HiFi + Omni-C

run_hifiasm.sh with "--hom-cov" argument fails #3

Closed AlesBucek closed 2 years ago

AlesBucek commented 2 years ago

Hi Yoshi, I modified run_hifiasm.sh to add the "--hom-cov 148" argument for a sample where hifiasm inferred the homozygous coverage incorrectly. I also changed the $OUT_PREFIX variable so that the outputs of the previous hifiasm run are not overwritten:

#!/bin/bash
#SBATCH -J hifiasm_homcov
#SBATCH -o hifiasm_homcov.log
#SBATCH -p compute
#SBATCH -n 1
#SBATCH -N 1
#SBATCH -c 128
#SBATCH --mem=500G
#SBATCH -t 24:00:00
shopt -s expand_aliases && source ~/.bashrc && set -e || exit 1
source ../../config.sh

IN_FASTX=hifi.fastq
N_THREADS=128

#OUT_PREFIX=$(basename ${IN_FASTX} .gz)
#OUT_PREFIX=${OUT_PREFIX%.*}.hifiasm
OUT_PREFIX=hifi.hifiasm.homcov

ml ${_HIFIASM} ${_GFATOOLS} ${_SEQKIT}

hifiasm -o ${OUT_PREFIX} -t ${N_THREADS} --hom-cov 148 ${IN_FASTX}
for DATA in *tg.gfa; do
    gfatools gfa2fa ${DATA} > ${DATA%.gfa}.fasta
done

echo "Contig stats (${OUT_PREFIX}.bp.p_utg.fasta):"
seqkit stats -a ${OUT_PREFIX}.bp.p_utg.fasta
echo "Contig stats (${OUT_PREFIX}.bp.p_ctg.fasta):"
seqkit stats -a ${OUT_PREFIX}.bp.p_ctg.fasta

if [ "$AUTO_DEL" = "true" ]; then
    source ./remove_tmp_files.sh
fi

However, this gives me an error:

[ERROR] unknown option in "--hom-cov"
/var/spool/slurmd/job14908846/slurm_script: line 22: 3642098 Segmentation fault      (core dumped) hifiasm -o ${OUT_PREFIX} -t ${N_THREADS} --hom-cov 148 ${IN_FASTX}

I tried different positions of the "--hom-cov" parameter in the command in case it matters, but that did not fix the error. Could you advise me on what might be wrong? Thanks! Ales

yoshihikosuzuki commented 2 years ago

Hi Ales, I guess the version of hifiasm you are using is Other/hifiasm/0.15.4, which does not have that option yet. The latest v0.16.1 should accept the option. Can you change one line in your config.sh from

_HIFIASM=Other/hifiasm/0.15.4

to

_HIFIASM=Other/hifiasm/0.16.1

and then re-run the script? If you are already using v0.16.1 or it still throws an error, then please let me know.

(Edit: Just FYI, the difference between the two hifiasm versions is very subtle, and in most cases the resulting assembly does not change at all. That is, you do not have to re-run hifiasm for the other samples you have already processed, unless you want the software version to be consistent across all samples.)
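
For reference, a minimal sketch of a guard that could catch this kind of mismatch before the assembly starts. It assumes that `hifiasm --version` prints a string such as `0.16.1-r375` and that GNU `sort -V` is available. The names `version_ge`, `check_hifiasm_version`, and `MIN_HIFIASM_VERSION` are hypothetical and not part of the workflow. The threshold is simply the version reported to work in this thread, not necessarily the first release that has the option.

```shell
#!/bin/bash
# Hypothetical helper: succeed if version $1 >= version $2.
# Relies on GNU sort -V (version sort); the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 0.16.1 is the version confirmed in this thread to accept
# --hom-cov; the option may exist in earlier releases as well.
MIN_HIFIASM_VERSION=0.16.1

# Hypothetical helper: fail early if the loaded hifiasm looks too old.
check_hifiasm_version() {
    local raw ver
    raw=$(hifiasm --version 2>&1) || return 1
    ver=${raw%%-*}   # drop a "-r375" style suffix, if present
    if ! version_ge "$ver" "$MIN_HIFIASM_VERSION"; then
        echo "hifiasm ${ver} is older than ${MIN_HIFIASM_VERSION};" \
             "--hom-cov may not be supported" >&2
        return 1
    fi
}
```

In run_hifiasm.sh this could be called as `check_hifiasm_version || exit 1` right after the `ml` line, so the job stops with a readable message instead of a segmentation fault.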

AlesBucek commented 2 years ago

Hi Yoshi, thanks, that fixed both the error and my incompletely purged assembly: the new assembly has few BUSCO duplicates and the expected total size. FYI, I reran some assemblies with a fixed hom-cov, and the differences compared to the previous assemblies (automatic hom-cov, previous version of hifiasm) were marginal.

Not sure if you have plans to keep updating the workflow but if so, versioning it (e.g. by showing a version of the workflow in config.sh) might be useful. Cheers, Ales
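
One way this suggestion might look in practice, as a sketch only: config.sh could carry a version string next to the module names, and each script could print it into its log. `WORKFLOW_VERSION`, its value, and `print_versions` are hypothetical; only the `_HIFIASM` line is taken from this thread.

```shell
#!/bin/bash
# config.sh (sketch)
WORKFLOW_VERSION=0.1.0            # hypothetical variable and value
_HIFIASM=Other/hifiasm/0.16.1     # module name as given in this thread

# Hypothetical helper: record which workflow and tool versions produced a run.
print_versions() {
    echo "workflow version: ${WORKFLOW_VERSION}"
    echo "hifiasm module:   ${_HIFIASM}"
}
```

Calling `print_versions` at the top of a job script would leave both lines in the SLURM log, which makes it easier to tell later which samples were assembled with which setup.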

yoshihikosuzuki commented 2 years ago

Hi Ales, Glad to hear that. Yes, versioning (and making tagged releases) should be done. Will do in the near future...