Open kcl58759 opened 2 years ago
Hi, it depends on your input and parameters, but if you prefer to use your own alignment pipeline, it will use fewer resources and run faster, here
Hi, I am trying to use my own alignment pipeline to decrease resources needed. Here is my alignment file:
module load BWA/0.7.17-GCC-8.3.0
ml SAMtools/1.10-GCC-8.3.0
ml NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2
round=2
threads=20
read=/scratch/kcl58759/Eco_pacbio_kendall/pb_css_474/cromwell-executions/pb_ccs/c7a3dc30-7f94-40de-ac16-2445f965bfad/call-export_fasta/execution/m64060_210804_174320.hifi_reads.fasta.gz
read_type=hifi
mapping_option=["hifi"]="asm20"
input=/scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.fa
for ((i=1; i<=2;i++)); do
minimap2 -ax asm20 [hifi] -t 6 /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.f /scratch/kcl58759/Eco_pacbio_kendall/pb_css_474/cromwell-executions/pb_ccs/c7a3dc30-7f94-40de-ac16-2445f965bfad/call-export_fasta/execution/m64060_210804_174320.hifi_reads.fasta.gz | samtools sort - -m 2g --threads 6 -o lgs.sort.bam;
samtools index lgs.sort.bam;
ls `pwd`/lgs.sort.bam > lgs.sort.bam.fofn;
python NextPolish/lib/nextpolish2.py -g /scratch/kcl58759/Eco_pacbio_kendall/474.Primary.Hifi.asm/474.Primary.HiFi.asm.p_ctg.f-l lgs.sort.bam.fofn -r hifi -p 6 -sp -o genome.nextpolish.fa;
if ((i!=2));then
mv genome.nextpolish.fa genome.nextpolishtmp.fa;
input=genome.nextpolishtmp.fa;
fi;
done;
However I keep getting the errors:
[ERROR] failed to open file '[hifi]': No such file or directory
python: can't open file 'NextPolish/lib/nextpolish2.py': [Errno 20] Not a directory
mv: cannot stat ‘genome.nextpolish.fa’: No such file or directory
[ERROR] failed to open file '[hifi]': No such file or directory
python: can't open file 'NextPolish/lib/nextpolish2.py': [Errno 20] Not a directory
Is there something I am missing?
See the minimap2 manual for how to run minimap2; [hifi] is not a valid option. minimap2 treats it as an input file name, which is why it reports that it failed to open file '[hifi]'.
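For illustration, here is a minimal sketch of how the loop could look once the read type is mapped to a minimap2 preset through a bash associative array, so that only the preset name (asm20 for hifi) reaches the -ax option. The file names reads.hifi.fasta.gz and assembly.p_ctg.fa are placeholders, threads=6 mirrors the hardcoded value in the question, and the clr/ont entries are included only to show the shape of the table. The polish_cmds helper is hypothetical: it prints the commands instead of running them, so they can be checked first.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands for each polishing round instead of running them.
# All paths below are placeholders, not the real files from the question.

round=2
threads=6
read_type=hifi                              # one of: clr, hifi, ont
read=reads.hifi.fasta.gz                    # placeholder reads file
input=assembly.p_ctg.fa                     # placeholder assembly
nextpolish2=NextPolish/lib/nextpolish2.py   # adjust to wherever the script actually lives

# Map each read type to a minimap2 preset; only the preset is passed to -ax.
declare -A mapping_option=(["clr"]="map-pb" ["hifi"]="asm20" ["ont"]="map-ont")

# Hypothetical helper: print one round of commands for the given assembly.
polish_cmds() {
    local in=$1
    echo "minimap2 -ax ${mapping_option[$read_type]} -t ${threads} ${in} ${read} | samtools sort - -m 2g --threads ${threads} -o lgs.sort.bam"
    echo "samtools index lgs.sort.bam"
    echo "ls \$(pwd)/lgs.sort.bam > lgs.sort.bam.fofn"
    echo "python ${nextpolish2} -g ${in} -l lgs.sort.bam.fofn -r ${read_type} -p ${threads} -sp -o genome.nextpolish.fa"
}

for ((i = 1; i <= round; i++)); do
    polish_cmds "$input"
    if ((i != round)); then
        # The next round polishes the output of this round.
        echo "mv genome.nextpolish.fa genome.nextpolishtmp.fa"
        input=genome.nextpolishtmp.fa
    fi
done
```

If the printed commands look right, they can be saved to a file and run with bash, or the echo wrappers can be removed to execute them directly.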
I believe the issue is not with minimap2 but with NextPolish/lib/nextpolish2.py not being available. I cannot find the script online, and it is not made available by ml NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2
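One way to check whether the module ships the script is to search the install directory for it, since NextPolish/lib/nextpolish2.py in the example is a relative path that only resolves from the directory containing a NextPolish checkout. The sketch below rests on two assumptions that may not hold on this cluster: that the module sets EBROOTNEXTPOLISH (the usual EasyBuild naming convention), and otherwise that the nextPolish executable is on PATH and sits in the install root. find_nextpolish2 is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: look for nextpolish2.py under the NextPolish install directory.

# Hypothetical helper: print the first nextpolish2.py found under the given root.
find_nextpolish2() {
    local root=$1
    find "$root" -type f -name nextpolish2.py 2>/dev/null | head -n 1
}

# Assumption: EasyBuild modules usually export EBROOT<NAME>; fall back to the
# directory of the nextPolish executable if that variable is not set.
root="${EBROOTNEXTPOLISH:-}"
if [[ -z "$root" ]] && command -v nextPolish >/dev/null 2>&1; then
    root=$(dirname "$(readlink -f "$(command -v nextPolish)")")
fi

if [[ -n "$root" ]]; then
    nextpolish2=$(find_nextpolish2 "$root")
    echo "nextpolish2.py: ${nextpolish2:-not found under $root}"
else
    echo "NextPolish install directory not found; load the module first"
fi
```

If a path is printed, it can replace NextPolish/lib/nextpolish2.py in the python command.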
Question or Expected behavior
How long should it take for NextPolish to complete on a ~50 Mb long-read genome, and what memory should I ask for? I submitted it at 90 GB for 99 hours and it timed out.
Operating system
SLURM NextPolish/1.4.0-GCCcore-8.3.0-Python-3.8.2
GCC
What version of GCC are you using?
gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
Python
What version of Python are you using? You can use the command python --version to get it.
Python 3.8.2