Open ChristophePatterson opened 2 years ago
Hi Christophe,
This seems to me more of a Slurm setup problem than a MitoHiFi problem. Have you run your installation with the test dataset outside Slurm to see if it runs OK?
Best regards, M
On Thu, 13 Oct 2022 at 06:31, ChristophePatterson < @.***> wrote:
Hi,
Thanks for writing a great programme. Sadly, I am running into an issue that I have so far been unable to resolve or trace to a precise cause. I am using MitoHiFi on an HPC cluster that uses Slurm (specifically, https://www.dur.ac.uk/arc/hamilton/).
MitoHiFi begins running on both the test data and my draft genome; however, it fails at step 4 (circularize, annotate and rotate each filtered contig).
The full output from MitoHiFi is as follows.
Looking for mitochondrion for Phalera bucephala
Mito for the same species is not found
Looking among close species
output is written to data/NC_066711.1.[gb,fasta]
2022-10-12 15:50:45 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2022-10-12 15:50:45 [DEBUG] Running MitoHiFi on debug mode.
2022-10-12 15:50:45 [INFO] Length of related mitogenome is: 16668 bp
2022-10-12 15:50:45 [INFO] Number of genes on related mitogenome: 37
2022-10-12 15:50:45 [INFO] Running MitoHifi pipeline in contigs mode...
2022-10-12 15:50:45 [INFO] 1. Fixing potentially conflicting FASTA headers
2022-10-12 15:50:45 [INFO] 2. Let's run the blast of the contigs versus the close-related mitogenome
2022-10-12 15:50:45 [INFO] 2.1. Creating BLAST database:
2022-10-12 15:50:45 [INFO] makeblastdb -in data/NC_066711.1.fasta -dbtype nucl
2022-10-12 15:50:46 [INFO] Makeblastdb done.
2022-10-12 15:50:46 [INFO] 2.2. Running blast of contigs against close-related mitogenome:
2022-10-12 15:50:46 [INFO] blastn -query test.fa -db data/NC_066711.1.fasta -num_threads 1 -out contigs.blastn -outfmt 6 std qlen slen
2022-10-12 15:50:53 [INFO] Blast done.
2022-10-12 15:50:53 [INFO] 3. Filtering BLAST output to select target sequences
2022-10-12 15:50:53 [INFO] Filtering thresholds applied:
2022-10-12 15:50:53 [INFO] Minimum query percentage = 50
2022-10-12 15:50:53 [INFO] Minimum query length = 80% subject length
2022-10-12 15:50:53 [INFO] Maximum query length = 5 times subject length
2022-10-12 15:50:54 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file
2022-10-12 15:50:54 [INFO] 4. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s).
2022-10-12 15:50:54 [DEBUG] Threads per contig=1
2022-10-12 15:50:54 [DEBUG] Thresholds for circularization: circular size=220 | circular offset=40
2022-10-12 15:50:54 [DEBUG] Thresholds for annotation (MitoFinder): maximum contig size=83340
2022-10-12 15:50:54 [INFO] Working with contig tig00007550_1
2022-10-12 15:50:54 [INFO] Working with contig tig00007572_1
2022-10-12 15:50:54 [INFO] Started tig00007550_1 circularization
2022-10-12 15:50:54 [INFO] Started tig00007572_1 circularization
2022-10-12 15:50:55 [INFO] tig00007572_1 circularization done. Circularization info saved on ./potential_contigs/tig00007572_1/tig00007572_1.circularisationCheck.txt
2022-10-12 15:50:55 [INFO] Started tig00007572_1 (MitoFinder) annotation
2022-10-12 15:50:56 [INFO] tig00007550_1 circularization done. Circularization info saved on ./potential_contigs/tig00007550_1/tig00007550_1.circularisationCheck.txt
2022-10-12 15:50:56 [INFO] Started tig00007550_1 (MitoFinder) annotation
2022-10-12 15:50:56 [INFO] tig00007550_1 annotation done. Annotation log saved on ./potential_contigs/tig00007550_1/tig00007550_1.annotation_MitoFinder.log
/nobackup/dbl0hpc/apps/MitoHiFi/parallel_annotation.py:51: UserWarning: Contig tig00007550_1 does not have an annotation file, check MitoFinder's log
  warnings.warn("Contig "+ contig_id + " does not have an annotation file, check MitoFinder's log")
2022-10-12 15:50:56 [INFO] tig00007572_1 annotation done. Annotation log saved on ./potential_contigs/tig00007572_1/tig00007572_1.annotation_MitoFinder.log
/nobackup/dbl0hpc/apps/MitoHiFi/parallel_annotation.py:51: UserWarning: Contig tig00007572_1 does not have an annotation file, check MitoFinder's log
  warnings.warn("Contig "+ contig_id + " does not have an annotation file, check MitoFinder's log")
Traceback (most recent call last):
  File "/nobackup/dbl0hpc/apps/MitoHiFi/mitohifi.py", line 378, in <module>
    main()
  File "/nobackup/dbl0hpc/apps/MitoHiFi/mitohifi.py", line 264, in main
    tRNA_ref = fetch.get_ref_tRNA()
  File "/nobackup/dbl0hpc/apps/MitoHiFi/fetch.py", line 41, in get_ref_tRNA
    reference_tRNA = max(tRNAs, key=tRNAs.get)
ValueError: max() arg is an empty sequence

The line

2022-10-12 15:50:55 [INFO] tig00007572_1 circularization done. Circularization info saved on ./potential_contigs/tig00007572_1/tig00007572_1.circularisationCheck.txt

states that the .circularisationCheck.txt file is created within the relative file path ./potential_contigs/tig00007572_1/. However, no such directory exists, and the file is instead found in the current working directory. The working directory upon termination of the programme is as follows.
contigs.blastn
contigs_ids.txt
data
NC_016067.1.fasta
NC_016067.1.gb
parsed_blast_all.txt
parsed_blast.txt
test.fa
tig00007550_1.circularisationCheck.txt
tig00007550_1.circularization_check.blast.tsv
tig00007550_1.mito.fa
tig00007550_1.mito.fa.ndb
tig00007550_1.mito.fa.nhr
tig00007550_1.mito.fa.nin
tig00007550_1.mito.fa.njs
tig00007550_1.mito.fa.not
tig00007550_1.mito.fa.nsq
tig00007550_1.mito.fa.ntf
tig00007550_1.mito.fa.nto
tig00007550_1.mitogenome.fa
tig00007550_1.mitogenome.fa.ndb
tig00007550_1.mitogenome.fa.nhr
tig00007550_1.mitogenome.fa.nin
tig00007550_1.mitogenome.fa.njs
tig00007550_1.mitogenome.fa.not
tig00007550_1.mitogenome.fa.nsq
tig00007550_1.mitogenome.fa.ntf
tig00007550_1.mitogenome.fa.nto
tig00007572_1.circularisationCheck.txt
tig00007572_1.circularization_check.blast.tsv
tig00007572_1.mito.fa
tig00007572_1.mito.fa.ndb
tig00007572_1.mito.fa.nhr
tig00007572_1.mito.fa.nin
tig00007572_1.mito.fa.njs
tig00007572_1.mito.fa.not
tig00007572_1.mito.fa.nsq
tig00007572_1.mito.fa.ntf
tig00007572_1.mito.fa.nto
tig00007572_1.mitogenome.fa
tig00007572_1.mitogenome.fa.ndb
tig00007572_1.mitogenome.fa.nhr
tig00007572_1.mitogenome.fa.nin
tig00007572_1.mitogenome.fa.njs
tig00007572_1.mitogenome.fa.not
tig00007572_1.mitogenome.fa.nsq
tig00007572_1.mitogenome.fa.ntf
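As an aside on the traceback above: Python's max() raises this ValueError whenever it is given an empty iterable, so the crash in get_ref_tRNA suggests that the tRNAs mapping was empty by the time it was reached. The following is a minimal sketch of that behaviour and of one possible guard. It is not MitoHiFi's code; pick_most_common is a hypothetical helper name, and the tRNA names in the example are made up for illustration.

```python
def pick_most_common(counts):
    """Return the key with the highest count, or None if counts is empty.

    Hypothetical helper for illustration only; not part of MitoHiFi.
    """
    if not counts:
        return None
    return max(counts, key=counts.get)


# Reproduce the failure from the traceback with an empty mapping.
error_message = None
try:
    max({}, key={}.get)
except ValueError as err:
    # The exact wording of the message differs between Python versions.
    error_message = str(err)

print("max() on an empty mapping raised:", error_message)
print("guarded, empty mapping:", pick_most_common({}))
print("guarded, made-up counts:", pick_most_common({"tRNA-Phe": 3, "tRNA-Val": 1}))
```

A guard like this would not fix the underlying problem (no annotation files were produced), but it would let the failure be reported closer to its cause.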
I also note that the file contigs_ids.txt is not found at the relative file path ./contigs_filtering/contigs_ids.txt stated in the console output above, and I cannot locate any log file. I seem to be in the odd situation where MitoHiFi does not save files at the relative file paths it reports, and then cannot find those files when it searches for them.
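One way to narrow down this kind of mismatch is to resolve the reported relative paths against the directories the job might actually be writing to. The following is a minimal sketch using the two relative paths from the log above. The helper name locate is hypothetical, and the choice of candidate directories (the current working directory and the home directory) is an assumption.

```python
from pathlib import Path


def locate(relative_path, base_dirs):
    """Return the base directories under which relative_path exists.

    Hypothetical helper for illustration only.
    """
    return [base for base in base_dirs if (Path(base) / relative_path).exists()]


# Relative paths as reported in the console output.
reported = [
    "contigs_filtering/contigs_ids.txt",
    "potential_contigs/tig00007572_1/tig00007572_1.circularisationCheck.txt",
]

# Assumed candidates: where the job was started and the user's home directory.
candidates = [Path.cwd(), Path.home()]

for rel in reported:
    hits = locate(rel, candidates)
    print(rel, "->", [str(h) for h in hits] or "not found")
```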
Any advice or guidance would be greatly appreciated.
Many thanks,
Christophe
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
Hi Marcela,
Thanks. I'm chatting with the folks who run the HPC (we have a meeting tomorrow), so hopefully, if this is a Slurm issue, we can resolve it. However, I ran the code directly on the console (outside of Slurm) and arrived at the same error.
Kind regards, Christophe
Hi Christophe,
Are you running inside a Singularity container or with your own installation? Did you run with the test dataset?
Hi Marcela,
Apologies for the delay. I have managed to get MitoHiFi v2.2 working on the HPC I have access to. For others who may be in a similar position, I am attaching my code below. The root cause seems to be that MitoHiFi defaults to writing its output to the $HOME directory, and changing this to another directory proved challenging. Adding an -outfolder option to the mitohifi.py pipeline (similar to the one in findMitoReference.py) might be of benefit.
I also struggled to locate the exact cause of the error because stderr=subprocess.DEVNULL is set within the mitohifi.py code. In my case, because MitoHiFi was searching for the FASTA files in $HOME (regardless of anything I did), minimap2 could not find the specified FASTA file. Normally this would have produced an error, but when run through MitoHiFi the pipeline continued with no error and crashed later, in the Hifiasm step, because there were no mapped reads.
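To illustrate the point about stderr=subprocess.DEVNULL: the following is a minimal sketch of how a failing child process can pass unnoticed when its stderr is discarded and its exit status is not checked, compared with capturing stderr and using check=True. The child command is a stand-in written for this example, not a real minimap2 call, and none of this is taken from mitohifi.py.

```python
import subprocess
import sys

# Stand-in for a tool that fails because its input file is missing.
failing_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('ERROR: failed to open file\\n'); sys.exit(1)",
]

# Silent variant: the error text is discarded and the exit code is ignored,
# so the calling script carries on as if nothing had happened.
silent = subprocess.run(failing_cmd, stderr=subprocess.DEVNULL)
print("silent run returned exit code", silent.returncode)

# Loud variant: capture stderr and raise as soon as the command fails.
captured = ""
try:
    subprocess.run(failing_cmd, stderr=subprocess.PIPE, text=True, check=True)
except subprocess.CalledProcessError as err:
    captured = err.stderr
    print("command failed:", captured.strip())
```

With the second form, a missing input file would stop the run at the step where it occurs rather than several steps later.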
This may be specific to my HPC setup, but if anyone else is using a Slurm-based system, I have successfully run MitoHiFi using the following code.
# Open an interactive slurm job
srun -t 8:00:00 -c 24 --mem=50G --gres=tmp:50G --pty bash
# Open an interactive Singularity image. For me it was essential to set both --bind and --home, even if they pointed to the same directory
singularity run --bind /path/to/hifireads/ --home /path/to/output/directory /path/to/singularity/mitohifi_2.2_cv1.sif bash
# Run mitohifi within the singularity image
mitohifi.py -r hifi_reads.fastq -f fasta_file.fasta -g gb_file.gb -t 24 -o 1
Kind regards,
Christophe