Closed camilae86 closed 1 week ago
Hi!
With which version of ppanggolin did you get this error?
Last time I saw this error, it was related to a full disk in the "/tmp" folder. This can happen if your pangenome is pretty big (multiple thousands of genomes) and/or if the "/tmp" disk space is very small (a few GB).
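A quick way to check how much room is left on the temporary folder is sketched below. It is only an illustration: the helper name `free_space_gb` is made up for this example, and it assumes the tool creates its temporary directory through Python's `tempfile` module, which follows the `TMPDIR` environment variable.

```python
import shutil
import tempfile


def free_space_gb(path=None):
    """Return the free space, in GB, on the filesystem that holds `path`.

    When no path is given, the system temporary directory is used
    (usually /tmp on Linux, or whatever TMPDIR points to).
    """
    target = path or tempfile.gettempdir()
    return shutil.disk_usage(target).free / 1e9


# Report the free space available in the temporary directory.
print(f"{tempfile.gettempdir()}: {free_space_gb():.1f} GB free")
```

If the reported value is only a few GB and the pangenome is large, a full temporary disk is a plausible cause.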
If your problem is neither, could you share the complete log and, if possible, the input that you used?
I hope this helps! Adelme
Thanks @axbazin,
You were right, the server was too small to carry out this project. So we tried on a bigger Amazon server, but we ran into another problem. We were able to install the program in a conda environment with Python 3.8 (called milagro); however, when we run the command ppanggolin annotate --anno ORGANISM_ANNOTATION_LIST --fasta ORGANISM_FASTA_LIST, we get the following error:
2024-09-02 23:57:15 utils.py:l168 INFO Command: /home/ubuntu/miniconda3/envs/milagro/bin/ppanggolin annotate --anno anotacion_Burkholderia_2024_amazon_4.txt --fasta genomas_Burkholderia_2024_amazon_4.txt
2024-09-02 23:57:15 utils.py:l169 INFO PPanGGOLiN version: 2.1.1
2024-09-02 23:57:15 annotate.py:l1047 INFO Reading anotacion_Burkholderia_2024_amazon_4.txt the list of genome files ...
0%| | 0/4 [00:00<?, ?file/s]
2024-09-02 23:57:15 genome.py:l461 WARNING Contig length is unknown
25%|███████████████████████████ | 1/4 [00:00<00:00, 9.45file/s]
2024-09-02 23:57:16 genome.py:l461 WARNING Contig length is unknown
2024-09-02 23:57:16 genome.py:l461 WARNING Contig length is unknown
2024-09-02 23:57:16 genome.py:l461 WARNING Contig length is unknown
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/site-packages/ppanggolin/annotate/annotate.py", line 969, in read_anno_file
org, has_fasta = read_org_gff(organism_name, filename, circular_contigs, pseudo, translation_table)
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/site-packages/ppanggolin/annotate/annotate.py", line 793, in read_org_gff
correct_putative_overlaps(org.contigs)
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/site-packages/ppanggolin/annotate/annotate.py", line 911, in correct_putative_overlaps
if gene.stop > len(contig):
TypeError: 'NoneType' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/site-packages/ppanggolin/annotate/annotate.py", line 971, in read_anno_file
raise Exception(f"Reading the gff3 file '{filename}' raised an error. {err}")
Exception: Reading the gff3 file '/home/ubuntu/mariac_pangenomas/R_18628.gff' raised an error. 'NoneType' object cannot be interpreted as an integer
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ubuntu/miniconda3/envs/milagro/bin/ppanggolin", line 8, in <module>
sys.exit(main())
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/site-packages/ppanggolin/main.py", line 177, in main
ppanggolin.annotate.launch(args)
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/site-packages/ppanggolin/annotate/annotate.py", line 1235, in launch
read_annotations(pangenome, args.anno, cpu=args.cpu, pseudo=args.use_pseudo,
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/site-packages/ppanggolin/annotate/annotate.py", line 1076, in read_annotations
org, flag = future.result()
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/home/ubuntu/miniconda3/envs/milagro/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
Exception: Reading the gff3 file '/home/ubuntu/mariac_pangenomas/R_18628.gff' raised an error.
Thanks for your help...
Hi,
So, this is likely related to unexpected formatting in one of your gff3 files (likely R_18628.gff).
If you wish you can add it to the issue and I can take a look, but overall we try to follow the specifications indicated here: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md
I see that there are warnings about contig length being unknown, so maybe it's related to the contig feature (or lack thereof) in your gff3 file? Though, without an example I can only guess.
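To narrow this down before sharing the file, a small script can list the contigs for which the gff3 file gives no length. This is a rough sketch and not PPanGGOLiN's own parser: the function name `contigs_without_length` is hypothetical, and it assumes a contig's length can come from a `##sequence-region` pragma, a `region` feature, or a sequence embedded after `##FASTA`, as the GFF3 specification allows.

```python
def contigs_without_length(gff_path):
    """List seqids that carry features but have no length information.

    A seqid counts as having a length if the file contains a
    '##sequence-region' pragma for it, a feature of type 'region',
    or a sequence with that name in an embedded FASTA section.
    """
    declared = set()   # seqids with a pragma or a region feature
    used = set()       # seqids referenced by any feature line
    fasta_ids = set()  # seqids found in the embedded FASTA section
    in_fasta = False

    with open(gff_path) as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            # A '##FASTA' pragma (or a bare '>' header) starts the FASTA part.
            if not in_fasta and (line.startswith("##FASTA") or line.startswith(">")):
                in_fasta = True
            if in_fasta:
                if line.startswith(">"):
                    parts = line[1:].split()
                    if parts:
                        fasta_ids.add(parts[0])
                continue
            if line.startswith("##sequence-region"):
                fields = line.split()
                if len(fields) >= 4:  # pragma, seqid, start, end
                    declared.add(fields[1])
            elif line and not line.startswith("#"):
                cols = line.split("\t")
                if len(cols) >= 9:  # a well-formed feature line has 9 columns
                    used.add(cols[0])
                    if cols[2] == "region":
                        declared.add(cols[0])

    return sorted(used - declared - fasta_ids)


# Example usage (the path is a placeholder):
# print(contigs_without_length("R_18628.gff"))
```

An empty list would suggest the problem lies elsewhere. Any name returned is a contig whose length a reader of the file would have to guess.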
Adelme
Hi,
I hope you managed to find a solution to your problem. Closing for now. If this issue is still a thing, feel free to re-open it.
Adelme
Hi! I have this error during clustering (ppanggolin cluster -p pangenome.h5):
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mmseqs', 'createdb', '/tmp/tmp84p8ohw5/nucleotid_sequences', '/tmp/tmp84p8ohw5/nucleotid_sequences_db']' returned non-zero exit status 1.