davidemms / OrthoFinder

Phylogenetic orthology inference for comparative genomics
https://davidemms.github.io/
GNU General Public License v3.0

IndexError: list index out of range when inferring multiple sequence alignments and gene trees #736

Open pkfsantos opened 2 years ago

pkfsantos commented 2 years ago

Hi,

I successfully ran OrthoFinder using the protein sequences, but I get an error when I run it again using the nucleotide sequences from the same species. I have DNA sequences for 71 species. The BLAST step and the orthogroup assignment complete successfully. The problem appears when it starts inferring the multiple sequence alignments and gene trees.

The command I used was:

orthofinder -d -f /path_to_DNA_sequences/ -S blast_nucl -M msa

Below is a subset of the output file:

"(...) Writing orthogroups to file

OrthoFinder assigned 882911 genes (93.8% of total) to 49012 orthogroups. Fifty percent of all genes were in orthogroups with 65 or more genes (G50 was 65) and were contained in the largest 5715 orthogroups (O50 was 5715). There were 1639 orthogroups with all species present and 60 of these consisted entirely of single-copy genes.

2022-09-02 14:51:55 : Done orthogroups

Analysing Orthogroups

2022-09-02 14:52:02 : Starting MSA/Trees
Species tree: Using 1039 orthogroups with minimum of 91.5% of species having single-copy genes in any orthogroup

Inferring multiple sequence alignments for species tree

2022-09-02 14:53:30 : Done 0 of 1039
2022-09-02 15:01:35 : Done 100 of 1039
2022-09-02 15:10:42 : Done 200 of 1039
2022-09-02 15:18:35 : Done 300 of 1039
2022-09-02 15:26:47 : Done 400 of 1039
2022-09-02 15:34:21 : Done 500 of 1039
2022-09-02 15:42:38 : Done 600 of 1039
2022-09-02 15:49:13 : Done 700 of 1039
2022-09-02 15:54:45 : Done 800 of 1039
2022-09-02 15:59:59 : Done 900 of 1039

Inferring remaining multiple sequence alignments and gene trees

2022-09-02 16:33:42 : Done 0 of 47974
Traceback (most recent call last):
  File "scripts_of/parallel_task_manager.py", line 209, in Worker_RunCommands_And_Move
  File "scripts_of/trees_msa.py", line 262, in trim_fn
  File "scripts_of/trim.py", line 94, in main
  File "scripts_of/trim.py", line 35, in __init__
IndexError: list index out of range
Traceback (most recent call last):
  File "scripts_of/parallel_task_manager.py", line 209, in Worker_RunCommands_And_Move
  File "scripts_of/trees_msa.py", line 262, in trim_fn
  File "scripts_of/trim.py", line 94, in main
  File "scripts_of/trim.py", line 35, in __init__
IndexError: list index out of range

(...)" The error message keeps repeating and the run never stops.

Any ideas about how to fix it?

Thank you, Priscila
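The traceback ends with an index into an empty list inside the trimming step, which would be consistent with an alignment file that contains no usable sequences, although the thread does not confirm this. As a diagnostic sketch only, the following scans a directory of FASTA alignments and reports files that have no records or that contain records with no residues. The directory path and the `*.fa` pattern are assumptions; point them at wherever your run wrote its alignments.

```python
from pathlib import Path


def count_fasta_records(path):
    """Return (records, records_with_residues) for one FASTA file."""
    n_records = 0
    n_nonempty = 0
    in_record = False
    has_residues = False
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if line.startswith(">"):
                # Close the previous record before starting a new one.
                if in_record and has_residues:
                    n_nonempty += 1
                n_records += 1
                in_record = True
                has_residues = False
            elif line and in_record:
                # A line made only of gap characters does not count as residues.
                if line.replace("-", ""):
                    has_residues = True
    if in_record and has_residues:
        n_nonempty += 1
    return n_records, n_nonempty


def find_suspect_alignments(directory, pattern="*.fa"):
    """List (file name, records, non-empty records) for suspicious files.

    `directory` and `pattern` are hypothetical; adjust them to your run.
    """
    suspects = []
    for path in sorted(Path(directory).glob(pattern)):
        n_records, n_nonempty = count_fasta_records(path)
        if n_records == 0 or n_nonempty < n_records:
            suspects.append((path.name, n_records, n_nonempty))
    return suspects
```

Calling `find_suspect_alignments("/path/to/alignments")` returns an empty list when every file has at least one record and every record has residues. Any file it does report is a candidate to inspect by hand before rerunning.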

stephen-14 commented 1 year ago

Hello Priscila, I ran into the same problem at the same step. How did you fix it?

pkfsantos commented 1 year ago

Hi Stephen, I still haven't found a solution for this. Let me know if you eventually solve it. Best, Priscila

Felipe-ia commented 1 year ago

Hi, I have this problem too. Did you fix it?

stephen-14 commented 1 year ago

Hi everyone, I recently solved my problem. At first, I used DNA sequences as the input for OrthoFinder. Then I read the manual again: "OrthoFinder is simple to use and all you need to run it is a set of protein sequence files (one per species) in FASTA format." After I converted my DNA sequences to protein format, it ran well. I hope this information is helpful to you. Please keep us updated. Best wishes.
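For anyone wanting to try the same workaround, here is a minimal sketch of converting a nucleotide FASTA file to a protein FASTA file. It assumes each record is an in-frame coding sequence (CDS) and that the standard genetic code applies; neither is confirmed in this thread. Unannotated genomic DNA cannot be translated this way and would need gene prediction first. The function names and file paths are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate an in-frame CDS; unknown codons become 'X'."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    protein = [CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)]
    # Drop trailing stop codon(s); internal stops are kept as '*'.
    return "".join(protein).rstrip("*")


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line and header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def translate_fasta(in_path, out_path):
    """Write one translated protein record per input CDS record."""
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            out.write(f">{header}\n{translate(seq)}\n")
```

Running `translate_fasta("species1_cds.fna", "species1.faa")` once per species would produce the one-protein-file-per-species input that the manual describes. Internal `*` characters in the output indicate that a record was probably not an in-frame CDS and is worth checking.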
