Open sharifas opened 4 years ago
Hi, could you provide an example file for testing?
Hi, you are probably right. This part is meant to be used with codon-based alignments, something like this: http://etetoolkit.org/cookbook/ete_build_mixed_types.ipynb
However we should fix ETE to account for these cases.
thanks for reporting!
Hi,
I also have this problem when running ete3 evol using fasta files (primate genomes).
Command: ete3 evol -t $ANALYSIS_FOLDER/Genes/$gene/species_tree.nw --alg $ANALYSIS_FOLDER/Genes/$gene/$gene-clean.fa -o $ANALYSIS_FOLDER/Genes/$gene/$gene-evol-branch --models ${models[]} --tests ${tests[]} --cpu 6 --mark ${marks[*]} >> $ANALYSIS_FOLDER/Genes/$gene/$gene.ete-branch.log
Error log file:
** Running ete evol for...IGLJ3 **
Traceback (most recent call last):
  File "/rds/general/user/cm1118/home/anaconda3/envs/ete3_env/lib/python3.6/site-packages/ete3-3.1.1-py3.6.egg/ete3/evol/utils.py", line 159, in translate
    proteinseq += gencode[sequence[n:n+3]]
KeyError: 'G'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/rds/general/user/cm1118/home/anaconda3/envs/ete3_env/bin/ete3", line 11, in
Example sequences from IGLJ3 fasta
dasNov3 CTGAGTAGACCCAGCCTGGG-CAGGGGCTTATACTTCCTCCATCACAGCTGCAGTGGGGG-AGG-GGCAGGGGCATCACAGGGAGGGTTTTTGTACGAGCCTGAATCACTGTGTTGGGTGTTCGGTGGAGGGACCCAGCTGACCGTCCTAG
eulFla1 ---------------------------------CTTCCTCCAGCACAGCTGCAGCTGGGGCTGGAGCTG--GGGGTCTCGGGGAGGGTTTTTGTACGAGCCTGTGTCACTGTGTTGGGTGTTCGGCGGCGGGACCAAGCTGACCGTCCTAG
eulMac1 ---------------------------------CTTCCTCCAGCACAGCTGCAGCTGGGGCTGGAGCTG--GGGGTCTCGGGGAGGGTTTTTGTACGAGCCTGTGTCACTGTGTTGGGTGTTCGGCGGCGGGACCAAGCTGACCGTCCTAG
gorGor5 ATGAGCAGATGCCACCAGGGCCACTGGCCCCAGCTTCCTCCTTCACAGCTGCAGTGGGGGCTGGGGCTAGGGGCATCCCAGGGAGGGTTTTTGTATGAGCCTGTGTCACAGTGTTGGGTGTTCGGCGGAGGGACCAAGCTGACCGTCCTAG
For now I have decided to leave out the genes outputting this error from my analyses, hoping to re-run once the problem is fixed.
Thank you, Kitty
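A note on the traceback above: the failing key is the single character 'G', and a slice sequence[n:n+3] can only come back shorter than three characters at the very end of a sequence. That suggests the alignment length is not a multiple of three. Below is a minimal sketch for screening an alignment before running ete3 evol. It is plain Python and does not use any ETE API; the helper names read_fasta and codon_problems are hypothetical, made up for this illustration.

```python
def read_fasta(lines):
    """Parse an iterable of FASTA lines into a {name: sequence} dict."""
    seqs, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the name.
            name = line[1:].split()[0]
            seqs[name] = ""
        elif name is not None:
            seqs[name] += line
    return seqs


def codon_problems(seq):
    """Return (offset, chunk) pairs for chunks a strict codon table would reject.

    A chunk is flagged if it is shorter than 3 characters (trailing
    remainder) or if it mixes gaps and nucleotides (e.g. "A--").
    Full-gap codons ("---") are not flagged.
    """
    problems = []
    for n in range(0, len(seq), 3):
        chunk = seq[n:n + 3]
        if len(chunk) < 3 or ("-" in chunk and chunk != "---"):
            problems.append((n, chunk))
    return problems
```

Possible usage, assuming a FASTA file with ">" headers: open the file, pass the handle to read_fasta, and print codon_problems(seq) for each sequence. Any non-empty result points at positions that would need trimming or realigning in a codon-aware way.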
I am using EvolTree to do dN/dS calculations. I have an iphylip file of coding sequences and I am getting the following error: "in translate for nt2 in newcod[1]: IndexError: list index out of range". I checked the 'ete/ete3/evol/utils.py' file and I see that the gencode dictionary handles full gaps "---" and converts them to "-" for the protein sequence, but I don't think it handles partial gaps such as "A--".
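To illustrate the partial-gap case, here is a minimal sketch of a translation function that does not raise on chunks like "A--" or on a trailing incomplete codon. This is not ETE's implementation and not a proposed patch. It is a standalone example using the standard genetic code, and the name tolerant_translate and the choice of "X" for untranslatable chunks are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
GENCODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
GENCODE["---"] = "-"  # a full-gap codon maps to a gap in the protein


def tolerant_translate(seq, unknown="X"):
    """Translate a nucleotide sequence codon by codon.

    Any chunk missing from the table (partial gaps such as "A--",
    ambiguity codes, or a trailing chunk shorter than 3) becomes
    `unknown` instead of raising KeyError or IndexError.
    """
    seq = seq.upper()
    protein = []
    for n in range(0, len(seq), 3):
        protein.append(GENCODE.get(seq[n:n + 3], unknown))
    return "".join(protein)
```

With this sketch, tolerant_translate("ATGA--TGG---G") gives "MXW-X": the partial-gap codon and the leftover single base are both masked rather than crashing. Whether masking is appropriate for a downstream dN/dS analysis is a separate question; fixing the alignment so gaps fall on codon boundaries is probably the safer route.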