AlexGa / Phylostratigraphy

Pipeline for Phylostratigraphy
Apache License 2.0

Error parsing tag <Iteration_query-def>;java.lang.IllegalArgumentException: Incorrect BLAST identifier #5

Closed. xiangyupan closed this issue 2 years ago

xiangyupan commented 2 years ago

Hi Alex, thanks for this excellent tool. I want to calculate the gene age of axolotl proteins and am using this script. However, the following errors occurred. [image] I noticed that all BLAST XML files were generated, but the 'map_BLAST_PS_tables' file is empty. I then ran the Java program separately like this, but the error still occurs. [image]

Following your advice in issue #2, I checked the hit_def info in my XML file, and it seems right? [image]

I am very confused by this error. Could you help me? Thanks for your time and work.

Pan

AlexGa commented 2 years ago

Hi Pan,

Sorry for the late response. It seems that you've missed the spaces before and after "|". When you look at the <Hit_def> tags in your XML files, you'll see that the ID, the organism name, and the taxonomy are separated by " | " (>GeneID | [organism_name] | [taxonomy]). In your query FASTA file you used only "|" without the spaces (>GeneID|[organism_name]|[taxonomy]). The jar throws an error because the BLAST identifier of your query sequence cannot be split correctly. You only need to change "|" to " | " in your input FASTA file and run the pipeline again. Alternatively, if you don't want to run the whole pipeline again, you can change the string within the <Iteration_query-def> tags and add the extra spaces before and after "|".

That should fix the parsing problem.
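
If it helps, the header change could be scripted roughly like this. This is only a minimal sketch and not part of the pipeline: the function names are hypothetical, and it assumes every header contains just the three fields (GeneID, organism name, taxonomy), so that every "|" in a header line is a field separator.

```python
import re

# Hypothetical helper, not part of the Phylostratigraphy pipeline.
# Assumption: each "|" in a header line separates the fields
# GeneID, [organism_name], and [taxonomy].
def normalize_header(line: str) -> str:
    """Surround every "|" in a FASTA header with exactly one space on each side."""
    if not line.startswith(">"):
        return line  # sequence lines are returned unchanged
    # Drop any spaces or tabs around "|" and re-insert a single space per side,
    # so headers that are already correct stay the same.
    return re.sub(r"[ \t]*\|[ \t]*", " | ", line)


def normalize_fasta(text: str) -> str:
    """Apply normalize_header to every line of a FASTA file's content."""
    return "\n".join(normalize_header(line) for line in text.split("\n"))
```

You would read the query FASTA file into a string, pass it through normalize_fasta, and write the result to a new file before running the pipeline again.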

Best

Alex

xiangyupan commented 2 years ago

Hi Alex, thanks for your reply. After I added the spaces before and after the '|', the Java program works well. I will close this issue now. Thanks again.