steineggerlab / ufcg

UFCG: Universal Fungal Core Genes
https://ufcg.steineggerlab.com
GNU General Public License v3.0

v1.0.2 augustus issue #3

Open Sebastien-Raguideau opened 1 year ago

Sebastien-Raguideau commented 1 year ago

When running ufcg (v1.0.2) profile, I get this error message:

ERROR! Failed subcommand :augustus --optCfgFile=/mnt/gpfs/seb/Applications/UFCG/config/ppx.cfg --predictionStart=24228 --predictionEnd=44228 --proteinprofile=/mnt/gpfs/seb/Applications/UFCG/config/model/pro/RPB2.hmm /home/sebr/seb/Database/Fungus_DB/UFCG/tmp/Lobtra1/tmp/Lobtra1/Lobtra1_scaffold_92.fna > /home/sebr/seb/Database/Fungus_DB/UFCG/tmp/Lobtra1/tmp/Lobtra1/Lobtra1_scaffold_92_p34228_RPB2.gff

I checked, and this same genome was processed normally in ufcg v1.0.1. I now have to keep ufcg v1.0.1 for profiling and ufcg v1.0.2 for building trees (the tree module does not work in v1.0.1).

endixk commented 1 year ago

Could you please give me some extra information by answering the following questions:

Sebastien-Raguideau commented 1 year ago

I just realised that I need to change my ufcg env to upgrade augustus. I haven't tested it yet and will tell you if the error persists after the update. Regarding the v1.0.1 tree module issue: it was hanging for more than 1 hour without doing anything. A keyboard interrupt would unfreeze it and also exit it. For instance, if I made a mistake in an argument, it would not tell me; it would just hang, and then after ctrl+c it would print a message about the mistake and exit. The same thing happened when calling ufcg with correct arguments.

Sebastien-Raguideau commented 1 year ago

So I upgraded augustus to 3.5.0, and the issue persists. To answer your question: yes, there are successful augustus runs before the failure. No problem occurs when using v1.0.1. I attached one of the genomes for which annotation fails. Lobtra1.fasta.gz

endixk commented 1 year ago

Thanks to your example file, I realized that the FASTA parsing algorithm is flawed 😅

The program is supposed to extract the FASTA entries whose names exactly match those detected by fastBlockSearch, but the code currently uses substring matching for some reason. If the FASTA headers share a common prefix, as in your example, the program gets confused and fails to extract the precise sequence blocks, which eventually leads to the error.
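
To make the failure mode concrete, here is a minimal Python sketch of the difference between substring and exact matching on FASTA names. It is not UFCG's actual code; the function names and the header names (`scaffold_9`, `scaffold_92`, `scaffold_921`) are hypothetical and chosen only to show what happens when one name is a prefix of another.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Use the first whitespace-delimited token of the header as the name.
            tokens = line[1:].split()
            name, chunks = (tokens[0] if tokens else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def extract_substring(records, target):
    """Flawed lookup: keeps every record whose name merely contains the target."""
    return [(n, s) for n, s in records if target in n]


def extract_exact(records, target):
    """Intended lookup: keeps only the record whose name equals the target."""
    return [(n, s) for n, s in records if n == target]


# Hypothetical headers that share a common prefix.
fasta = """>scaffold_9
AAAA
>scaffold_92
CCCC
>scaffold_921
GGGG
"""

records = parse_fasta(fasta)
print(extract_substring(records, "scaffold_92"))  # also picks up scaffold_921
print(extract_exact(records, "scaffold_92"))      # only scaffold_92
```

With substring matching, a request for `scaffold_92` also returns `scaffold_921`, so any coordinates computed for one scaffold can end up applied to the wrong sequence block.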

This behavior will be fixed in the next release.

endixk commented 1 year ago

This issue should be fixed as of the v1.0.3 release. Please let me know if it persists.

AdamVS commented 5 months ago

I have also had this issue in v1.0.5, with some files passing and others not. It was fixed by sorting and renaming the scaffolds, and possibly also by setting a minimum length.
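
For anyone wanting to try the same workaround, the following Python sketch shows one way to sort, rename, and length-filter scaffolds before profiling. It is an assumption about what such preprocessing could look like, not the exact procedure used above: the function names, the `scaffold` prefix, the zero-padded numbering, and the `min_length` threshold are all hypothetical. Zero-padding keeps every name the same width, so no name is a prefix of another.

```python
def normalize_scaffolds(records, min_length=0, prefix="scaffold"):
    """Drop short sequences, sort by length (longest first), and rename.

    `records` is a list of (name, sequence) pairs. The new names are
    zero-padded to a fixed width so that none is a prefix of another.
    """
    kept = [(n, s) for n, s in records if len(s) >= min_length]
    kept.sort(key=lambda rec: len(rec[1]), reverse=True)  # stable sort
    width = max(len(str(len(kept))), 1)
    return [
        (f"{prefix}_{i:0{width}d}", seq)
        for i, (_, seq) in enumerate(kept, start=1)
    ]


def to_fasta(records, line_width=60):
    """Serialize (name, sequence) pairs as FASTA text."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        for i in range(0, len(seq), line_width):
            lines.append(seq[i:i + line_width])
    return "\n".join(lines) + "\n"


# Hypothetical input; real scaffolds would be read from the genome file.
records = [
    ("scaffold_9", "ACGT"),
    ("scaffold_92", "ACGTACGTAC"),
    ("scaffold_921", "AC"),
]

clean = normalize_scaffolds(records, min_length=3)
print(to_fasta(clean), end="")
```

Whether this addresses the same root cause as the substring-matching bug is not confirmed in this thread.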