Open pstrope opened 1 year ago
Hello,
I will start with a comment for users interested in fungal protein-coding gene prediction.
The GeneMark-ETP algorithm, as published in a 2023 bioRxiv preprint, is designed to find genes in eukaryotic species. Fungi are eukaryotes, so GeneMark-ETP can be used for fungal gene prediction as-is.
In 2008, we published a fungi-specific gene-finding algorithm, GeneMark-ES-fungi https://pubmed.ncbi.nlm.nih.gov/18757608/, which demonstrated better accuracy than general eukaryotic gene finders on fungal species. The increase in accuracy came from improved modeling of the intron branch point in fungal species.
The fungal branch point model was recently incorporated into GeneMark-ETP. Providing the optional command line parameter "--fungus" to GeneMark-ETP switches the ETP algorithm to a novel, unpublished development, GeneMark-ETP-fungi.
Both GeneMark-ETP and GeneMark-ETP-fungi can be used for gene prediction in fungi. The fungal version of ETP is expected to be more accurate.
If you run into an issue or error with the novel GeneMark-ETP-fungi algorithm, please report it to us. While waiting for our response, you may use the general GeneMark-ETP on fungal species.
Now, let's return to the reported issue: GeneMark-ETP failed to run on fungal species.
The reported failure happened before the fungi-specific block in GeneMark-ETP-fungi. The error is more general, so GeneMark-ETP without the "--fungus" option is expected to fail at the same location.
A similar ETP failure was traced to a formatting problem in the input protein file: https://github.com/Gaius-Augustus/BRAKER/issues/577#issuecomment-1452511938
See the comment by JohnUrban from Mar 2: "And looking more into that, for some reason many of the OrthoDB protein sequences end with a period... e.g.:"
We have since accounted for possible errors in user input files and improved the stability of the ETP code: https://github.com/gatech-genemark/GeneMark-ETP/issues/6
Please check whether there is a protein file formatting issue on your side, too. If there is, fix the protein file, update ETP to the latest version, and try ETP again.
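For anyone who wants to check their protein file before rerunning, here is a minimal Python sketch that flags and strips the trailing periods described in the BRAKER comment above. It is not part of GeneMark-ETP: the function names `read_fasta` and `strip_trailing_periods` and the example records are hypothetical, and it only covers the "sequence ends with a period" case, not other possible formatting problems.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def strip_trailing_periods(records):
    """Return (cleaned_records, flagged_headers).

    A record is flagged when its sequence ends with one or more '.'
    characters; those trailing periods are removed in the cleaned copy.
    """
    cleaned, flagged = [], []
    for header, seq in records:
        if seq.endswith("."):
            flagged.append(header)
            seq = seq.rstrip(".")
        cleaned.append((header, seq))
    return cleaned, flagged


# Made-up example records, for illustration only.
example = [">prot1", "MKTAYIAK", "QRQISF.", ">prot2", "MSTNPKPQ"]
cleaned, flagged = strip_trailing_periods(read_fasta(example))
print("sequences with trailing period:", flagged)
```

To run it on a real file, pass an open file handle to `read_fasta` instead of the example list, then write the cleaned records to a new FASTA file and give that file to ETP.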
Thank you for the info, Alex
Hi,
I am testing GeneMark-ETP on fungal genomes, but I haven't been successful so far. Does this work on fungi yet? The error I get is