weberlab-hhu / Helixer

Using Deep Learning to predict gene annotations
GNU General Public License v3.0

Masked or non-masked genomes #99

Open GoliczGenomeLab opened 1 year ago

GoliczGenomeLab commented 1 year ago

Hi,

We have a large genome (13 Gb) composed mostly of repetitive elements. Does Helixer recognize soft masking, or does it do some internal masking of its own?

All the best, Agnieszka

alisandra commented 1 year ago

Hi Agnieszka,

Thanks for your question. Helixer does not recognize soft masking. It will recognize hard masking.

In my experience Helixer performs well, particularly for an ab initio caller, at predicting intergenic regions, even in repetitive sequence and without masking. That said, there is currently one major exception: it overpredicts on fragmented contigs and at sequence ends. We'd like to fix this but haven't yet, so I just wanted to give you a quick heads-up.

All the best, Alisandra
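
For readers who want to compare masked and unmasked runs: since soft masking is not recognized but hard masking is, one option is to convert a soft-masked FASTA to a hard-masked one before prediction. The sketch below is not part of Helixer. It assumes the common conventions that soft-masked bases are lowercase and that hard masking means replacing them with `N`; the function names `hard_mask_fasta_lines` and `hard_mask_fasta` are hypothetical. Given the reply above, masking may well be unnecessary, so treat this as an experiment rather than a recommendation.

```python
def hard_mask_fasta_lines(lines):
    """Yield FASTA lines with soft-masked (lowercase) bases replaced by 'N'.

    Assumption: lowercase letters mark soft-masked sequence, and 'N' is the
    hard-mask character. Header lines (starting with '>') pass through as-is.
    """
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            # Newlines and uppercase bases are not lowercase, so they are kept.
            yield "".join("N" if ch.islower() else ch for ch in line)


def hard_mask_fasta(in_path, out_path):
    """Stream a soft-masked FASTA file to a hard-masked copy, line by line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(hard_mask_fasta_lines(src))
```

Processing line by line keeps memory use flat, which matters for a genome of this size. Note that hard masking discards the underlying sequence, so keep the original file.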

agolicz commented 1 year ago

Dear Alisandra,

Many thanks for your reply. Sorry I did not see it earlier (I had not noticed that I left the comment while logged in with a group account rather than my personal one!). We do see somewhat more genes predicted compared to the official annotation, and we are looking into those. Overall, I love the idea behind Helixer. Thanks for a great tool!

All the best, Agnieszka