sii-companion / companion

A genome annotation tool for more than just protists
https://companion.ac.uk/
ISC License

Prevent genometools Assertion Error when calling remove_leaf #38

Closed haessar closed 1 year ago

haessar commented 1 year ago

An ongoing problem affecting the generation of circos plots was traced back to an assertion error raised during the split_splice_models_at_gaps process, specifically at https://github.com/iii-companion/companion/blob/master/bin/split_genes_at_gaps.lua#L94. I dealt with this by allowing the circos processes to fail silently so as not to derail the entire pipeline - a "sticking plaster" approach.

It has recently become apparent that the same assertion errors might be causing greatly reduced annotations in the final GFF3 output. Compare https://companion.gla.ac.uk/jobs/6d34bf395da0e4840e772c6e and https://companion.gla.ac.uk/jobs/1de42786d7fdd57ff6951caa, which have similar assemblies and the same parametrisation but vastly different numbers of genes in the output.

I opened a genometools ticket to ask whether the assertion error was indeed a bug, and a PR is currently active which will ensure that a Lua runtime error is thrown instead of the assertion error from C.

The assertion error is triggered on the Companion side because remove_leaf is sometimes called on a feature node that isn't a leaf (i.e. an mRNA feature which has child CDS features). This simple fix to Companion ensures that any such children are removed before the mRNA feature itself is removed, so split_genes_at_gaps.lua no longer halts execution early as it previously did.
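To illustrate the ordering described above, here is a minimal sketch in Python using a toy feature tree. It is not the GenomeTools API or the actual Lua change: the `FeatureNode` class, its `remove_leaf` method (which here only handles direct children) and the `remove_with_children` helper are all hypothetical stand-ins, and the `ValueError` merely mimics the assertion failing on a non-leaf node.

```python
class FeatureNode:
    """Toy stand-in for an annotation feature (hypothetical, not GenomeTools)."""

    def __init__(self, ftype, children=None):
        self.ftype = ftype
        self.children = list(children or [])

    def remove_leaf(self, node):
        """Detach a direct child, refusing if it still has children of its own."""
        if node.children:
            # Mimics the assertion that fires when the node is not a leaf.
            raise ValueError(f"{node.ftype} is not a leaf")
        self.children.remove(node)


def remove_with_children(parent, node):
    """Remove node from parent, detaching its descendants first (post-order)."""
    for child in list(node.children):  # iterate over a copy: the list shrinks
        remove_with_children(node, child)
    parent.remove_leaf(node)


# gene -> mRNA -> two CDS features
cds1, cds2 = FeatureNode("CDS"), FeatureNode("CDS")
mrna = FeatureNode("mRNA", [cds1, cds2])
gene = FeatureNode("gene", [mrna])

# Removing the mRNA directly fails because it still has CDS children.
try:
    gene.remove_leaf(mrna)
    failed = False
except ValueError:
    failed = True

# Removing the children first makes the mRNA a leaf, so removal succeeds.
remove_with_children(gene, mrna)
```

The point of the sketch is only the order of operations: children are detached before their parent, so every call to `remove_leaf` sees a genuine leaf.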

haessar commented 1 year ago

MAJOR issue in production - all jobs failing with CIRCOS error

haessar commented 1 year ago

Happy with tests - ready to merge.