sahammond / gnavigator

Use cDNA alignments with or without genetic map information to evaluate the completeness and correctness of a de novo genome or transcriptome assembly.
GNU General Public License v3.0

Genetic map assessment uses non-complete cDNAs? #12

Closed lcoombe closed 5 years ago

lcoombe commented 5 years ago

Hi Austin,

Hope you're well! I was working on slightly changing how we report the GM summary from the cDNA perspective, so I was taking a look at some examples. I thought that only 'Complete' cDNAs were used for the genetic map assessment, but it doesn't look like that's the case?

Example -

[lcoombe@gphost06 tmp]$ grep "WS03215_F02" WS-v1-tigmint-ARCS-Sealer-cobbler-rails-abyssLS_tigmint_1000plus-full-cDNA-results-table.tsv
WS03215_F02     Poorly mapped   2401956 LG02
[lcoombe@gphost06 tmp]$ grep "WS03215_F02" Pglauca_geneticMap_v2.GCATfilter.tsv 
LG02    80.51   WS03215_F02
[lcoombe@gphost06 tmp]$ grep "WS03215_F02" WS-v1-tigmint-ARCS-Sealer-cobbler-rails-abyssLS_tigmint_1000plus-full-genetic-map-results-table.tsv 
2401956 GQ04007_P20 WS03215_F02 Same LG, expected order LG02

This cDNA is in the GM but is 'poorly mapped', yet we still use it for the GM assessment.
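The same spot check could be run across the whole table rather than one cDNA at a time. Here is a minimal sketch, assuming the column layouts seen in the grep output above (cDNA table: ID, status, scaffold, LG; GM table: scaffold, two cDNA IDs, verdict, LG) and that complete cDNAs carry the label `Complete`. The function names are hypothetical and not part of Gnavigator, and the status shown for GQ04007_P20 is invented for illustration.

```python
import csv
import io


def parse_tsv(text):
    """Split tab-separated text into a list of rows (lists of fields)."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]


def noncomplete_in_gm(cdna_rows, gm_rows, complete_label="Complete"):
    """List cDNAs used in the GM results whose status is not complete_label."""
    # Assumed cDNA table columns: ID, status, scaffold, LG.
    status = {row[0]: row[1] for row in cdna_rows}
    flagged = set()
    for row in gm_rows:
        # Assumed GM table columns: scaffold, cDNA, cDNA, verdict, LG.
        for cdna in row[1:3]:
            if status.get(cdna, "missing") != complete_label:
                flagged.add(cdna)
    return sorted(flagged)


# The WS03215_F02 rows mirror the example above; the GQ04007_P20 status
# is made up for this sketch.
cdna_table = (
    "WS03215_F02\tPoorly mapped\t2401956\tLG02\n"
    "GQ04007_P20\tComplete\t2401956\tLG02\n"
)
gm_table = "2401956\tGQ04007_P20\tWS03215_F02\tSame LG, expected order\tLG02\n"

result = noncomplete_in_gm(parse_tsv(cdna_table), parse_tsv(gm_table))
print(result)  # ['WS03215_F02']
```

An empty list would mean every cDNA in the GM results table is reported as complete.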

I'm just wondering if that was what was intended, and what the rationale for that is? I was wondering if including these non-complete cDNAs could prove to be an issue?

Thank you! Lauren

sahammond commented 5 years ago

Hi Lauren,

Hmm, that's unexpected behaviour. Only complete, single-copy cDNAs are supposed to be reported in the GM results table. A cDNA can have lower-quality alignments in addition to a complete one (these aren't reported, or really checked for, since Gnavigator says 'good enough' after finding one or more complete alignments), but your example suggests that there is no complete alignment for WS03215_F02 at all.

Is it possible that Gnavigator is reading from a different alignment file than intended? Its autodetect function is pretty simplistic.

I'll send you a link to a spot where you can share the outputs :)

Thanks! Austin

sahammond commented 5 years ago

Thanks for sharing your files! Gnavigator was erroneously checking the LG and order of all uniquely-aligned cDNAs found in the genetic map instead of just the complete ones. The latest commit (2c5b6c0) should enact the intended behaviour. Please let me know if it works for you, or if the new results don't look right.
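To illustrate the difference, here is a rough sketch of the intended selection logic. It is not the actual Gnavigator code: the function name, the data layout (a dict mapping each cDNA ID to a status and an alignment count), and the placeholder IDs `cDNA_A` to `cDNA_C` are all hypothetical.

```python
def select_for_gm_check(alignments, genetic_map_ids, complete_label="Complete"):
    """Return the cDNA IDs that should be checked against the genetic map.

    alignments: dict of cDNA ID -> (status, number of alignments).
    genetic_map_ids: set of cDNA IDs present in the genetic map.
    """
    eligible = []
    for cdna_id, (status, n_hits) in alignments.items():
        in_map = cdna_id in genetic_map_ids
        uniquely_aligned = n_hits == 1
        # Dropping the status test on the next line reproduces the bug:
        # every uniquely-aligned cDNA in the map would be checked.
        if in_map and uniquely_aligned and status == complete_label:
            eligible.append(cdna_id)
    return sorted(eligible)


# Hypothetical inputs; only WS03215_F02 comes from the example in this issue.
alignments = {
    "WS03215_F02": ("Poorly mapped", 1),
    "cDNA_A": ("Complete", 1),
    "cDNA_B": ("Complete", 2),  # complete but not single-copy
    "cDNA_C": ("Complete", 1),  # complete but absent from the map
}
genetic_map_ids = {"WS03215_F02", "cDNA_A", "cDNA_B"}

selected = select_for_gm_check(alignments, genetic_map_ids)
print(selected)  # ['cDNA_A']
```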

Thanks, Austin

lcoombe commented 5 years ago

Thank you so much for this fix, Austin! I did a test run and did a couple of spot checks, and everything looks good now.