sahammond / gnavigator

Use cDNA alignments with or without genetic map information to evaluate the completeness and correctness of a de novo genome or transcriptome assembly.
GNU General Public License v3.0

Option for full report of genetic map results #4

Closed lcoombe closed 6 years ago

lcoombe commented 6 years ago

Hi Austin,

Just thinking that it would also be useful to have a 'full' genetic map output in addition to the summary outputs. You have this for the cDNA classifications, but it would be useful to know the identities of the cDNAs that map to the same scaffold, how that scaffold was classified, and perhaps the order in which the cDNAs were observed on that scaffold. Not critical, just thought it would be useful as we figure out the best way to use the genetic map!

Thanks! Lauren

sahammond commented 6 years ago

Good idea! That should be pretty quick to implement. I'll do that along with adding wiggle room for cases where the cDNAs have the same apparent location in the map.

sahammond commented 6 years ago

I decided to write the full table of results automatically, just like for the cDNAs themselves, and push the commit now rather than wait for the cDNA wiggle tweak.

Feature added in bff916b.

Please let me know if you think a different format or column organization would be better.

lcoombe commented 6 years ago

Thanks Austin!

So I just gave commit bff916b a try using one of the WS assemblies that I already had the gmap alignments for - I'm getting this error:

[lcoombe@hpce704 gnav_commitbff916b]$ ~/miniconda2/envs/numbers/bin/python /projects/btl/lcoombe/git/gnavigator/gnavigator.py -p WS-v1-tigmint-ARCS-Sealer_cobbler-rails-a500s0.99_1000plus -m LM3-work-version-Feb2014_Jean_to_Inanc.txt GCAT_WS-3.3.cluseq.noGaps.simple.fa WS-v1-tigmint-ARCS-Sealer_cobbler-rails-a500s0.99_1000plus.fa 

=== Skipping GMAP alignment stage ===
Gnavigator found pre-existing GMAP alignment results. Will use the following files:
/projects/spruceup_scratch/pglauca/WS77111/analyses/gnavigator/WS-v1-tigmint-ARCS-Sealer_cobbler-rails/gnav_commitbff916b/WS-v1-tigmint-ARCS-Sealer_cobbler-rails-a500s0.99_1000plus.uniq
/projects/spruceup_scratch/pglauca/WS77111/analyses/gnavigator/WS-v1-tigmint-ARCS-Sealer_cobbler-rails/gnav_commitbff916b/WS-v1-tigmint-ARCS-Sealer_cobbler-rails-a500s0.99_1000plus.mult
/projects/spruceup_scratch/pglauca/WS77111/analyses/gnavigator/WS-v1-tigmint-ARCS-Sealer_cobbler-rails/gnav_commitbff916b/WS-v1-tigmint-ARCS-Sealer_cobbler-rails-a500s0.99_1000plus.transloc
Current time: 2018-03-08 09:18:43

=== GNAVIGATOR cDNA RESULTS ===
11207 (41.29%) complete sequences
4189 (15.43%) duplicated sequences
3490 (12.86%) fragmented sequences
2750 (10.13%) partial sequences
4123 (15.19%) poorly mapped sequences
1384 (5.1%) missing sequences
27143 (100.0%) sequences were evaluated
Current time: 2018-03-08 09:19:53
Traceback (most recent call last):
  File "/projects/btl/lcoombe/git/gnavigator/gnavigator.py", line 507, in <module>
    for t in LG_table_formatter(res):
  File "/projects/btl/lcoombe/git/gnavigator/gnavigator.py", line 219, in LG_table_formatter
    outbuff = "\t".join([nam, cDNA, stat])
TypeError: sequence item 0: expected string, numpy.int64 found

Looks like "nam" is expected to be a string, but isn't?

sahammond commented 6 years ago

Ah, that happens when scaffold IDs are solely numeric. I forgot to add the format conversion that I do elsewhere to handle those cases. Working on the fix now. Should be fixed by c4852a8.
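For readers who hit the same traceback, here is a minimal sketch of the failure and one way to guard against it. It is not the code from c4852a8: `format_lg_row` is a hypothetical helper, a plain `int` stands in for the `numpy.int64` scaffold ID, and the space-separated cDNA column is an assumption. The idea is to pass every field through `str()` before calling `join`.

```python
def format_lg_row(scaffold, cdna_ids, status):
    """Hypothetical helper: build one tab-separated row of the full table."""
    # str() makes purely numeric scaffold IDs (int, numpy.int64, ...) safe
    # to join; the space-separated cDNA column is an assumption.
    fields = [str(scaffold), " ".join(str(c) for c in cdna_ids), str(status)]
    return "\t".join(fields)


# str.join only accepts strings, so a numeric scaffold ID raises TypeError.
# A plain int stands in here for the numpy.int64 seen in the traceback.
try:
    "\t".join([14, "GQ0033_B03 GQ03718_G13", "Different LG"])
except TypeError as err:
    print("join failed:", err)

# With the conversion in place, numeric and string IDs format the same way.
print(format_lg_row(14, ["GQ0033_B03", "GQ03718_G13"], "Different LG"))
```

Converting at the point of output leaves the IDs in their original type everywhere else, so any numeric handling upstream is unaffected.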

lcoombe commented 6 years ago

Looks good!

# Scaffold      cDNA IDs        Status
14      GQ0033_B03 GQ03718_G13  Different LG
1796    GQ03210_I03 GQ0183_C03  Different LG
1756    GQ03612_N16 GQ0187_B07  Different LG
48      GQ04005_O07 GQ0207_E14  Different LG
796     GQ0208_B04 GQ04104_A08  Different LG
1083    GQ02760_G09 GQ03239_A14 Different LG
211078  GQ02809_N07 WS00840_J13 Different LG
2734    GQ03101_L16 GQ03006_M20 Different LG
5840    GQ03103_I13 GQ03809_F14 Different LG
6839    GQ04002_N11 GQ03113_A04 Different LG
528     GQ03506_I15 GQ03236_A04 Different LG
69383   GQ03707_J09 GQ03414_O14 Different LG
16080   GQ03711_H19 GQ04104_C19 Different LG
40      GQ03918_F22 WS02619_N01 Different LG
7303    GQ03233_I21 GQ0015_F01  Same LG, right order
311     GQ02819_E10 GQ0021_M13  Same LG, right order

Thanks Austin!

sahammond commented 6 years ago

You're welcome! Will close for now but feel free to reopen if problems arise.