Closed swarbred closed 4 years ago
@lucventurini I haven't checked beyond specific known examples, but I would like to bring together all the recent changes that we believe fix the open problems, including the serialise memory issue, so that I can then use this version for a larger run.
It is up to you whether to merge to master at this point or to another branch containing all these changes.
@swarbred I will merge to master; I think it is time to bring everything together.
I will close #255, #263, #266 and #267 at the same time.
@swarbred
Could you please test 0c57a76 (new develop branch), which squashes together the edits coming from four different branches?
There were no conflicts in merging, which is good (i.e. the branches really were working on completely different parts of the code).
After testing, we can put these changes in master and close four issues.
@swarbred
I am really sorry to say that unfortunately I woke up and realised that there was a bug in the retained intron procedure :-(
In the following figure:
0c57a76 would not mark any of these as having a retained intron, let alone having their CDS mangled by having one. This is clearly wrong, I think.
I fixed the situation in d094f995 (still on the develop branch). Many apologies for this (I did say that this part of the code has me tearing my hair out!)
@lucventurini :-( ok, I will install and rerun my runs later today. Can I clarify that the issue is specific to single exon models as shown above? If so, then I agree with your fix, but it's less of an issue for my data :-) as we will be excluding these transcripts for other reasons.
Also, I assume that since, as you indicated, this is on the develop branch, it includes all the same recent changes as 0c57a76.
> @lucventurini :-( ok I will install and rerun my runs later today, can I clarify that the issue is specific to single exon models as shown above. If so then I agree with your fix but it's less of an issue for my data :-) as we will be excluding these transcripts for other reasons.

Yes, it's specific to single exon models only.

> also I assume as you indicated this is on the develop branch it includes all the recent changes as 0c57a76

Yes, correct.
@lucventurini Based on my full runs, I consider this resolved in d094f99
I'm seeing a difference between mikado-2.0rc4 and mikado-2.0rc6_* versions in relation to calling and exclusion of transcripts with retained introns.
I'm attaching a screenshot of this region http://apollo.tgac.ac.uk/Myzus_persicae_O_v2_genome_browser/jbrowse/?loc=scaffold_1%3A58658101..58667080&tracks=DNA%2CAnnotations%2CMikado_annotation_run6_classification%2CMikado_integration_run4%2CScallop_lncRNA%2CStringtie_lncRNA%2CYa_locus&highlight=
The input models are shown in track ..run6_classification; the output models from running mikado version mikado-2.0rc6_3f62484_CBG are shown in track mikado integration run 4.
Two models from the original input are excluded: mikado.scaffold_1G6704.4 (correctly, as a retained intron transcript) and mikado.scaffold_1G6704.3.
I don't think mikado.scaffold_1G6704.3 should be viewed as a retained intron transcript, and running the previous versions mikado-2.0_rc1 and mikado-2.0rc4 on the same input models and config gives mikado.scaffold_1G6704.3 in the output.
For my own knowledge, Luca, can you confirm that for the retained intron check the order (i.e. the relative scoring) of the transcripts matters? That is, each potential alternative splice model is assessed against the primary model and the other models already added to the locus. So if a transcript is the second highest scoring, it might not be regarded as having a retained intron relative to the primary model; but if the same transcript scored lower, so that other transcripts were added before its retained intron check was made, it could have a retained intron relative to those and be excluded.
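
To make the question concrete, here is a toy sketch of the order dependence I have in mind. It is not Mikado's actual code: the function names (`introns`, `retains_intron`, `build_locus`), the coordinates, and the simplified rule ("a candidate exon fully covers an intron of an already accepted model") are all assumptions made for illustration, and the real check may well consider more (e.g. the CDS).

```python
from typing import List, Tuple

Exon = Tuple[int, int]            # inclusive (start, end), hypothetical coordinates
Model = Tuple[str, List[Exon]]    # (transcript name, exons)


def introns(exons: List[Exon]) -> List[Exon]:
    """Return the gaps between consecutive exons as inclusive intervals."""
    ordered = sorted(exons)
    return [(a[1] + 1, b[0] - 1) for a, b in zip(ordered, ordered[1:])]


def retains_intron(candidate: List[Exon], reference: List[Exon]) -> bool:
    """Simplified rule: some candidate exon fully covers a reference intron."""
    return any(
        e_start <= i_start and i_end <= e_end
        for e_start, e_end in candidate
        for i_start, i_end in introns(reference)
    )


def build_locus(ranked: List[Model]) -> List[str]:
    """Add models in descending score order; the first one is the primary.

    Each candidate is compared against every model accepted so far,
    so the outcome can depend on the ranking.
    """
    accepted: List[Model] = [ranked[0]]
    for name, exons in ranked[1:]:
        if any(retains_intron(exons, ref) for _, ref in accepted):
            continue  # excluded as a retained intron transcript
        accepted.append((name, exons))
    return [name for name, _ in accepted]


# Hypothetical models: C retains an intron of B, but not of the primary P.
P: Model = ("P", [(100, 200), (500, 600)])
B: Model = ("B", [(100, 200), (300, 320), (340, 360), (500, 600)])
C: Model = ("C", [(100, 200), (300, 360), (500, 600)])

print(build_locus([P, C, B]))  # C ranked second: checked against P only
print(build_locus([P, B, C]))  # C ranked third: also checked against B
```

With these made-up models, C survives when it is ranked directly after the primary, but is dropped when B is added first, because its exon 300..360 covers B's intron 321..339. If the check works roughly like this, a change in relative scores between versions would be enough to change which transcripts are excluded.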
Correct version mikado-2.0rc4
output directory
Incorrect? version
output directory /tgac/workarea/group-ga/Projects/CB-GENANNO-444_Myzus_persicae_clone_O_v2_annotation/Analysis/mikado-2.0rc4/annotation_run2/mikado-2.0_rc1_run6/integration/integration_run4_dstest8