Closed by lydiayliu 2 years ago
I'll look into this transcript
is ENST00000265138.4 the last transcript ID being printed out?
yes in both cases. the log is here:
/hot/users/yiyangliu/MoPepGen/Variant/Fusion/arriba-2.1.0/ssm/CPCG0100.winu.3f.log
oh that's the same transcript as #297
The gene of this transcript has three fusions located in its region, two of which have only one transcript each, while the other has 21 transcripts, so in the GVF parsed from the Arriba output there are 23 entries associated with this transcript. The acceptor gene is pretty large and carries a lot of variants; some of its transcripts have more than 100 variants (SNV/indel). All of this makes the graph really big: after applying all variants to the ThreeFrameTVG, there are 15600 nodes in total. This is just the unaligned variant graph, before fitting into codons. I haven't successfully translated it and created the cleavage graph yet, and I believe it is going to be even larger.
To fully resolve this issue, I can't think of a way other than creating a 'splice graph' where each node is the sequence between any two splice sites, which is going to be a big project. Otherwise, I think we have to use some rules to limit the size of the graph, for example limiting the number of nucleotides of the acceptor transcript, or limiting the number of transcripts (for example, only considering breakpoints in exons). Any thoughts?
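For illustration, the "only consider breakpoints in exons" rule could look roughly like the sketch below. The record layout (`acceptor_transcript`, `acceptor_breakpoint`), the half-open exon intervals, and both function names are assumptions made up for this example; they are not moPepGen's actual data model or API.

```python
from typing import Dict, List, Tuple

Exon = Tuple[int, int]  # half-open [start, end) interval, hypothetical layout


def breakpoint_in_exon(breakpoint: int, exons: List[Exon]) -> bool:
    """Return True if the breakpoint falls inside any exon interval."""
    return any(start <= breakpoint < end for start, end in exons)


def keep_exonic_fusions(fusions: List[dict],
                        exons_by_transcript: Dict[str, List[Exon]]) -> List[dict]:
    """Keep only fusion records whose acceptor breakpoint is exonic."""
    return [
        f for f in fusions
        if breakpoint_in_exon(f["acceptor_breakpoint"],
                              exons_by_transcript[f["acceptor_transcript"]])
    ]


# Toy data, not taken from the log in this thread.
exons_by_transcript = {"tx_A": [(100, 200), (300, 400)]}
fusions = [
    {"acceptor_transcript": "tx_A", "acceptor_breakpoint": 150},  # exonic
    {"acceptor_transcript": "tx_A", "acceptor_breakpoint": 250},  # intronic
]
kept = keep_exonic_fusions(fusions, exons_by_transcript)
```

A filter like this would shrink the number of fusion entries per transcript before any graph is built, but it would not reduce the size of the graph for a single large acceptor transcript.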
To fully resolve this issue, I can't think of a way other than creating a 'splice graph' where each node is the sequence between any two splice sites, which is going to be a big project.
yeah let's put that on hold XD
I have a question actually. Sooo with the current set up, if we have a fusion event that involves 2 donor transcripts and 21 acceptor transcripts, are we doing the following:
is that correct? if so, why not pair one donor transcript with one acceptor transcript at a time?
That's actually not a bad idea at all!
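A minimal sketch of the one-donor-one-acceptor idea, using the 2 donor and 21 acceptor transcripts from the question above. The transcript names and `build_pair_graph` are hypothetical placeholders, not moPepGen functions; the point is only that each pair gets its own small graph that can be processed and dropped before the next one is built.

```python
from itertools import product
from typing import Iterable, Iterator, Tuple


def iter_fusion_pairs(donors: Iterable[str],
                      acceptors: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Lazily yield every (donor, acceptor) transcript pair."""
    return product(donors, acceptors)


def build_pair_graph(donor: str, acceptor: str) -> dict:
    """Hypothetical stand-in for building one graph from a single pair."""
    return {"donor": donor, "acceptor": acceptor}


donors = ["donor_tx_1", "donor_tx_2"]
acceptors = [f"acceptor_tx_{i}" for i in range(1, 22)]  # 21 transcripts

n_pairs = 0
for donor, acceptor in iter_fusion_pairs(donors, acceptors):
    graph = build_pair_graph(donor, acceptor)  # only one graph alive at a time
    n_pairs += 1
```

With this layout there are 2 x 21 = 42 small graphs instead of one large one, so peak memory would presumably be bounded by the largest single pair rather than by all transcripts combined.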
didn't die here as per f7d21e9
I've tried it twice and the run gets Killed at this transcript on F32, using all 62.76GiB of mem. Could try on F72, but there's likely a problem?