yhoogstrate / fuma

:dash::leopard: FuMa: reporting overlap in RNA-seq detected fusion genes
GNU General Public License v3.0

Refactored code #32

Closed yhoogstrate closed 8 years ago

yhoogstrate commented 8 years ago

The aim of this PR is to refactor the code. Initially the code created sub-datasets, because they were expected to be exported time-wise and they were very handy for running unit tests. As this is no longer part of the plan, I am refactoring the code to do only the task it is supposed to do, as described in the manuscript: create one large overview of all fusion genes. This refactoring reduces memory usage again, and this time it is a really big win for large numbers of samples.

For now it seems that the output is exactly the same, except for the order of the output.

Todo's:

yhoogstrate commented 8 years ago

For subset matching the following happens with the current implementation:

1=F1(A,B)
2=F2(B,C)
3=F3(A,B,C)

3=MF(F1,F3)=>MF(A,B)

Merging F2(B,C) with MF(A,B) then results in no match, even though (B,C) is a subset of F3(A,B,C).
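
The case above can be reproduced with plain Python sets. This is only a minimal sketch and not FuMa's actual code: `subset_match` and `merge` are hypothetical helpers, and it assumes that a merged fusion keeps only the genes shared by both inputs, as the `MF(A,B)` notation suggests.

```python
# Hypothetical helpers for illustration; these are not FuMa functions.

def subset_match(a, b):
    """Return True when one gene set is a subset of the other."""
    return a <= b or b <= a


def merge(a, b):
    """Assumption: a merged fusion keeps only the genes shared by both."""
    return a & b


F1 = frozenset({"A", "B"})
F2 = frozenset({"B", "C"})
F3 = frozenset({"A", "B", "C"})

# F1 is a subset of F3, so the two match and are merged into MF(A,B).
MF = merge(F1, F3)

# F2 matches F3 directly, but no longer matches the merged fusion.
f2_matches_f3 = subset_match(F2, F3)
f2_matches_mf = subset_match(F2, MF)

print(sorted(MF), f2_matches_f3, f2_matches_mf)
```

Under these assumptions the result depends on the order in which fusions are merged: F2 would have matched F3 if it had been compared before F1 narrowed the gene set.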