Closed by jjc2718 3 years ago
I don't think this is particularly slow relative to the rest of the mutation prediction process, so I'm going to close this. If this becomes a bottleneck in the future we can revisit, but I don't foresee this being a high priority.
See here: https://github.com/greenelab/mpmp/pull/29#discussion_r590807995
It should be possible to cache the set of samples that the different datasets have in common, which would likely be faster than recomputing the intersection from the sample_info files each time.