Adoni5 / mappy-rs

Python wrapping multithreaded bindings to minimap2
MIT License

Multi threaded interface #13

Open alexomics opened 1 year ago

alexomics commented 1 year ago

We can extend the base mappy-rs.Aligner, which is currently single-threaded, to use multiple threads on iterables of data.

Proposed minimal interface:

Extended interface:

alexomics commented 1 year ago

I suppose there is also a choice between making this an impl on Aligner and adding a separate ThreadedAligner.

Adoni5 commented 1 year ago

Yes I like this

Adoni5 commented 1 year ago

I can't decide what the right approach is regarding a ThreadedAligner. If we add a ThreadedAligner, what reason is there to implement mappy compatibility on Aligner, when we could just ignore it in favour of ThreadedAligner?

alexomics commented 1 year ago

Hypothetically:

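use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use pyo3::{types::PyDict, Py};

pub struct ThreadedAligner {
    pub aligner: Aligner,
    // Py<PyDict> is an owned, Send handle, so it can sit in a shared queue.
    work_queue: Arc<Mutex<VecDeque<Py<PyDict>>>>,
    result_queue: Arc<Mutex<VecDeque<Py<PyDict>>>>,
}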

Where Aligner is the mappy-rs::Aligner.
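
To make the two-queue idea concrete, here is a minimal, std-only sketch of how worker threads could drain a work queue and fill a result queue. It is an illustration under assumptions, not mappy-rs code: `Aligner` here is a hypothetical stand-in whose `map` just returns the sequence length, the queues hold plain Rust values rather than Python dicts, and `map_batch` and its `n_threads` parameter are made-up names.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real aligner (assumption for this sketch).
pub struct Aligner;

impl Aligner {
    /// Placeholder "alignment": returns the sequence length, not real mappings.
    pub fn map(&self, seq: &str) -> usize {
        seq.len()
    }
}

pub struct ThreadedAligner {
    pub aligner: Aligner,
    // Each job carries its input index so results can be re-ordered later.
    work_queue: Arc<Mutex<VecDeque<(usize, String)>>>,
    result_queue: Arc<Mutex<VecDeque<(usize, usize)>>>,
}

impl ThreadedAligner {
    pub fn new(aligner: Aligner) -> Self {
        ThreadedAligner {
            aligner,
            work_queue: Arc::new(Mutex::new(VecDeque::new())),
            result_queue: Arc::new(Mutex::new(VecDeque::new())),
        }
    }

    /// Map every sequence using `n_threads` workers; output is in input order.
    pub fn map_batch<I>(&self, seqs: I, n_threads: usize) -> Vec<usize>
    where
        I: IntoIterator<Item = String>,
    {
        // Fill the work queue before any worker starts.
        self.work_queue
            .lock()
            .unwrap()
            .extend(seqs.into_iter().enumerate());

        // Scoped threads may borrow `self`; all are joined when the scope ends.
        thread::scope(|s| {
            for _ in 0..n_threads.max(1) {
                s.spawn(|| loop {
                    // The lock guard is dropped at the end of this statement,
                    // so the queue is not held while mapping.
                    let job = self.work_queue.lock().unwrap().pop_front();
                    match job {
                        Some((idx, seq)) => {
                            let hit = self.aligner.map(&seq);
                            self.result_queue.lock().unwrap().push_back((idx, hit));
                        }
                        None => break, // queue is empty, worker exits
                    }
                });
            }
        });

        // Workers finish in arbitrary order, so sort by the input index.
        let mut results: Vec<(usize, usize)> =
            self.result_queue.lock().unwrap().drain(..).collect();
        results.sort_by_key(|&(idx, _)| idx);
        results.into_iter().map(|(_, hit)| hit).collect()
    }
}
```

Calling `ThreadedAligner::new(Aligner).map_batch(reads, 4)` on a `Vec<String>` returns one value per read, in input order. Tagging each job with its index is one way to keep results ordered; a real implementation might prefer channels or yield results as they complete.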

The reason to keep mappy compatibility is drop-in use in single-threaded apps. I'm not sure we want to add the queue baggage to the base case, though it would make for a nicer one-stop shop.

Adoni5 commented 1 year ago

So the One-Stop-Shop thing is what I'm thinking of. If you are only doing single threaded cases, you should just be using regular mappy, so we shouldn't worry about the queue baggage on Aligner?

All we should worry about is returning mappy-compatible mappings, so downstream processing isn't affected.
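
As a rough illustration of what "mappy compatible" could mean on the Rust side, the sketch below mirrors a subset of the attribute names that mappy's `Alignment` object exposes in Python (`ctg`, `r_st`, `r_en`, `q_st`, `q_en`, `strand`, `mapq`, `is_primary`). The struct name `Mapping`, the field types, and the `ref_span` helper are assumptions made for this example, not the mappy-rs definition.

```rust
/// Hypothetical record using mappy's attribute names, so that downstream
/// code reading `hit.ctg`, `hit.r_st`, and so on would keep working.
#[derive(Debug, Clone, PartialEq)]
pub struct Mapping {
    pub ctg: String,      // reference sequence name
    pub ctg_len: i32,     // reference sequence length
    pub r_st: i32,        // start on the reference
    pub r_en: i32,        // end on the reference
    pub q_st: i32,        // start on the query
    pub q_en: i32,        // end on the query
    pub strand: i8,       // +1 forward, -1 reverse
    pub mapq: u8,         // mapping quality
    pub is_primary: bool, // primary alignment flag
}

impl Mapping {
    /// Length of the aligned span on the reference (helper for this sketch).
    pub fn ref_span(&self) -> i32 {
        self.r_en - self.r_st
    }
}
```

Keeping the names identical means the threaded path only changes how mappings are produced, not how they are consumed.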

Adoni5 commented 1 year ago

We could also try with rayon if we wanted to?