zaeleus / noodles

Bioinformatics I/O libraries in Rust
MIT License
482 stars 52 forks source link

Add reserve method to Cigar for use in `get_cigar` #206

Closed tshauck closed 11 months ago

tshauck commented 11 months ago

Hi,

I'd like to propose a PR that I think could improve BAM cigar parsing performance. I noticed while profiling a tool of mine that calls to get_cigar may have do n allocations based on the number of cigar ops. Looking at the get_cigar function, it already takes the number of ops, so this PR adds a reserve method on Cigar so that get_cigar can make use of it.

I did a little benchmark via criterion that seems to indicate reduced runtime and reduced runtime variance.

image

I put the (slightly overcomplicated 😅) benchmark code in a separate branch here: https://github.com/tshauck/noodles/tree/add-reserve-to-cigar-decode-benches/target/criterion (with some changes to noodles-bam too).

zaeleus commented 11 months ago

While this performance difference can be observed in how your benchmarks are defined, i.e., creating a new buffer for each iteration; in practice, there shouldn't be a difference with the current BAM decoder. The CIGAR buffer in a sam::alignment::Record is reused for each record, and Vec::clear does not affect its internal capacity. (Also note that Vec::reserve is for an additional number of entries, not for the expected count unless the capacity is already 0.)

I noticed while profiling a tool of mine that calls to get_cigar may have do n allocations based on the number of cigar ops.

Can you share this report? When a Vec is empty, the first push allocates for a minimum of 4 items, and when at capacity, the size of the Vec doubles. E.g., there will be an allocation at item 1 (capacity 4), 5 (capacity 8), 9 (capacity 16), etc.

tshauck commented 11 months ago

Thanks for the reply. I should've added a little more color to my use case. I'll have a closer look tonight and follow up, but I think the lack of reuse is due to me working with a lazy BAM record, then calling let cigar: Cigar = record.cigar().try_into()?; to convert the lazy cigar into a regular one, and I think TryFrom does create a new Cigar.

https://github.com/zaeleus/noodles/blob/38836a42b67b9c68b1c106802296119dc5473767/noodles-bam/src/lazy/record/cigar.rs#L48-L62

Edit: If reserve is to be included, perhaps moving it into try_from might make more sense?

zaeleus commented 11 months ago

Ah, I see. Since this only affects the conversion use case, can we use the ops iterator instead of the decoder in the implementation of lazy::record::Cigar::try_from, i.e., call self.iter().collect()? The iterator has the correct size hint set, which preallocates the resulting Vec.

tshauck commented 11 months ago

Cool, thanks for the feedback. Please let me know if the update is what you had in mind.

zaeleus commented 11 months ago

Thanks! (And sorry for the slow responses; I just got back from vacation.)

tshauck commented 11 months ago

All good -- appreciate the education/patience