mourisl / T1K

T1K is a versatile method to genotype highly polymorphic genes (e.g. KIR, HLA) with bulk or single-cell RNA-seq, WGS, or WES data.
MIT License

Larger initial SimpleVector to reduce realloc calls #1

lh3 closed 2 years ago

lh3 commented 2 years ago

SimpleVector was growing from size 1, so it needed several realloc calls even for small vectors. The profiler suggested that realloc took a significant share of the running time.

With this PR, SimpleVector grows from size 16. This reduces the running time of fastq-extractor from 4h12m to 3h on my data. There is probably more room for improvement, but that will be more complex.
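
To illustrate the effect described above, here is a minimal sketch of a realloc-backed vector with a configurable starting capacity. It is not T1K's actual SimpleVector code: the class `GrowableVector`, the helper `CountReallocs`, and the capacity-doubling growth policy are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a SimpleVector-like container (not T1K's code).
// Only suitable for trivially copyable element types, because realloc
// moves raw bytes without running constructors.
template <typename T>
class GrowableVector {
public:
    explicit GrowableVector(std::size_t initialCapacity = 16)
        : data_(nullptr), size_(0), capacity_(0),
          initialCapacity_(initialCapacity > 0 ? initialCapacity : 1),
          reallocCalls_(0) {}

    ~GrowableVector() { std::free(data_); }

    // Copying is disabled to keep ownership of the buffer simple.
    GrowableVector(const GrowableVector &) = delete;
    GrowableVector &operator=(const GrowableVector &) = delete;

    void PushBack(const T &value) {
        if (size_ == capacity_) {
            // Assumed policy: first allocation uses initialCapacity_,
            // later ones double the capacity.
            std::size_t newCapacity =
                capacity_ == 0 ? initialCapacity_ : capacity_ * 2;
            void *p = std::realloc(data_, newCapacity * sizeof(T));
            if (p == nullptr) std::abort();  // out of memory
            data_ = static_cast<T *>(p);
            capacity_ = newCapacity;
            ++reallocCalls_;
        }
        data_[size_++] = value;
    }

    const T &Get(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    std::size_t Size() const { return size_; }
    std::size_t ReallocCalls() const { return reallocCalls_; }

private:
    T *data_;
    std::size_t size_;
    std::size_t capacity_;
    std::size_t initialCapacity_;
    std::size_t reallocCalls_;
};

// Counts how many realloc calls are needed to push n ints
// when the vector starts from the given capacity.
inline std::size_t CountReallocs(std::size_t initialCapacity, std::size_t n) {
    GrowableVector<int> v(initialCapacity);
    for (std::size_t i = 0; i < n; ++i) v.PushBack(static_cast<int>(i));
    return v.ReallocCalls();
}
```

Under these assumptions, pushing 16 elements costs five realloc calls when starting from capacity 1 and one call when starting from capacity 16. The saving per vector is small, so it matters mainly when a program creates very many short vectors.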

mourisl commented 2 years ago

Hi Heng, thank you for locating the issue. I think we can just change the "inc" in the constructor function for this. I will update other functions to only use "realloc".