sstadick / perbase

Per-base per-nucleotide depth analysis
MIT License

parameters for sparse data #63

Open brentp opened 1 year ago

brentp commented 1 year ago

Hi Seth, do you have any recommendations for par_granges values for sparse data? I have some regions with very deep coverage that span only about 500 kb of the genome, so it seems that much of the run time is spent with low CPU usage. Thanks!

sstadick commented 1 year ago

Nothing off the top of my head. I'd guess, though, that a larger chunksize would be better for sparse data if you aren't using an intervals file of some sort to restrict the regions. My confidence in that guess is low. Looking at the code, all you want to minimize is calls to process_regions that have no reads in that region, so a larger chunksize should decrease the number of misses on that front.
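
To make the reasoning concrete, here is a rough toy model in Python. It is not perbase code and does not reflect how par_granges actually schedules work. The function name `count_empty_chunks`, the genome length, and the covered interval are all made up for illustration. It only counts how many fixed-size chunks would contain no reads at two chunk sizes.

```python
def count_empty_chunks(genome_len, chunksize, covered):
    """Count fixed-size chunks that overlap none of the covered intervals.

    Hypothetical helper for illustration only, not part of perbase.
    `covered` is a list of half-open (start, end) intervals that contain reads.
    Returns (empty_chunks, total_chunks).
    """
    empty = 0
    total = 0
    for start in range(0, genome_len, chunksize):
        end = min(start + chunksize, genome_len)
        total += 1
        # Two half-open intervals overlap when each starts before the other ends.
        if not any(s < end and start < e for s, e in covered):
            empty += 1
    return empty, total


# Assumed numbers: a 100 Mb contig with one deeply covered 500 kb region.
covered = [(40_000_000, 40_500_000)]
print(count_empty_chunks(100_000_000, 1_000_000, covered))   # (99, 100)
print(count_empty_chunks(100_000_000, 10_000_000, covered))  # (9, 10)
```

In this model, going from 1 Mb to 10 Mb chunks cuts the number of empty chunks from 99 to 9. The trade-off is that the one chunk holding the deep region becomes larger, so whether this helps in practice would need to be measured on real data.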