Closed — harisankarsadasivan closed this issue 2 years ago
We haven't benchmarked the timings for this in enough detail for me to comment. Like you, I presume the scaling would be sub-linear, but by how much I don't know.
It is a trivial exercise to benchmark this using the client and some reads.
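A minimal timing harness along those lines might look like the sketch below. Note that `basecall_chunk` here is a hypothetical stand-in, not the real pyguppy client API — in practice you would replace it with the actual calls that submit signal to a running Guppy server and wait for the result. The harness only shows how to compare wall-clock time across chunk sizes to estimate the scaling factor.

```python
import time

def basecall_chunk(samples):
    # HYPOTHETICAL placeholder workload; in a real benchmark this would
    # submit `samples` to the Guppy server via the pyguppy client and
    # block until the basecall result comes back.
    return sum(x * x for x in samples)

def benchmark(signal, chunk_sizes):
    # Time one call per chunk size, always taking a prefix of the signal.
    timings = {}
    for n in chunk_sizes:
        chunk = signal[:n]
        t0 = time.perf_counter()
        basecall_chunk(chunk)
        timings[n] = time.perf_counter() - t0
    return timings

if __name__ == "__main__":
    signal = list(range(4000))
    timings = benchmark(signal, [1000, 2000, 3000, 4000])
    for n, t in timings.items():
        print(n, t)
    # If timings[4000] / timings[1000] is well below 4, the scaling is
    # sub-linear in chunk size; close to 4 means roughly linear.
```

With real reads, you would want to average over many calls per chunk size and discard the first call (model load / warm-up) before comparing ratios.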
If my read has 4000 samples and I invoke the pyguppy caller on chunks of size 1000, does that mean the first call on 1000 samples takes the most time? How much time would the subsequent calls on 2000, 3000, and 4000 samples take? Will it scale linearly or sub-linearly? I would assume sub-linear. Or will it take the same time as computing with a single chunk of size 4000? I would like to understand this scaling factor approximately. Please point me to any relevant code or numbers.