mgawan / GPU-BSW

Does not work with one input sequence pair #14

Closed armintoepfer closed 3 years ago

armintoepfer commented 3 years ago

Hi!

If I use exactly one reference and one query sequence, it doesn't work.

head -n 2 ../test-data/dna-query.fasta > q.fasta
head -n 2 ../test-data/dna-reference.fasta > r.fasta
./program_gpu r.fasta q.fasta out.file21
GPUassert: invalid configuration argument ../driver.cpp

Thanks, Armin

mgawan commented 3 years ago

Hi Armin, the implementation uses two CUDA streams to overlap data transfers with computation. With only a single alignment that is not possible, hence the error. Two alignments should work fine. As you will appreciate, running a single alignment on a GPU is not an efficient use of the device, so I did not consider this case while implementing. However, if this is necessary for your use case (a CPU would be much faster here), I can change the code to work with a single alignment. Let me know.
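
For readers hitting the same error: CUDA reports "invalid configuration argument" when a kernel is launched with a grid or block dimension of zero. One plausible way to get there, assuming the batch is divided evenly between the two streams (an assumption, not confirmed against driver.cpp), is that a single alignment leaves one stream with nothing to do. The sketch below uses hypothetical helpers, `split_across_streams` and `streams_with_work`, which are not part of GPU-BSW, to show the split and a guard that skips empty streams.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from GPU-BSW): divide `total` alignments as evenly
// as possible across `n_streams` streams. Returns one count per stream.
std::vector<std::size_t> split_across_streams(std::size_t total,
                                              std::size_t n_streams) {
    if (n_streams == 0) return {};
    std::vector<std::size_t> counts(n_streams, total / n_streams);
    // Hand out the remainder one alignment at a time to the first streams.
    for (std::size_t i = 0; i < total % n_streams; ++i) ++counts[i];
    return counts;
}

// Hypothetical guard: indices of streams that actually have work.
// A kernel would be launched only for these, because a launch with a
// grid size of 0 is rejected with "invalid configuration argument".
std::vector<std::size_t> streams_with_work(
        const std::vector<std::size_t>& counts) {
    std::vector<std::size_t> active;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        if (counts[i] > 0) active.push_back(i);
    }
    return active;
}
```

With one alignment and two streams, `split_across_streams(1, 2)` yields counts of 1 and 0, so under this assumption only the first stream would launch a kernel.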

armintoepfer commented 3 years ago

It's just an edge case that simply doesn't work. I understand that you want to hide latency, but that can also be achieved in a layer above your API by running multiple instances of your aligner simultaneously.

mgawan commented 3 years ago

Maybe I am not understanding your use case completely, but my idea was to process alignments in batches. For example, it takes about 20,000 to 30,000 alignments to saturate a V100 device. If we launched them separately (one alignment per API call), this would lead to 20,000 calls for memory allocations, memory copies, and kernel launches, which adds a lot of overhead. The only case I found for launching a single alignment was debugging. We integrated this kernel into our metagenomics software pipeline and do something similar to what you suggested (multiple instances of the kernel running on the same GPU), but each instance still launches a batch of alignments. Here is the software I am talking about: https://bitbucket.org/berkeleylab/mhm2/wiki/Home
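
To make the batching argument concrete, here is a minimal sketch of how the number of API calls scales with batch size. It assumes each call incurs its own allocation, copy, and kernel-launch overhead, as described above. The function name `calls_needed` is hypothetical and not part of GPU-BSW.

```cpp
#include <cstddef>

// Hypothetical helper: number of API calls needed to process `n_alignments`
// when each call handles up to `batch_size` alignments (ceiling division).
// Returns 0 for an invalid batch size of 0.
std::size_t calls_needed(std::size_t n_alignments, std::size_t batch_size) {
    if (batch_size == 0) return 0;
    return (n_alignments + batch_size - 1) / batch_size;
}
```

For 20,000 alignments, `calls_needed(20000, 1)` is 20,000 calls, while `calls_needed(20000, 20000)` is a single call, so the fixed per-call overhead is paid once instead of 20,000 times.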