swarris / Pacasus

Correction of palindromes in long reads from PacBio and Nanopore
MIT License

Warning: cuModuleGetFunction failed: an illegal memory access was encountered #7

Closed: peterdfields closed this issue 5 years ago

peterdfields commented 5 years ago

Hi,

I'm trying to run Pacasus on an openSUSE machine with a Tesla K40c GPU (CUDA 9.2). The machine seems capable of running the CPU version of the algorithm, but I cannot get things to work with the NVIDIA GPU. I have attempted to follow the troubleshooting described here for pyPaSWAS, but these suggestions (e.g. changing to OpenCL or limiting memory use) don't change the behavior. I see that the aforementioned issue went quiet, and I wondered whether you could suggest anything else I might try to track down what's going wrong. You can see a full log of a run at the following gist: https://gist.github.com/peterdfields/2da6255e165e7c9194182a054834d23f

Thank you for your time and advice.

swarris commented 5 years ago

Hi Peter,

I indeed did not get any additional information on that issue, so I closed it. Did you manage to get the test data running with pyPaSWAS, that is, just basic sequence alignment? Could you send me the first 50 reads in your data set? Maybe I can recreate the problem on my systems.
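For pulling out such a subset, a minimal sketch follows. It assumes the reads are in plain FASTA format (the thread does not say which format is used), and the names `read_fasta` and `first_reads` as well as the file name in the usage comment are hypothetical.

```python
from itertools import islice


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def first_reads(lines, n=50):
    """Return the first n records as a list of (header, sequence) pairs."""
    return list(islice(read_fasta(lines), n))


# Usage (hypothetical file name):
# with open("reads.fasta") as handle:
#     subset = first_reads(handle, n=50)
```

Because `read_fasta` is a generator, only the first `n` records are parsed, so this stays cheap even on a large read set.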

Cheers, Sven

swarris commented 5 years ago

It could very well be that some of the sequences are too long to be processed on the GPU. The underlying Smith-Waterman algorithm requires N×N matrices, which limits GPU use to reads of up to 20-25 kb (depending on the available memory). OpenCL on the CPU can go up to 64GB of memory, so reads longer than 25 kb can be processed on the CPU (up to 50 kb or so). You can test this by either removing these long reads from the data set or using --limit_length=20000 on the command line.
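To make the quadratic growth concrete, here is a rough sketch. The 4 bytes per cell and the single-matrix assumption are illustrative only; the thread does not state how many matrices pyPaSWAS allocates or how large each cell is. The helper names `matrix_bytes` and `split_by_length` are hypothetical and show one way to set the long reads aside before a GPU run.

```python
def matrix_bytes(read_length, bytes_per_cell=4):
    """Bytes needed for one read_length x read_length matrix.

    bytes_per_cell=4 is an assumption for illustration, not a pyPaSWAS value.
    """
    return read_length * read_length * bytes_per_cell


def split_by_length(reads, limit=20000):
    """Split (name, sequence) pairs into reads within the limit and reads above it."""
    within, above = [], []
    for name, seq in reads:
        (within if len(seq) <= limit else above).append((name, seq))
    return within, above


# Doubling the read length quadruples the memory for a single matrix.
for length in (20000, 25000, 50000):
    print(length, matrix_bytes(length) / 1e9, "GB per matrix (assumed 4-byte cells)")
```

Under these assumptions a 20 kb read needs 1.6 GB for one matrix and a 50 kb read needs 10 GB, which illustrates why a modest increase in read length can exhaust GPU memory.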

peterdfields commented 5 years ago

Hi @swarris

Thank you for your reply. It does seem like the addition of the --limit_length=20000 flag allows the run to continue. However, after about 20k sequences I got an error that I'm not really sure how to deal with: ('Program ended. The message was: ', '') Please use the option --help for information on command line arguments.

Does this sort of message seem familiar to you?

swarris commented 5 years ago

Hi,

pyPaSWAS now reports that the read length is the most likely problem, and Pacasus will now also stop processing when this happens.

I'm not sure what the current message means. The exception appears to be empty? I changed Pacasus to print the exception when this happens. Could you pull the repository and try again?
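As a general Python illustration of how an empty message like the one quoted earlier in the thread can arise: an exception raised without arguments has an empty string representation, so printing only `str(exc)` shows nothing. This is a guess at the mechanism, not a description of what Pacasus actually does, and the `describe` helper is hypothetical.

```python
def describe(exc):
    """Return a message that stays informative even when str(exc) is empty."""
    text = str(exc)
    name = type(exc).__name__
    return f"{name}: {text}" if text else f"{name} (no message)"


try:
    raise MemoryError()  # raised without arguments, so its message is empty
except MemoryError as exc:
    plain = str(exc)          # empty string
    detailed = describe(exc)  # includes the exception type

print(repr(plain))
print(detailed)
```

Including the exception type (or a full traceback via the standard `traceback` module) keeps the failure diagnosable even when the exception carries no text.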

peterdfields commented 5 years ago

Hi @swarris,

I pulled the updated repository and re-ran the command. This time the error doesn't seem to occur, as the run has now made it to ~40k sequences. So, at least for now, there doesn't seem to be a problem. I will close this issue now. Thank you again for your help.