swarris / pyPaSWAS

Program for DNA/RNA/protein sequence alignment, read mapping and trimming. Extended python version of PaSWAS, supporting OpenCL and CUDA devices.
MIT License

cuModuleGetFunction failed #6

Closed. r-barnes closed this issue 6 years ago.

r-barnes commented 6 years ago

I'm running PyPaSWAS as follows:

python3 ./implementations/warris2018/pypaswas.py -o zout --loglevel=DEBUG --outputformat=TXT -p aligner --filetype1=fasta --filetype2=fasta -O OVERRIDE_OUTPUT -M DNA-RNA --device=$device --recompile=T --short_sequences=F --framework=CUDA $queryfile $databasefile

I get several error messages:

INFO - Initializing application...
DEBUG - Initializing Score...
DEBUG - Initializing score finished.
DEBUG - Initializing DnaRnaScore...
DEBUG - Creating matrix with parameters:
        match_score: 5,
        mismatch_score: -3,
        gap_score: -5.0,
        other_score: -1,
        any_score: 0
DEBUG - Initializing DnaRnaScore finished.
INFO - Application initialized.
INFO - Setting program...
DEBUG - Initializing aligner...
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
DEBUG - Setting SW...
DEBUG - Using CUDA implementation
DEBUG - Initializing SmithWaterman.
INFO - No gap extension penalty detected: using original PaSWAS scoring algorithm
DEBUG - Going to initialize device... with number 0
DEBUG - Initializing device 0
DEBUG - Aligner initialized.
INFO - Program set.
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
INFO - Reading query sequences 0 1000000...
DEBUG - Initializing reader
    path = /autofs/nccs-svm1_home1/spinyfan/crd-swgpu/data/ant-500.fasta
    limitlength = 100000...
DEBUG - Initializing reader finished.
DEBUG - Reading from fasta file...
DEBUG -     250 sequences read.
DEBUG - Sorting records on length...
INFO - Query sequences OK.
INFO - Reading target sequences 0, 100000000...
DEBUG - Initializing reader
    path = /autofs/nccs-svm1_home1/spinyfan/crd-swgpu/data/ant-500.fasta
    limitlength = 100000...
DEBUG - Initializing reader finished.
DEBUG - Reading from fasta file...
DEBUG -     250 sequences read.
DEBUG - Sorting records on length...
INFO - Target sequences OK.
INFO - Processing 250- vs 500-sequences
DEBUG - Aligner processing...
DEBUG - At target: 0 of 500
DEBUG - Total memory on Device: 15983.75
DEBUG - Total memory on Device: 15983.75
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
DEBUG - Clearing device memory.
DEBUG - Total memory on Device: 15983.75
DEBUG - Compiling cuda code.
DEBUG - Converting score to string...
DEBUG - Allocated: 15215.931701660156MB of memory
DEBUG - At sequence: 0 of 250, length = 28
DEBUG - Calculating scores.

WARNING - Warning: cuCtxSynchronize failed: an illegal memory access was encountered
Continuing calculation...
WARNING - Warning: cuModuleGetFunction failed: an illegal memory access was encountered
Continuing calculation...
[In total, there are 456 cuModuleGetFunction warnings...]
DEBUG - Performing back trace.
ERROR - Something went wrong during traceback: cuModuleGetFunction failed: an illegal memory access was encountered...
ERROR - cuModuleGetFunction failed: an illegal memory access was encountered
Traceback (most recent call last):
  File "./implementations/warris2018/pypaswas.py", line 11, in <module>
    ppw.run()
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/pypaswasall.py", line 235, in run
    results.extend(self.program.process(query_sequences, target_sequences, self))
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/Programs.py", line 82, in process
    results = self.smith_waterman.align_sequences(records_seqs, targets, target_index)
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWaterman.py", line 534, in align_sequences
    self._traceback_host()
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWaterman.py", line 640, in _traceback_host
    self._execute_traceback_kernel(number_of_blocks, idx, idy)
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWatermanCuda.py", line 266, in _execute_traceback_kernel
    raise exception
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWatermanCuda.py", line 249, in _execute_traceback_kernel
    traceback_function = self.module.get_function("traceback")
  File "/ccs/home/spinyfan/os/anaconda3/lib/python3.5/site-packages/pycuda/compiler.py", line 278, in get_function
    return self.module.get_function(name)
pycuda._driver.LogicError: cuModuleGetFunction failed: an illegal memory access was encountered
ERROR - Something went wrong during traceback: cuModuleGetFunction failed: an illegal memory access was encountered...
Program ended. The message was:  cuModuleGetFunction failed: an illegal memory access was encountered
Please use the option --help for information on command line arguments.
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
[Several more of the cuMemFree warnings...]

I'm running with CUDA 9.0.69, Anaconda Python 3.5.5, numpy 1.14.3, biopython 1.71. The system has a Tesla P100 and 15215.931701660156MB of 15983.75MB are being allocated (maybe there's an issue with trying to allocate a millionth of a byte? ;-) )

Any thoughts as to what might be going on?

r-barnes commented 6 years ago

This line

INFO - Processing 250- vs 500-sequences

also seems odd: both input arguments point to the same file, which should lead to a 250 vs. 250 comparison.

r-barnes commented 6 years ago

Switching the end of _execute_calculate_score_kernel in SmithWatermanCuda.py to read:

        except Exception as exception:
            self.logger.error('Warning: {0}\nContinuing calculation...'.format(exception))
            raise exception

gives:

DEBUG - Compiling cuda code.
DEBUG - Converting score to string...
DEBUG - Allocated: 15215.931701660156MB of memory
DEBUG - At sequence: 0 of 250, length = 28
DEBUG - Calculating scores.
ERROR - Warning: cuCtxSynchronize failed: an illegal memory access was encountered
Continuing calculation...
ERROR - cuCtxSynchronize failed: an illegal memory access was encountered
Traceback (most recent call last):
  File "./implementations/warris2018/pypaswas.py", line 11, in <module>
    ppw.run()
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/pypaswasall.py", line 235, in run
    results.extend(self.program.process(query_sequences, target_sequences, self))
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/Programs.py", line 82, in process
    results = self.smith_waterman.align_sequences(records_seqs, targets, target_index)
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWaterman.py", line 530, in align_sequences
    self._calculate_score()
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWaterman.py", line 609, in _calculate_score
    self._execute_calculate_score_kernel(number_of_blocks, idx, idy)
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWatermanCuda.py", line 226, in _execute_calculate_score_kernel
    raise exception
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWatermanCuda.py", line 222, in _execute_calculate_score_kernel
    driver.Context.synchronize()  #@UndefinedVariable @IgnorePep8
pycuda._driver.LogicError: cuCtxSynchronize failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFreeHost failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFreeHost failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered

swarris commented 6 years ago

Thanks for the information! I have never seen this problem before.

But first the easy part:

INFO - Processing 250- vs 500-sequences

With DNA sequences, the target sequences are automatically aligned in reverse complement as well, which doubles the 250 targets to 500. This is done in pyPaSWAS/pypaswasall.py, lines 117/118, if you'd like to disable it. At some point I need to add a command-line switch for this.
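To illustrate why 250 target records show up as 500, here is a minimal sketch of adding a reverse complement for every target. The function names are hypothetical and this is not the code from pypaswasall.py; it only shows the doubling effect described above.

```python
# Illustrative only: hypothetical helpers, not taken from pyPaSWAS.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def with_reverse_complements(targets):
    """Return each target followed by its reverse complement."""
    expanded = []
    for seq in targets:
        expanded.append(seq)
        expanded.append(reverse_complement(seq))
    return expanded


targets = ["ATGC", "GGCA", "TTAC"]
expanded = with_reverse_complements(targets)
print(len(targets), "targets become", len(expanded))
```

With 250 input records, the same expansion yields the 500 targets reported in the log.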

The memory issue is new. The first thing that stands out is the amount of memory being allocated: ~15 GB. The cards I have used never exceeded 4 GB, although I did use ~64 GB on an Intel CPU system. I'm not sure all pointers in the CUDA code are 64-bit... Could you check two things for me?

  1. Does --framework=opencl work? It will most likely also fail, but you never know...
  2. Does --maximum_memory_usage=0.1 help? (This makes sure pyPaSWAS does not allocate more than 10% of memory.)

Thanks, Sven
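The 64-bit concern can be made concrete with a small sketch. It assumes the "MB" in the log means 2^20 bytes and that some offset in the kernel is held in a signed 32-bit integer. Both are assumptions for illustration, not something confirmed about the pyPaSWAS kernels.

```python
import ctypes

INT32_MAX = 2**31 - 1
MIB = 1024 * 1024  # assumption: the log's "MB" is 2**20 bytes


def as_int32(value):
    """Truncate a Python int to a signed 32-bit value, as a C int would store it."""
    return ctypes.c_int32(value).value


# Allocation size reported in the first log.
allocated_bytes = int(15215.931701660156 * MIB)

print("exceeds signed 32-bit range:", allocated_bytes > INT32_MAX)
print("byte offset after 32-bit truncation:", as_int32(allocated_bytes))
```

If any offset of this size were stored in a 32-bit variable, it would wrap to an unrelated address, which is one way an illegal memory access could arise.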

r-barnes commented 6 years ago

Thanks @swarris!

I tried using --maximum_memory_usage=0.1, but didn't see a change in behaviour.

python3 ./implementations/warris2018/pypaswas.py -o zout --loglevel=DEBUG --outputformat=TXT -p aligner --filetype1=fasta --filetype2=fasta -O OVERRIDE_OUTPUT -M DNA-RNA --device=$device --recompile=T --short_sequences=F --framework=CUDA --maximum_memory_usage=0.1 $queryfile $databasefile
INFO - Initializing application...
DEBUG - Initializing Score...
DEBUG - Initializing score finished.
DEBUG - Initializing DnaRnaScore...
DEBUG - Creating matrix with parameters:
        match_score: 5,
        mismatch_score: -3,
        gap_score: -5.0,
        other_score: -1,
        any_score: 0
DEBUG - Initializing DnaRnaScore finished.
INFO - Application initialized.
INFO - Setting program...
DEBUG - Initializing aligner...
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
DEBUG - Setting SW...
DEBUG - Using CUDA implementation
DEBUG - Initializing SmithWaterman.
INFO - No gap extension penalty detected: using original PaSWAS scoring algorithm
DEBUG - Going to initialize device... with number 0
DEBUG - Initializing device 0
DEBUG - Aligner initialized.
INFO - Program set.
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
INFO - Reading query sequences 0 1000000...
DEBUG - Initializing reader
    path = /autofs/nccs-svm1_home1/spinyfan/crd-swgpu/data/ant-500.fasta
    limitlength = 100000...
DEBUG - Initializing reader finished.
DEBUG - Reading from fasta file...
DEBUG -     250 sequences read.
DEBUG - Sorting records on length...
INFO - Query sequences OK.
INFO - Reading target sequences 0, 100000000...
DEBUG - Initializing reader
    path = /autofs/nccs-svm1_home1/spinyfan/crd-swgpu/data/ant-500.fasta
    limitlength = 100000...
DEBUG - Initializing reader finished.
DEBUG - Reading from fasta file...
DEBUG -     250 sequences read.
DEBUG - Sorting records on length...
INFO - Target sequences OK.
INFO - Processing 250- vs 500-sequences
DEBUG - Aligner processing...
DEBUG - At target: 0 of 500
DEBUG - Total memory on Device: 15983.75
DEBUG - Total memory on Device: 15983.75
DEBUG - Initializing hitlist...
DEBUG - Initializing hitlist OK.
DEBUG - Clearing device memory.
DEBUG - Total memory on Device: 15983.75
DEBUG - Compiling cuda code.
DEBUG - Converting score to string...
DEBUG - Allocated: 1746.7488174438477MB of memory
DEBUG - At sequence: 0 of 250, length = 10
DEBUG - Calculating scores.
ERROR - Warning: cuCtxSynchronize failed: an illegal memory access was encountered
Continuing calculation...
ERROR - cuCtxSynchronize failed: an illegal memory access was encountered
Traceback (most recent call last):
  File "./implementations/warris2018/pypaswas.py", line 11, in <module>
    ppw.run()
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/pypaswasall.py", line 235, in run
    results.extend(self.program.process(query_sequences, target_sequences, self))
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/Programs.py", line 82, in process
    results = self.smith_waterman.align_sequences(records_seqs, targets, target_index)
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWaterman.py", line 530, in align_sequences
    self._calculate_score()
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWaterman.py", line 609, in _calculate_score
    self._execute_calculate_score_kernel(number_of_blocks, idx, idy)
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWatermanCuda.py", line 226, in _execute_calculate_score_kernel
    raise exception
  File "/autofs/nccs-svm1_home1/spinyfan/crd-swgpu/implementations/warris2018/pyPaSWAS/Core/SmithWatermanCuda.py", line 222, in _execute_calculate_score_kernel
    driver.Context.synchronize()  #@UndefinedVariable @IgnorePep8
pycuda._driver.LogicError: cuCtxSynchronize failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFreeHost failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFreeHost failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuModuleUnload failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered

Working on trying the OpenCL framework now.
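A quick check of the two "Allocated" figures against the reported device total shows whether the option took effect. This is plain arithmetic on the numbers in the logs; the function name is only for illustration.

```python
TOTAL_MB = 15983.75  # "Total memory on Device" from the logs

# "Allocated" values from the two runs.
runs = {
    "default": 15215.931701660156,
    "--maximum_memory_usage=0.1": 1746.7488174438477,
}


def fraction_of_device(allocated_mb, total_mb=TOTAL_MB):
    """Return the share of device memory that was allocated."""
    return allocated_mb / total_mb


for label, allocated_mb in runs.items():
    share = fraction_of_device(allocated_mb)
    print("{0}: {1:.1%} of device memory, under 2048 MB: {2}".format(
        label, share, allocated_mb < 2048))
```

The allocation drops from about 95% to about 11% of the device, so the option was honoured. Because the error persists at the smaller size, allocation size alone may not explain the failure.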

r-barnes commented 6 years ago

It doesn't appear as though OpenCL is available on my machine :-/

swarris commented 6 years ago

> It doesn't appear as though OpenCL is available on my machine :-/

That is odd. For NVIDIA GPUs, OpenCL should be available by default: https://developer.nvidia.com/opencl Have you tried running any other CUDA or OpenCL programs? Does the example from pyCUDA run, for instance?

I'll try to update my system to CUDA 9.2.
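One way to check whether OpenCL is visible from Python is to list platforms and devices with pyOpenCL. This is a minimal sketch that assumes the pyopencl package may or may not be installed. The function name is hypothetical, and it returns an empty list if the package or an OpenCL driver is missing.

```python
def list_opencl_devices():
    """Return (platform name, device name) pairs, or [] if OpenCL is unavailable."""
    try:
        import pyopencl as cl  # third-party package, may be absent
    except ImportError:
        return []
    found = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                found.append((platform.name, device.name))
    except Exception:
        # No platform or device could be queried, e.g. no OpenCL driver.
        return []
    return found


devices = list_opencl_devices()
if devices:
    for platform_name, device_name in devices:
        print(platform_name, "/", device_name)
else:
    print("No OpenCL platform or device found from Python")
```

An empty result here would point to a missing driver or package rather than to a problem in pyPaSWAS itself.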

r-barnes commented 6 years ago

The pyCUDA example runs without issue. I'm inquiring about OpenCL with the sysadmins.

swarris commented 6 years ago

My VM is connected to a GRID K2, but this GPU is not supported by CUDA 9, so I had to revert to CUDA 7.5. I'll try to find another machine to test it on.

swarris commented 6 years ago

Could you share your data with me so I can test it better?

Thanks!

swarris commented 6 years ago

On a fresh install of Ubuntu 18.04 with CUDA 9.1, a GeForce GTX 650, and pyCUDA/pyOpenCL, I can use the data in the 'data' folder without a glitch.

swarris commented 6 years ago

Has it been resolved?