dorado basecaller on linux with Tesla K20Xm

rica01 commented 11 months ago

Hello everyone.

I have a system with 4 Tesla K20Xm GPUs, driver version 470, and CUDA 11.4. I am trying to run a test basecall, but I get the following error:

12:59:12 || ~/tmp

[ricardo@bart]$ /opt/dorado-0.4.3-linux-x64/bin/dorado basecaller dna_r10.4.1_e8.2_400bps_hac@v4.1.0/ ./yeast/ > calls.bam

[2023-12-04 12:59:16.201] [info] > Creating basecall pipeline

[2023-12-04 12:59:22.446] [error] CUDA error: no kernel image is available for execution on the device

CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at /pytorch/pyold/c10/cuda/CUDAException.cpp:44 (most recent call first):

frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7fb88e782a77 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7fb887d0712b in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7fb88e74ca18 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #3: void at::native::gpu_kernel_impl<at::native::FillFunctor<c10::Half> >(at::TensorIteratorBase&, at::native::FillFunctor<c10::Half> const&) + 0x9b1 (0x7fb88cf6ded1 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #4: void at::native::gpu_kernel<at::native::FillFunctor<c10::Half> >(at::TensorIteratorBase&, at::native::FillFunctor<c10::Half> const&) + 0x33b (0x7fb88cf6e6fb in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #5: <unknown function> + 0x9216e95 (0x7fb88cf60e95 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #6: at::native::fill_kernel_cuda(at::TensorIterator&, c10::Scalar const&) + 0x20 (0x7fb88cf61fc0 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #7: <unknown function> + 0x49823c3 (0x7fb8886cc3c3 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #8: <unknown function> + 0xa61c573 (0x7fb88e366573 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #9: at::_ops::fill__Scalar::call(at::Tensor&, c10::Scalar const&) + 0x12c (0x7fb888e1d94c in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #10: at::native::zero_(at::Tensor&) + 0xa7 (0x7fb8886cca87 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #11: <unknown function> + 0xa61b8cd (0x7fb88e3658cd in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #12: at::_ops::zero_::call(at::Tensor&) + 0x129 (0x7fb88925a4b9 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #13: at::native::zeros_symint(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x160 (0x7fb8889769e0 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #14: <unknown function> + 0x588d665 (0x7fb8895d7665 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #15: at::_ops::zeros::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0xd5 (0x7fb888dd6735 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #16: <unknown function> + 0x56c4855 (0x7fb88940e855 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #17: at::_ops::zeros::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x1b1 (0x7fb888e312c1 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #18: at::native::cudnn_rnn::copy_weights_to_flat_buf_views(c10::ArrayRef<at::Tensor>, long, long, long, long, long, long, bool, bool, cudnnDataType_t, c10::TensorOptions const&, bool, bool, bool) + 0x3d0 (0x7fb88c734890 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #19: at::native::_cudnn_rnn_flatten_weight(c10::ArrayRef<at::Tensor>, long, long, long, long, long, long, bool, bool) + 0x90 (0x7fb88c7354d0 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #20: <unknown function> + 0xa6310a9 (0x7fb88e37b0a9 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #21: <unknown function> + 0xa6670cf (0x7fb88e3b10cf in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #22: <unknown function> + 0x52c4fe4 (0x7fb88900efe4 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #23: at::_ops::_cudnn_rnn_flatten_weight::call(c10::ArrayRef<at::Tensor>, long, c10::SymInt, long, c10::SymInt, c10::SymInt, long, bool, bool) + 0x386 (0x7fb888f81556 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #24: <unknown function> + 0x80b3cc6 (0x7fb88bdfdcc6 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #25: torch::nn::detail::RNNImplBase<torch::nn::LSTMImpl>::flatten_parameters() + 0x346 (0x7fb88be06de6 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #26: void torch::nn::Module::to_impl<c10::Device&, bool&>(c10::Device&, bool&) + 0xd0 (0x7fb88bd300f0 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #27: torch::nn::Module::to(c10::Device, bool) + 0x1c (0x7fb88bd292dc in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #28: void torch::nn::Module::to_impl<c10::Device&, bool&>(c10::Device&, bool&) + 0xd0 (0x7fb88bd300f0 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #29: torch::nn::Module::to(c10::Device, bool) + 0x1c (0x7fb88bd292dc in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #30: /opt/dorado-0.4.3-linux-x64/bin/dorado() [0x9002d6]

frame #31: /opt/dorado-0.4.3-linux-x64/bin/dorado() [0x8fd787]

frame #32: /opt/dorado-0.4.3-linux-x64/bin/dorado() [0x8f3ef6]

frame #33: /opt/dorado-0.4.3-linux-x64/bin/dorado() [0x87ea8b]

frame #34: /opt/dorado-0.4.3-linux-x64/bin/dorado() [0x87eb4b]

frame #35: <unknown function> + 0x99f68 (0x7fb882e45f68 in /lib/x86_64-linux-gnu/libc.so.6)

frame #36: /opt/dorado-0.4.3-linux-x64/bin/dorado() [0x87f18f]

frame #37: /opt/dorado-0.4.3-linux-x64/bin/dorado() [0x882fd0]

frame #38: <unknown function> + 0x1196e440 (0x7fb8956b8440 in /opt/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #39: <unknown function> + 0x94b43 (0x7fb882e40b43 in /lib/x86_64-linux-gnu/libc.so.6)

frame #40: <unknown function> + 0x126a00 (0x7fb882ed2a00 in /lib/x86_64-linux-gnu/libc.so.6)

I've been fighting with this for a while and cannot find a solution. Would anyone be able to help with this?

Thank you. -Ricardo

ritma001 commented 11 months ago

I get the same error on a Tesla M60 (Maxwell architecture), driver version 530.30.02, CUDA 12.1:

$ ~/dorado-0.4.3-linux-x64/bin/dorado basecaller ~/dorado-0.4.3-linux-x64/model/dna_r9.4.1_e8_sup@v3.6 ~/nanopore/e188458/read.pod5 --kit-name SQK-RBK004 --min-qscore 8

[2023-12-04 11:10:15.326] [info] > Creating basecall pipeline

[2023-12-04 11:10:19.265] [error] CUDA error: no kernel image is available for execution on the device

CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.

For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at /pytorch/pyold/c10/cuda/CUDAException.cpp:44 (most recent call first):

frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f1a6c92fa77 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f1a65eb412b in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f1a6c8f9a18 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #3: void at::native::gpu_kernel_impl<at::native::FillFunctor<c10::Half> >(at::TensorIteratorBase&, at::native::FillFunctor<c10::Half> const&) + 0x9b1 (0x7f1a6b11aed1 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #4: void at::native::gpu_kernel<at::native::FillFunctor<c10::Half> >(at::TensorIteratorBase&, at::native::FillFunctor<c10::Half> const&) + 0x33b (0x7f1a6b11b6fb in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #5: <unknown function> + 0x9216e95 (0x7f1a6b10de95 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #6: at::native::fill_kernel_cuda(at::TensorIterator&, c10::Scalar const&) + 0x20 (0x7f1a6b10efc0 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #7: <unknown function> + 0x49823c3 (0x7f1a668793c3 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #8: <unknown function> + 0xa61c573 (0x7f1a6c513573 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #9: at::_ops::fill__Scalar::call(at::Tensor&, c10::Scalar const&) + 0x12c (0x7f1a66fca94c in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #10: at::native::zero_(at::Tensor&) + 0xa7 (0x7f1a66879a87 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #11: <unknown function> + 0xa61b8cd (0x7f1a6c5128cd in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #12: at::_ops::zero_::call(at::Tensor&) + 0x129 (0x7f1a674074b9 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #13: at::native::zeros_symint(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x160 (0x7f1a66b239e0 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #14: <unknown function> + 0x588d665 (0x7f1a67784665 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #15: at::_ops::zeros::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0xd5 (0x7f1a66f83735 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #16: <unknown function> + 0x56c4855 (0x7f1a675bb855 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #17: at::_ops::zeros::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x1b1 (0x7f1a66fde2c1 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #18: at::native::cudnn_rnn::copy_weights_to_flat_buf_views(c10::ArrayRef<at::Tensor>, long, long, long, long, long, long, bool, bool, cudnnDataType_t, c10::TensorOptions const&, bool, bool, bool) + 0x3d0 (0x7f1a6a8e1890 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #19: at::native::_cudnn_rnn_flatten_weight(c10::ArrayRef<at::Tensor>, long, long, long, long, long, long, bool, bool) + 0x90 (0x7f1a6a8e24d0 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #20: <unknown function> + 0xa6310a9 (0x7f1a6c5280a9 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #21: <unknown function> + 0xa6670cf (0x7f1a6c55e0cf in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #22: <unknown function> + 0x52c4fe4 (0x7f1a671bbfe4 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #23: at::_ops::_cudnn_rnn_flatten_weight::call(c10::ArrayRef<at::Tensor>, long, c10::SymInt, long, c10::SymInt, c10::SymInt, long, bool, bool) + 0x386 (0x7f1a6712e556 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #24: <unknown function> + 0x80b3cc6 (0x7f1a69faacc6 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #25: torch::nn::detail::RNNImplBase<torch::nn::LSTMImpl>::flatten_parameters() + 0x346 (0x7f1a69fb3de6 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #26: void torch::nn::Module::to_impl<c10::Device&, bool&>(c10::Device&, bool&) + 0xd0 (0x7f1a69edd0f0 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #27: torch::nn::Module::to(c10::Device, bool) + 0x1c (0x7f1a69ed62dc in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #28: void torch::nn::Module::to_impl<c10::Device&, bool&>(c10::Device&, bool&) + 0xd0 (0x7f1a69edd0f0 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #29: torch::nn::Module::to(c10::Device, bool) + 0x1c (0x7f1a69ed62dc in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #30: /home/writmaha/dorado-0.4.3-linux-x64/bin/dorado() [0x9002d6]

frame #31: /home/writmaha/dorado-0.4.3-linux-x64/bin/dorado() [0x8fd787]

frame #32: /home/writmaha/dorado-0.4.3-linux-x64/bin/dorado() [0x8f3ef6]

frame #33: /home/writmaha/dorado-0.4.3-linux-x64/bin/dorado() [0x87ea8b]

frame #34: /home/writmaha/dorado-0.4.3-linux-x64/bin/dorado() [0x87eb4b]

frame #35: <unknown function> + 0xfe67 (0x7f1a60fdee67 in /usr/lib64/libpthread.so.0)

frame #36: /home/writmaha/dorado-0.4.3-linux-x64/bin/dorado() [0x87f18f]

frame #37: /home/writmaha/dorado-0.4.3-linux-x64/bin/dorado() [0x882fd0]

frame #38: <unknown function> + 0x1196e440 (0x7f1a73865440 in /home/writmaha/dorado-0.4.3-linux-x64/bin/../lib/libdorado_torch_lib.so)

frame #39: <unknown function> + 0x81ca (0x7f1a60fd71ca in /usr/lib64/libpthread.so.0)

frame #40: clone + 0x43 (0x7f1a60314e73 in /usr/lib64/libc.so.6)

vellamike commented 11 months ago

Dorado only supports Nvidia GPUs from the Pascal generation onwards, so you won't be able to run Dorado on a K20 or M60. GPU performance has advanced considerably since those generations, and upgrading to a more recent Nvidia GPU is strongly recommended.
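For readers landing here with the same error, a rough way to see why both cards fail is to compare their CUDA compute capability with Pascal's. The sketch below is not part of Dorado; it assumes that "Pascal onwards" can be read as compute capability 6.0 or higher, and the names `ARCH_BY_MAJOR`, `MIN_SUPPORTED`, and `describe` are made up for illustration. The capability values for the K20Xm (3.5) and M60 (5.2) come from Nvidia's published tables, not from this thread.

```python
# Architecture names for the compute-capability major versions discussed here.
ARCH_BY_MAJOR = {3: "Kepler", 5: "Maxwell", 6: "Pascal"}

# Assumption: "Pascal onwards" is interpreted as compute capability >= 6.0.
MIN_SUPPORTED = (6, 0)


def describe(name, capability):
    """Return a one-line verdict for a GPU given its (major, minor) capability."""
    major, minor = capability
    arch = ARCH_BY_MAJOR.get(major, "other")
    # Tuples compare element by element, so (5, 2) < (6, 0) is True.
    status = "meets" if (major, minor) >= MIN_SUPPORTED else "is below"
    return (
        f"{name}: compute capability {major}.{minor} ({arch}) "
        f"{status} the assumed 6.0 minimum"
    )


if __name__ == "__main__":
    for gpu_name, cap in [("Tesla K20Xm", (3, 5)), ("Tesla M60", (5, 2))]:
        print(describe(gpu_name, cap))
```

Both cards from this thread fall below the assumed threshold, which is consistent with the "no kernel image is available" error: the shipped binaries presumably contain no kernels compiled for those architectures.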

Your issue does highlight that Dorado should handle this situation much more gracefully. We will address this in a future release.
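Until such a check exists, something similar can be approximated outside Dorado. The following is only a sketch of a pre-flight check, not how Dorado does or will do it. It assumes an `nvidia-smi` recent enough to accept `--query-gpu=name,compute_cap --format=csv,noheader` (older drivers may not offer the `compute_cap` field), and it again assumes a minimum compute capability of 6.0 for Pascal. The function name `find_unsupported` and the sample text are hypothetical.

```python
# Assumption: Pascal corresponds to compute capability 6.0.
MIN_CAPABILITY = (6, 0)


def find_unsupported(csv_text, minimum=MIN_CAPABILITY):
    """Parse 'name, major.minor' lines and report GPUs below the minimum."""
    problems = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so commas inside a GPU name cannot break parsing.
        name, _, cap = line.rpartition(",")
        major, minor = (int(part) for part in cap.strip().split("."))
        if (major, minor) < minimum:
            problems.append(
                f"{name.strip()} (compute capability {major}.{minor}) is older "
                f"than the required {minimum[0]}.{minimum[1]}"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical sample of what the nvidia-smi query might print.
    sample = "Tesla K20Xm, 3.5\nTesla M60, 5.2"
    for problem in find_unsupported(sample):
        print("Unsupported GPU:", problem)
```

In practice the text would come from running the `nvidia-smi` query (for example via `subprocess.run`) before launching the basecaller, so an unsupported card produces a readable message instead of a stack trace.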

rica01 commented 11 months ago

Would an older version of Dorado work on the K20?

vellamike commented 11 months ago

No, there is unfortunately no version of Dorado which would run on Kepler.

rica01 commented 11 months ago

I am sorry to keep asking, but would it work on a Jetson Nano (Nvidia Maxwell architecture with 128 Nvidia CUDA cores)?

vellamike commented 11 months ago

Unfortunately not; no Maxwell devices are supported by Dorado.

nanoporetech / dorado

dorado basecaller on linux with Tesla K20Xm #498