neoml-lib / neoml

Machine learning framework for both deep learning and traditional algorithms
https://www.abbyy.com/neoml/
Apache License 2.0

Runtime error without any description on CIFAR-10 tutorial from documentation. #462

Open Kirundel opened 3 years ago

Kirundel commented 3 years ago

When I run the code from the documentation on a GPU, the program stops with a RuntimeError that has no description. When I run the same code with neoml.MathEngine.CpuMathEngine, it works.

Labels: Bug, NeoML

OS: Windows 10 (10.0.19043); Python: 3.9.7; GPU: Nvidia GeForce 1070

FedyuninV commented 3 years ago

Hi there!

Thanks for the report. It looks like a bug that leads to infinite recursion, so it would help to know how you got the Python package. If you built it locally, could you please provide the exact versions of CUDA and your C++ compiler (the compiler version may be crucial)?

Kirundel commented 3 years ago

The Python package was installed from pip. In math_engine.info, the detected GPU has the prefix vulcan instead of cuda. Vulkan version: 1.2.142.

FedyuninV commented 3 years ago

Currently NeoML doesn't support training on the Vulkan math engine. What remains unclear is why a Vulkan math engine was created instead of a CUDA one (CUDA is the default when the GPU is made by Nvidia).

First of all, can you update your Nvidia driver? Your current driver may be recent enough for Vulkan but outdated for CUDA.
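
Until the driver question is settled, a possible workaround is to check which backend was created and fall back to the CPU for training. This is only a minimal sketch. It assumes that math_engine.info starts with "cuda" (in any letter case) for a CUDA engine, as the prefix mentioned above suggests, and that GpuMathEngine(0) and CpuMathEngine() accept these arguments in the installed version (older releases may require a thread count for CpuMathEngine). The helper names is_cuda_engine, pick_training_engine and make_neoml_engine are hypothetical and not part of NeoML.

```python
def is_cuda_engine(info):
    """Return True if the engine description starts with 'cuda'.

    ASSUMPTION: the info string carries the backend name as a prefix,
    as reported in this thread; the exact format is not verified.
    """
    return str(info).strip().lower().startswith("cuda")


def pick_training_engine(make_gpu, make_cpu):
    """Create a GPU engine, but use the CPU one unless the GPU engine is CUDA.

    make_gpu and make_cpu are zero-argument callables that build an engine
    object exposing an `info` attribute.
    """
    try:
        engine = make_gpu()
    except Exception:
        # GPU engine could not be created at all: train on the CPU.
        return make_cpu()
    if is_cuda_engine(engine.info):
        return engine
    # Non-CUDA GPU backend (e.g. Vulkan): training is not supported there.
    return make_cpu()


def make_neoml_engine():
    """Hypothetical wrapper that applies the check to NeoML engines."""
    import neoml  # imported lazily so the helpers above work without NeoML

    return pick_training_engine(
        lambda: neoml.MathEngine.GpuMathEngine(0),
        # ASSUMPTION: no-argument call; some versions need a thread count.
        lambda: neoml.MathEngine.CpuMathEngine(),
    )
```

Calling make_neoml_engine() and printing its info attribute before building the network shows which device will actually be used for training.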