zhongkaifu / Seq2SeqSharp

Seq2SeqSharp is a tensor-based, fast and flexible deep neural network framework written in .NET (C#). Its features include automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM and so on), multi-GPU support, cross-platform support (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.

GPU train error #54

Closed by lijianxin520 1 year ago

lijianxin520 commented 1 year ago

Hello. I tried to train on the GPU, but a loading error occurred, as the screenshot below shows. [image]

zhongkaifu commented 1 year ago

Can you please share more details for debugging, such as the call stack, config files, which tags are in your data set, and any other useful information?

lijianxin520 commented 1 year ago

Thank you very much for your attention. The same data runs without problems on the CPU, but training is very slow. The error appeared after I changed the configuration to use the GPU.

I suspect that there is something wrong with the GPU environment. My system environment is:

Windows 7, NVIDIA GTX 1060

Installed software: CUDA 10.1

Config file: train_cut_opts_gpu.txt

Here is the data format:

1 B
9 I_first
9 I
8 I
0 I
1 I
0 I
1 I
& I
0 I
1 I
& I
0 I
0 I
1 I
& I
0 I
0 I
1 I_end
迈 B
向 I_end
充 B
满 I_end
希 B
望 I_end
的 B_single
新 B_single
世 B
纪 I_end
— B
— I_end
一 B
九 I_first
九 I
八 I
年 I_end
新 B
年 I_end
讲 B
话 I_end
( B_single
附 B_single
图 B
片 I_end
1 B_single
张 B_single
) B_single
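For readers unfamiliar with this kind of tagging, here is a minimal Python sketch of how the token/tag pairs in the sample above might be read back into segments. It assumes that any tag starting with `B` (`B`, `B_single`) opens a new segment and every other tag (`I`, `I_first`, `I_end`) continues the current one; that reading is inferred from the sample, not from the Seq2SeqSharp documentation. The helper names `parse_pairs` and `group_segments` are made up for illustration.

```python
from typing import List, Tuple


def parse_pairs(flat: str) -> List[Tuple[str, str]]:
    """Split a whitespace-separated 'token tag token tag ...' string into pairs."""
    items = flat.split()
    if len(items) % 2 != 0:
        raise ValueError("expected an even number of items (token/tag pairs)")
    return list(zip(items[0::2], items[1::2]))


def group_segments(pairs: List[Tuple[str, str]]) -> List[str]:
    """Join tokens into segments.

    Assumption: a tag starting with "B" opens a new segment; any other
    tag appends the token to the segment that is currently open.
    """
    segments: List[str] = []
    for token, tag in pairs:
        if tag.startswith("B") or not segments:
            segments.append(token)
        else:
            segments[-1] += token
    return segments


# A slice of the sample data posted in this thread.
sample = "迈 B 向 I_end 充 B 满 I_end 希 B 望 I_end 的 B_single 新 B_single 世 B 纪 I_end"
print(group_segments(parse_pairs(sample)))
# → ['迈向', '充满', '希望', '的', '新', '世纪']
```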

zhongkaifu commented 1 year ago

Seq2SeqSharp uses CUDA 11.4 or above by default. For older CUDA versions, see https://github.com/zhongkaifu/Seq2SeqSharp#using-different-cuda-versions-and-net-versions and modify the project file accordingly. I have already tested Seq2SeqSharp on an Nvidia GTX 1060 (6 GB memory) and it works.

In addition, if your config file has this line, "CompilerOptions": "--use_fast_math --gpu-architecture=compute_70", you can remove it and retry it.

If you still have problems, please paste the call stack information here for debugging.
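A plausible reason the `--gpu-architecture=compute_70` setting matters here: `compute_70` targets compute capability 7.0, while the GTX 1060 is a Pascal card with compute capability 6.1, so kernels compiled for that target may fail to load. The Python sketch below only illustrates that comparison on the option string. It is not part of Seq2SeqSharp; the function names are hypothetical, and the device capability is passed in by hand rather than queried from the driver.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "--gpu-architecture=compute_70" and splits "70" into major "7", minor "0".
ARCH_FLAG = re.compile(r"--gpu-architecture=compute_(\d+)(\d)")


def requested_capability(compiler_options: str) -> Optional[Tuple[int, int]]:
    """Return the (major, minor) compute capability named in the options, if any."""
    match = ARCH_FLAG.search(compiler_options)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def arch_exceeds_device(compiler_options: str, device_capability: Tuple[int, int]) -> bool:
    """True when the options target a newer architecture than the device supports."""
    requested = requested_capability(compiler_options)
    return requested is not None and requested > device_capability


GTX_1060 = (6, 1)  # Pascal, compute capability 6.1
options = "--use_fast_math --gpu-architecture=compute_70"
print(requested_capability(options))           # → (7, 0)
print(arch_exceeds_device(options, GTX_1060))  # → True
```

With the architecture flag removed, `requested_capability` returns `None` and the check passes.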

lijianxin520 commented 1 year ago

Thank you very much for your support. Let me try.

lijianxin520 commented 1 year ago

Hello. Without reinstalling the system, I switched to CUDA 10.2 and copied nvrtc64_102_0.dll and nvrtc-builtins64_102.dll into the program's running directory. The following error occurs: [image]

zhongkaifu commented 1 year ago

Try deleting the 'cuda_cache' folder and retrying. In addition, what does your config file look like? For "CompilerOptions" in the config file, please delete the "--gpu-architecture" setting if you don't know which value is correct for the GTX 1060.
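For anyone who wants to script these two steps, here is a rough Python sketch. It assumes the config file is a JSON object with a top-level "CompilerOptions" string (as the quoted line suggests) and that the location of the 'cuda_cache' folder is known and passed in explicitly. The function name `reset_gpu_build_state` is made up for illustration and is not a Seq2SeqSharp tool.

```python
import json
import shutil
from pathlib import Path


def reset_gpu_build_state(config_path: Path, cache_dir: Path) -> dict:
    """Delete the kernel cache folder and drop --gpu-architecture from the config.

    Assumptions: the config is a JSON object, and "CompilerOptions" (if present)
    is a space-separated string of flags written as "--name=value".
    """
    # Step 1: remove the cache folder; a missing folder is not an error.
    shutil.rmtree(cache_dir, ignore_errors=True)

    # Step 2: rewrite "CompilerOptions" without any --gpu-architecture flag.
    config = json.loads(config_path.read_text(encoding="utf-8"))
    options = config.get("CompilerOptions", "")
    kept = [opt for opt in options.split() if not opt.startswith("--gpu-architecture")]
    if kept:
        config["CompilerOptions"] = " ".join(kept)
    else:
        # Nothing left, so remove the key entirely.
        config.pop("CompilerOptions", None)

    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return config
```

A call such as `reset_gpu_build_state(Path("train_cut_opts_gpu.txt"), Path("cuda_cache"))` would apply both steps, assuming both paths are relative to the program's running directory. Back up the config file first, since the sketch overwrites it in place.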

lijianxin520 commented 1 year ago

Thank you very much for your help. I removed the 'CompilerOptions' setting and training now runs. And it's fast. Thanks again!