LiLabAtVT / DeepTE

Neural network classification of TE
BSD 3-Clause "New" or "Revised" License

Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR #14

Open OliveiraDS-hub opened 2 years ago

OliveiraDS-hub commented 2 years ago

I'm really interested in using DeepTE to classify 26 TE libraries, but I'm new to ML algorithms and I don't know how to solve this problem with TensorFlow. I installed DeepTE from the conda repository and the installation seemed to go fine. The problem is shown below:

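[image: deepte-1] https://user-images.githubusercontent.com/15055555/146790488-abe9d383-9bd1-4ba4-9e15-d58d64f4ec86.png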

I think the problem happens due to "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR". Is it happening because I'm out of memory?

I tried to monitor the GPU memory usage and I think it's a memory problem:

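[image: deepe-2] https://user-images.githubusercontent.com/15055555/146790908-79a036fb-bcdf-47be-bc39-c1325fc8c4c5.png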

The usage drops to zero exactly when the DeepTE process is killed.
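For anyone who wants to log the same measurement instead of reading it off a screenshot, here is a minimal sketch that polls `nvidia-smi` for used and total GPU memory. It assumes `nvidia-smi` is on the `PATH`; the function names `parse_memory_line` and `gpu_memory_mib` are made up for this example and are not part of DeepTE.

```python
import subprocess

# Ask nvidia-smi for used/total memory per GPU as plain CSV (values in MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Turn one CSV line such as '3891, 4096' into (used_mib, total_mib)."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def gpu_memory_mib():
    """Return a list of (used_mib, total_mib) tuples, one per visible GPU."""
    output = subprocess.run(
        QUERY, capture_output=True, text=True, check=True
    ).stdout
    return [parse_memory_line(line) for line in output.splitlines() if line.strip()]
```

Calling `gpu_memory_mib()` in a loop while DeepTE runs in another terminal would show whether usage climbs toward the card's total just before the crash.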

I have 32 cores and 158 GB of RAM, but only a 4 GB GPU. How can I solve this issue? I downsampled the input and then ran DeepTE with only one TE sequence, but it still doesn't work.

songliVT commented 2 years ago

I suspect that the model file is too large to fit into your GPU memory.

Song
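If GPU memory is the limiting factor, one thing that may be worth trying is asking TensorFlow to allocate GPU memory on demand instead of reserving it up front. The sketch below sets the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable, which the TensorFlow GPU guide documents; whether the TensorFlow version installed with DeepTE honours it, and whether it helps when the model itself is larger than 4 GB, is an assumption that has not been checked here. The helper name `request_gpu_memory_growth` is hypothetical.

```python
import os


def request_gpu_memory_growth(environ=None):
    """Set TF_FORCE_GPU_ALLOW_GROWTH=true unless the user already chose a value.

    This must run before TensorFlow initialises the GPU (in practice, before
    the first `import tensorflow`), otherwise it has no effect.
    """
    if environ is None:
        environ = os.environ
    environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")
    return environ["TF_FORCE_GPU_ALLOW_GROWTH"]
```

The same effect can be had from the shell by exporting the variable before launching DeepTE.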

(Sent by email in reply to oliveirads-bioinfo's message of Mon, Dec 20, 2021, 10:26 AM.)

Associate Professor in Plant Genomics and Bioinformatics, School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University


songliVT commented 2 years ago

One potential solution would be to run the model on the CPU only, although we have not tested this.

Song
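A minimal sketch of the CPU-only idea, assuming DeepTE is started as an ordinary command: hiding every CUDA device through the `CUDA_VISIBLE_DEVICES` environment variable makes TensorFlow fall back to the CPU. The helper names `cpu_only_env` and `run_cpu_only` are invented for this example, and, as noted above, running DeepTE this way is untested.

```python
import os
import subprocess
import sys


def cpu_only_env(base_env=None):
    """Return a copy of the environment in which no CUDA device is visible."""
    env = dict(os.environ if base_env is None else base_env)
    # "-1" is not a valid device index, so CUDA reports zero GPUs to TensorFlow.
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_cpu_only(command):
    """Run `command` (a list of arguments) with GPUs hidden; return its exit code."""
    return subprocess.run(command, env=cpu_only_env()).returncode
```

For example, `run_cpu_only(["python", "DeepTE.py", ...])` with your usual DeepTE arguments in place of the `...`. With 32 cores and plenty of RAM this should at least avoid the GPU memory limit, at the cost of slower prediction.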
