vdobrovolskii / wl-coref

This repository contains the code for EMNLP-2021 paper "Word-Level Coreference Resolution"
MIT License

GPU memory needed #29

Closed poncelettheo closed 2 years ago

poncelettheo commented 2 years ago

Hello, I want to run your code, but as I do not have a GPU I have to request one from my laboratory. I was therefore wondering whether you have any idea how much GPU memory is needed to run your code, because it seems that even 32GB is not enough. Thank you for your understanding.

vdobrovolskii commented 2 years ago

Hi! It depends on what you want to do. For inference, a 4GB card should be enough. For training... Well, I trained it on a card with a lot of memory, so I didn't bother cutting the documents. Some documents are really big and very demanding on memory, so you can easily run out of memory (OOM) during training. You can try limiting the size of documents during training to reduce memory consumption. For instance, if a document is [n_seqs, 512], I'd suggest randomly taking 2 out of n_seqs sequences during each epoch.
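To make the suggestion concrete, here is a minimal sketch of the sampling step only. It is not code from this repository: the function name `sample_sequence_indices`, the `max_seqs` parameter, and the toy document are hypothetical, and the document is modelled as a plain list of `n_seqs` sequences of 512 subtoken ids. Whether the sampled sequences should be contiguous is not specified above, so the sketch simply draws them independently.

```python
import random


def sample_sequence_indices(n_seqs, max_seqs=2, rng=random):
    """Return sorted indices of at most `max_seqs` randomly chosen sequences.

    Hypothetical helper: documents with `max_seqs` sequences or fewer are
    kept whole; longer ones are reduced to a random subset.
    """
    if n_seqs <= max_seqs:
        return list(range(n_seqs))
    # Sample without replacement, then sort to preserve document order.
    return sorted(rng.sample(range(n_seqs), max_seqs))


# Toy document of shape [n_seqs, 512] (here n_seqs = 7), as nested lists.
doc = [[seq_id] * 512 for seq_id in range(7)]

rng = random.Random(0)  # fixed seed so the demo is reproducible
for epoch in range(3):
    idx = sample_sequence_indices(len(doc), max_seqs=2, rng=rng)
    reduced_doc = [doc[i] for i in idx]  # shape [2, 512] for this epoch
    assert len(reduced_doc) == 2
```

Because a new subset is drawn every epoch, the model still sees most of a long document over the course of training. In a real training loop, anything indexed by position in the document (word boundaries, coreference labels) would have to be restricted to the sampled sequences as well; that part is not shown here.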

poncelettheo commented 2 years ago

Thank you very much for your answer! I have requested a GPU big enough to train the model, but now I get a strange error: RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`. I think it is related to my environment, although I don't know how to solve it.

vdobrovolskii commented 2 years ago

I suggest you run the same experiment on CPU; you might get a more understandable error message. If you don't get an error at all, then you should try starting with a fresh CUDA installation.
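One generic way to force a CPU run, sketched under assumptions: the snippet below hides all GPUs from CUDA-aware libraries through the standard `CUDA_VISIBLE_DEVICES` environment variable. It does not use anything specific to this repository, and it only helps if the training script falls back to CPU when no GPU is visible; if the device is set explicitly in the project's configuration, change it there instead.

```python
import os

# Hide every GPU from CUDA-aware libraries such as PyTorch.
# This must run before the library initializes CUDA, so it is safest
# to set it at the very top of the script, before importing torch.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Alternative for debugging on the GPU itself: make kernel launches
# synchronous so the error is reported at the call that caused it.
# os.environ["CUDA_LAUNCH_BLOCKING"] = "1"
```

The same effect can be had without editing code by setting the variable in the shell that launches the script.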

gvanboven commented 1 year ago

Hi! Thanks a lot for uploading your model. I am trying to train the model, but I also get an out-of-memory error. Would it be possible to elaborate a bit more on the solution you propose here? I could not find n_seqs in the code. In which file could I change this so that only 2 sequences are taken per document? Thank you so much!