modelscope / AdaSeq

AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
Apache License 2.0

[Question] CUDA out of memory #27

Open Gelassen opened 1 year ago

Gelassen commented 1 year ago

What is your question?

How much GPU memory is required to train a BERT model?

To start, I ran the command from your README, adaseq train -c demo.yaml, and hit an out-of-memory error.

OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 1.96 GiB total capacity; 1.09 GiB already allocated; 4.81 MiB free; 1.14 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
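The error message itself suggests one mitigation: tuning the CUDA caching allocator via PYTORCH_CUDA_ALLOC_CONF. A minimal sketch (the 128 MiB split size is my own illustrative value, not an AdaSeq recommendation; this only reduces fragmentation and cannot add capacity to a ~2 GiB card):

```shell
# Cap the allocator's split size to reduce fragmentation, as the
# PyTorch OOM message suggests. 128 MiB is an illustrative value.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Then rerun the training command:
# adaseq train -c demo.yaml
```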

However, this is just the first of several questions. I will ask the others here as well, since this issue currently blocks me from investigating them and finding answers on my own.

Can your PyTorch models be converted to TensorFlow Lite models? (I need this format to run them on an Android device.)

Does your BABERT model come fully trained for splitting Chinese text into words (word segmentation)?

Is the checkpoint archive (chinese-babert-base.tar) intended only for validating the model?

What have you tried?

$ pip install adaseq
$ adaseq train -c demo.yaml

Code (if necessary)

No response

What's your environment?


Gelassen commented 1 year ago

I have found an answer to the first question here. It seems at least 12 GB of GPU RAM is required.
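A back-of-the-envelope estimate makes the 12 GB figure plausible. Assuming a BERT-base-sized model (~110M parameters, my assumption) trained in fp32 with Adam, the static tensors alone (weights, gradients, and the two Adam moment buffers) take roughly 1.6 GiB, before any activation memory or CUDA context overhead, which already saturates a ~2 GiB card:

```python
# Rough static-memory estimate for fp32 Adam training of BERT-base.
# Assumptions: ~110M parameters; activations and CUDA context excluded.
params = 110_000_000
bytes_per_param = 4 + 4 + 8   # fp32 weights + fp32 gradients + Adam m and v
static_gib = params * bytes_per_param / 1024**3
print(f"{static_gib:.2f} GiB static memory")  # ~1.64 GiB before activations
```

Activation memory scales with batch size and sequence length, which is why it dominates the remaining budget and why small GPUs run out even when the static footprint fits.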