modelscope / AdaSeq

AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence Understanding Models
Apache License 2.0

[Question] CUDA out of memory #27

Open Gelassen opened 1 year ago

Gelassen commented 1 year ago

What is your question?

How much GPU memory is required to train a BERT model?

To start, I ran the command from your README, adaseq train -c demo.yaml, and hit an out-of-memory error.

OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 1.96 GiB total capacity; 1.09 GiB already allocated; 4.81 MiB free; 1.14 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
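The error message itself suggests one mitigation: tuning the CUDA caching allocator via PYTORCH_CUDA_ALLOC_CONF. A minimal sketch (the 128 MiB split size is my own illustrative value, not an AdaSeq recommendation; this only reduces fragmentation and cannot add capacity to a ~2 GiB card):

```shell
# Cap the allocator's split size to reduce fragmentation, as the
# PyTorch OOM message suggests. 128 MiB is an illustrative value.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Then rerun the training command:
# adaseq train -c demo.yaml
```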

However, this is just the first of several questions. I will ask the others here as well, since this issue currently blocks me from investigating them and finding answers on my own.

Can your PyTorch models be converted to TensorFlow Lite models? (I need this format to run them on an Android device.)

Does your BABERT model come fully trained for splitting Chinese text into words (word segmentation)?

Is the checkpoint archive (chinese-babert-base.tar) intended only for validating the model?

What have you tried?

$ pip install adaseq
$ adaseq train -c demo.yaml

Code (if necessary)

No response

What's your environment?


Gelassen commented 1 year ago

I have found an answer to the first question here. It seems at least 12 GB of GPU RAM is required.
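A back-of-the-envelope estimate makes the 12 GB figure plausible. Assuming a BERT-base-sized model (~110M parameters, my assumption) trained in fp32 with Adam, the static tensors alone (weights, gradients, and the two Adam moment buffers) take roughly 1.6 GiB, before any activation memory or CUDA context overhead, which already saturates a ~2 GiB card:

```python
# Rough static-memory estimate for fp32 Adam training of BERT-base.
# Assumptions: ~110M parameters; activations and CUDA context excluded.
params = 110_000_000
bytes_per_param = 4 + 4 + 8   # fp32 weights + fp32 gradients + Adam m and v
static_gib = params * bytes_per_param / 1024**3
print(f"{static_gib:.2f} GiB static memory")  # ~1.64 GiB before activations
```

Activation memory scales with batch size and sequence length, which is why it dominates the remaining budget and why small GPUs run out even when the static footprint fits.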