What is your question?
How much GPU memory is required to train a BERT model?
To start, I ran this command from your README file:
adaseq train -c demo.yaml
and hit an out-of-memory error:
OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 1.96 GiB total capacity; 1.09 GiB already allocated; 4.81 MiB free; 1.14 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
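The error message itself suggests setting max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of applying that hint, assuming a starting value of 128 MiB (my assumption, not a value recommended by AdaSeq; tune it for your GPU):

```python
import os

# Hint from the OOM message: cap the CUDA caching allocator's split size
# to reduce fragmentation. This must be set before the first CUDA allocation.
# The value 128 (MiB) is an assumed starting point.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

When launching from the shell, the same setting can be passed as an environment variable on the command line, e.g. `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128 adaseq train -c demo.yaml`. That said, a ~2 GiB GPU may simply be too small for BERT fine-tuning; reducing the batch size in the training config is the more common remedy.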
However, this is just the first of many questions. I will ask them all here, since this issue currently blocks me from investigating them and finding answers on my own.
Can your PyTorch models be converted to TensorFlow Lite models? (I need this format to deploy on an Android device.)
Does your BABERT model provide a fully trained model for splitting Chinese text into words (word segmentation)?
Is this checkpoint (chinese-babert-base.tar) just for validating the model?
What have you tried?
Code (if necessary)
No response
What's your environment?
Code of Conduct