A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
BSD 3-Clause "New" or "Revised" License
Adding --fp16 to run_language_modeling and increasing batch size gives CUDA out-of-memory error #773
Open
mahdirezaey opened 4 years ago
Hi all,
I am using Colab with a single GPU (Tesla P100-PCIE-16GB).
The code below from Hugging Face ran fine:
!python /content/transformers/examples/run_language_modeling.py --output_dir=/content/outputs --model_type=bert --model_name_or_path=bert-base-cased --num_train_epochs 1 --do_train --do_eval --per_gpu_train_batch_size 152 --train_data_file=/content/input_data/trn.txt --eval_data_file=/content/input_data/val.txt --evaluate_during_training --learning_rate 1e-4 --overwrite_output_dir --tokenizer_name /content/token/ --block_size 64 --mlm
(A batch size of 152 was the maximum I could run without a CUDA out-of-memory error.) I then installed Apex with:
%%writefile setup.sh
export CUDA_HOME=/usr/local/cuda-10.1
git clone https://github.com/NVIDIA/apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./apex
!sh setup.sh
Then I added --fp16 to the command, but I was not able to increase the batch size at all, even a bit.
@julien-c, @ugent, @LysandreJik, @thomwolf, do you know why?
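One possible explanation can be sketched with a rough back-of-the-envelope memory estimate. This is an illustrative model only (the multipliers are assumptions about typical mixed-precision bookkeeping, not Apex's actual internals): with Adam, the fp32 master weights and fp32 optimizer moments that mixed precision keeps around can offset the savings from fp16 weights and gradients on the parameter side, so most of the expected savings come from activations, not parameters.

```python
def model_memory_bytes(n_params, fp16_weights=False, adam=True):
    """Rough per-parameter memory estimate (illustrative assumption, not measured).

    Counts only parameter-side state: weights, gradients, optimizer moments,
    and (for mixed precision) an fp32 master copy of the weights.
    Activation memory, which scales with batch size, is NOT included.
    """
    weight = n_params * (2 if fp16_weights else 4)   # fp16 = 2 bytes, fp32 = 4 bytes
    master = n_params * 4 if fp16_weights else 0     # fp32 master copy kept for the update
    grads = n_params * (2 if fp16_weights else 4)    # gradients match weight precision
    optim = n_params * 8 if adam else 0              # Adam: two fp32 moments per param
    return weight + master + grads + optim

# Hypothetical size roughly matching bert-base (~110M parameters)
n = 110_000_000
fp32_total = model_memory_bytes(n, fp16_weights=False)
mixed_total = model_memory_bytes(n, fp16_weights=True)
print(fp32_total, mixed_total)  # parameter-side totals come out equal (16 bytes/param)
```

Under these assumptions the parameter-side footprint is identical in both modes, which is consistent with fp16 freeing little headroom for a larger batch when optimizer state dominates; the real answer depends on how much of the 16 GB is going to activations at batch size 152.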