PaddlePaddle / models

Officially maintained models supported by PaddlePaddle, covering CV, NLP, Speech, Rec, TS, large models, and more.
Apache License 2.0

BERT training in models exits abnormally (most likely out of GPU memory), reporting 0xC0000409 #3306

Open wang001 opened 4 years ago

wang001 commented 4 years ago

I tried bert-wwm (the model released by Harbin Institute of Technology). With Keras it can run batch_size=6, max_seq=512, but here it can only run batch=1, seq_len=300 (I tried).
Version / environment info:
1) PaddlePaddle version: 1.5.0.post87
2) CPU: i5 9400f
3) GPU: 1080ti, cudatoolkit 8.0 and cudnn 7.1.4 installed via anaconda
4) OS: Win10 Pro, 64-bit, Python 3.6.8

sneaxiy commented 4 years ago

Could you provide the specific error message?

wang001 commented 4 years ago

There is only this one exit error code, probably because it is running on Windows. After reducing batch_size and seq_len it runs fine, so it should be due to insufficient GPU memory.

sneaxiy commented 4 years ago

Which exit error code is it? Could you share it?

wang001 commented 4 years ago

0xC0000409

sneaxiy commented 4 years ago

Are there any other error symptoms? For example, the console output or stack trace at the time of exit?

wang001 commented 4 years ago

No. Stack traces have never really been supported on Windows, right? If you are familiar with NLP, you can just try it with any dataset.

wang001 commented 4 years ago

E:\Anaconda3\envs\paddle\python.exe E:/worksapce/pyWorkSpace/PaddleNLP_0903/run_Sentiment_Analysis.py --kfold 10 --use_cuda true --batch_size 1 --in_tokens false --init_pretraining_params F:/pretrain/chinese_wwm_ext_L-12_paddle --data_path E:/corpus/chinaMobile/sentiment_pair_raw_fold_left512.tsv --vocab_path F:/pretrain/chinese_wwm_ext_L-12_H-768_A-12/vocab.txt --checkpoints F:/pretrain/checkpoints_roberta --save_steps 1000 --weight_decay 0.01 --warmup_proportion 0.0 --validation_steps 100 --epoch 1 --max_seq_len 512 --bert_config_path F:/pretrain/chinese_wwm_ext_L-12_H-768_A-12/bert_config.json --learning_rate 1e-5 --skip_steps 10 --num_iteration_per_drop_scope 10 --verbose true

----------- Configuration Arguments -----------
batch_size: 1
bert_config_path: F:/pretrain/chinese_wwm_ext_L-12_H-768_A-12/bert_config.json
checkpoints: F:/pretrain/checkpoints_roberta
data_path: E:/corpus/chinaMobile/sentiment_pair_raw_fold_left512.tsv
do_lower_case: True
enable_ce: False
epoch: 1
in_tokens: False
init_checkpoint: None
init_pretraining_params: F:/pretrain/chinese_wwm_ext_L-12_paddle
kfold: 10
learning_rate: 1e-05
loss_scaling: 1.0
lr_scheduler: linear_warmup_decay
max_seq_len: 512
num_iteration_per_drop_scope: 10
random_seed: 0
save_steps: 1000
shuffle: True
skip_steps: 10
use_cuda: True
use_fast_executor: False
use_fp16: False
validation_steps: 100
verbose: True
vocab_path: F:/pretrain/chinese_wwm_ext_L-12_H-768_A-12/vocab.txt
warmup_proportion: 0.0
weight_decay: 0.01

attention_probs_dropout_prob: 0.1
directionality: bidi
hidden_act: gelu
hidden_dropout_prob: 0.1
hidden_size: 768
initializer_range: 0.02
intermediate_size: 3072
max_position_embeddings: 512
num_attention_heads: 12
num_hidden_layers: 12
pooler_fc_size: 768
pooler_num_attention_heads: 12
pooler_num_fc_layers: 3
pooler_size_per_head: 128
pooler_type: first_token_transform
type_vocab_size: 2
vocab_size: 21128

Device count: 1
Num train examples: 6619
Max train steps: 6619
Num warmup steps: 0
Theoretical memory usage in training: 6787.632 - 7110.853 MB
Load pretraining parameters from F:/pretrain/chinese_wwm_ext_L-12_paddle.
WARNING:root: You can try our memory optimize feature to save your memory usage:

create a build_strategy variable to set memory optimize option

     build_strategy = compiler.BuildStrategy()
     build_strategy.enable_inplace = True
     build_strategy.memory_optimize = True

     # pass the build_strategy to with_data_parallel API
     compiled_prog = compiler.CompiledProgram(main).with_data_parallel(
         loss_name=loss.name, build_strategy=build_strategy)

 !!! Memory optimize is our experimental feature !!!
     some variables may be removed/reused internal to save memory usage, 
     in order to fetch the right value of the fetch_list, please set the 
     persistable property to true for each variable in fetch_list

     # Sample
     conv1 = fluid.layers.conv2d(data, 4, 5, 1, act=None) 
     # if you need to fetch conv1, then:
     conv1.persistable = True
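
For reference, here is a minimal, self-contained sketch of how that suggested build_strategy can be wired into a fluid program on Paddle 1.5. The toy fully connected network, the variable names, and the random feed data below are illustrative assumptions standing in for the BERT classifier graph built by run_Sentiment_Analysis.py, not the actual script:

     import numpy as np
     import paddle.fluid as fluid
     from paddle.fluid import compiler

     # Toy graph standing in for the BERT classification program (assumption).
     data = fluid.layers.data(name="x", shape=[768], dtype="float32")
     label = fluid.layers.data(name="y", shape=[1], dtype="int64")
     logits = fluid.layers.fc(input=data, size=2)
     loss = fluid.layers.mean(
         fluid.layers.softmax_with_cross_entropy(logits=logits, label=label))
     fluid.optimizer.Adam(learning_rate=1e-5).minimize(loss)

     # Variables in fetch_list must stay persistable, otherwise
     # memory_optimize may reuse their buffers before they are read.
     loss.persistable = True

     build_strategy = compiler.BuildStrategy()
     build_strategy.enable_inplace = True   # reuse op output buffers in place
     build_strategy.memory_optimize = True  # cross-op variable memory reuse

     place = fluid.CUDAPlace(0)
     exe = fluid.Executor(place)
     exe.run(fluid.default_startup_program())

     compiled_prog = compiler.CompiledProgram(
         fluid.default_main_program()).with_data_parallel(
             loss_name=loss.name, build_strategy=build_strategy)

     feed = {"x": np.random.rand(4, 768).astype("float32"),
             "y": np.random.randint(0, 2, size=(4, 1)).astype("int64")}
     print(exe.run(compiled_prog, feed=feed, fetch_list=[loss.name]))

Note that this only reduces memory by reusing variable buffers; whether it frees enough headroom to fit max_seq_len=512 given the ~7 GB estimate above is not guaranteed.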

train pyreader queue size: 50, learning rate: 0.000010
epoch: 0, progress: 54/6619, step: 0, ave loss: 0.7352414727210999, ave acc: 1.0
Check failure stack trace:

Process finished with exit code -1073740791 (0xC0000409)