CLUEbenchmark / CLUE

Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
http://www.CLUEbenchmarks.com

Performance issues in the program #122

Open DLPerf opened 3 years ago

DLPerf commented 3 years ago

Hello, I found a performance issue in the definition of parse_files_to_dataset in CLUEbenchmark/CLUE/blob/master/baselines/models/xlnet/data_utils.py: dataset.cache().map(parser) is called without num_parallel_calls. Without that argument, map applies parser to one element at a time. Passing num_parallel_calls lets TensorFlow parse several elements in parallel, which should make the input pipeline more efficient.

Here is the TensorFlow documentation that supports this.
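To illustrate the suggestion, here is a minimal sketch of the pattern, assuming TensorFlow 2.4 or later (where tf.data.AUTOTUNE is available). The parser and build_dataset functions below are hypothetical stand-ins, not the actual code in data_utils.py; only the num_parallel_calls argument on map is the point.

```python
import tensorflow as tf


def parser(x):
    # Hypothetical stand-in for the real record parser in data_utils.py.
    return x * 2


def build_dataset(values, num_parallel_calls=tf.data.AUTOTUNE):
    # Hypothetical helper: builds a small in-memory dataset for illustration.
    dataset = tf.data.Dataset.from_tensor_slices(values)
    # Same cache().map(parser) chain as in the issue, but with
    # num_parallel_calls set so elements can be parsed concurrently.
    return dataset.cache().map(parser, num_parallel_calls=num_parallel_calls)


# map keeps the input order by default, even when running in parallel.
doubled = [int(v) for v in build_dataset([1, 2, 3]).as_numpy_iterator()]
```

On older TensorFlow releases the equivalent constant is tf.data.experimental.AUTOTUNE, and a fixed integer can be passed instead if autotuning is not wanted.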

Looking forward to your reply. By the way, I would be glad to create a PR to fix this if you are too busy.