zihangdai / xlnet

XLNet: Generalized Autoregressive Pretraining for Language Understanding
Apache License 2.0

Performance issues in the program #286

Open DLPerf opened 3 years ago

DLPerf commented 3 years ago

Hello, I found a performance issue in the definition of parse_files_to_dataset in zihangdai/xlnet/blob/master/data_utils.py: dataset = dataset.cache().map is called without num_parallel_calls. When that argument is not set, map processes elements sequentially, so I think passing it would make the input pipeline more efficient.

Here is the TensorFlow documentation that supports this.
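To make the suggestion concrete, here is a minimal sketch of the kind of change I have in mind. It is not the actual code from data_utils.py: the parser function, the toy range dataset, and the build_dataset helper are hypothetical stand-ins, and I am assuming a TensorFlow version that provides tf.data.experimental.AUTOTUNE.

```python
import tensorflow as tf


def parser(x):
    # Hypothetical stand-in for the per-record parsing function;
    # the real parser in data_utils.py decodes serialized examples.
    return x * 2


def build_dataset(num_threads=tf.data.experimental.AUTOTUNE):
    # Toy source dataset (assumption); the real one reads TFRecord files.
    dataset = tf.data.Dataset.range(8)
    # Same cache().map(...) chain, but with num_parallel_calls set so that
    # several elements can be processed in parallel. AUTOTUNE lets
    # tf.data choose the level of parallelism at runtime.
    dataset = dataset.cache().map(parser, num_parallel_calls=num_threads)
    return dataset
```

A fixed integer (for example, the number of CPU cores available for preprocessing) could be passed instead of AUTOTUNE. The actual speedup depends on how expensive the parser is, so it would be worth measuring before and after.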

Looking forward to your reply. By the way, I would be glad to open a PR to fix this if you are too busy.