Performance issues in the program

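Hello, I found a performance issue in the definition of parse_files_to_dataset in CLUEbenchmark/CLUE/blob/master/baselines/models/xlnet/data_utils.py: dataset.cache().map(parser) is called without num_parallel_calls. Without that argument, map applies the parser to one element at a time. Passing num_parallel_calls lets TensorFlow parse several elements in parallel, so I think adding it to the map call would make the input pipeline more efficient.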
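To illustrate the kind of change I mean, here is a minimal sketch, not the actual code from data_utils.py. The parser and build_dataset functions below are hypothetical stand-ins, and I am assuming TensorFlow 2.4 or newer, where tf.data.AUTOTUNE is available (older releases expose it as tf.data.experimental.AUTOTUNE, or you can pass a fixed integer instead).

```python
import tensorflow as tf


def parser(record):
    # Hypothetical stand-in for the real parser in data_utils.py:
    # it only casts and scales the element so the example stays small.
    return tf.cast(record, tf.float32) * 2.0


def build_dataset(parallel):
    # Toy source dataset (assumption); the real code reads TFRecord files.
    dataset = tf.data.Dataset.range(8)
    if parallel:
        # Proposed form: let tf.data choose how many elements to parse at once.
        return dataset.cache().map(parser, num_parallel_calls=tf.data.AUTOTUNE)
    # Current form: elements are parsed one at a time.
    return dataset.cache().map(parser)


sequential = [float(x) for x in build_dataset(False).as_numpy_iterator()]
parallel = [float(x) for x in build_dataset(True).as_numpy_iterator()]
```

Both versions should yield the same elements in the same order, since map keeps its output order deterministic by default. Only the amount of parsing work done concurrently changes.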

Here is the TensorFlow documentation that supports this.

Looking forward to your reply. By the way, I would be glad to create a PR to fix it if you are too busy.

Performance issues in the program #122