zihangdai / xlnet

XLNet: Generalized Autoregressive Pretraining for Language Understanding
Apache License 2.0
6.18k stars, 1.18k forks

How to get the XLNet vocabulary from spiece.model file and store it to a .vocab file? #283

Open SambhawDrag opened 3 years ago

SambhawDrag commented 3 years ago

Hi, I am working on implementing the XLNet language model in Julia. XLNet uses SentencePiece as its tokenizer, and I need the tokenizer's vocabulary, which is stored inside the spiece.model file.

I did look at issue #121, but it only covers modifying the model.

Could you tell me how I can obtain the vocabulary in .vocab format?
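
For reference, one possible approach is sketched below in Python, assuming the `sentencepiece` pip package is installed and that a .vocab file is simply one `piece<TAB>score` line per token id (that layout is my understanding of what the SentencePiece trainer writes, not something confirmed in this thread). The helper names `iter_pieces` and `write_vocab` are hypothetical. The SentencePiece project also appears to ship an `spm_export_vocab` command-line tool for the same purpose, which may be simpler if the binaries are available.

```python
from typing import Iterable, Iterator, Tuple


def iter_pieces(model_path: str) -> Iterator[Tuple[str, float]]:
    """Yield (piece, score) for every token id in a SentencePiece model."""
    # Imported lazily so write_vocab() below works without the package.
    import sentencepiece as spm  # assumption: `pip install sentencepiece`

    sp = spm.SentencePieceProcessor()
    sp.Load(model_path)
    # Ids run from 0 to GetPieceSize() - 1, so output order matches id order.
    for piece_id in range(sp.GetPieceSize()):
        yield sp.IdToPiece(piece_id), sp.GetScore(piece_id)


def write_vocab(pieces: Iterable[Tuple[str, float]], vocab_path: str) -> int:
    """Write one `piece<TAB>score` line per entry; return the line count."""
    count = 0
    with open(vocab_path, "w", encoding="utf-8") as f:
        for piece, score in pieces:
            # Assumed layout: token, a tab, then its score.
            f.write(f"{piece}\t{score:g}\n")
            count += 1
    return count
```

Calling `write_vocab(iter_pieces("spiece.model"), "spiece.vocab")` should then produce the file. Because lines are written in id order, the zero-based line index equals the token id, which is worth keeping in mind when loading the file into a 1-indexed language such as Julia.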