maziyarpanahi opened this issue 4 years ago
Unfortunately, I couldn't find any solution. It seems that for some reason (it could be entirely my mistake) the XLNet pre-trained models are not aware of their surrounding tokens: no matter what you put before or after, unlike BERT, they always generate the same vectors.
Hi,

I finally managed to use `get_sequence_output()` to get word embeddings, after dealing with random embeddings caused by dropout, the random seed, etc. However, the output of `get_sequence_output()` doesn't seem to be contextualized. If you take the string `Bank river.` and extract the embedding for `Bank`, then try again with `Bank robber.`, the embedding for `Bank` is identical in both tests. In BERT and other contextualized transformers, `Bank` gets a different vector in each case, since the context is not the same.

I tried playing around with the mask, segments, etc., but it's always the same embedding for a given word in different contexts; the sanity check I'm running is sketched below.
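This is a minimal sketch of that check (plain NumPy; the array names and token positions are placeholders for whatever `get_sequence_output()` returned for the two sentences):

```python
import numpy as np

def compare_token_vectors(seq_out_a, idx_a, seq_out_b, idx_b, batch=0):
    """Cosine similarity between one token's vector taken from two runs.

    seq_out_*: [seq_len, batch, hidden] arrays from get_sequence_output()
    idx_*:     position of the target piece (e.g. "Bank") in each sentence
    """
    u = seq_out_a[idx_a, batch, :]
    v = seq_out_b[idx_b, batch, :]
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))
```

For `Bank river.` vs `Bank robber.` I'd expect a contextual model to score clearly below 1.0 here, but I consistently get 1.0 (identical vectors).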
I followed the advice and some examples when setting up my configs. I've seen some examples use `0.1` for dropout, like https://github.com/amansrivastava17/embedding-as-service/tree/master/server/embedding_as_service/text/xlnet, but they have the random-embeddings issue. Are my XLNet config and run config correct for using the pre-trained weights/checkpoints?
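For reference, this is roughly how I'm constructing the model. It's a minimal sketch assuming the official zihangdai/xlnet repo (TF1-style `xlnet.XLNetConfig` / `xlnet.RunConfig` / `xlnet.XLNetModel`, as in its README); the config path is a placeholder:

```python
import tensorflow as tf
import xlnet  # model code from the official zihangdai/xlnet repo

# Placeholder path: point this at the pre-trained model's config file.
xlnet_config = xlnet.XLNetConfig(json_path="xlnet_config.json")

# Inference-time settings: dropout and dropatt at 0.0, otherwise repeated
# runs return different ("random") vectors for the same input.
run_config = xlnet.RunConfig(
    is_training=False,
    use_tpu=False,
    use_bfloat16=False,
    dropout=0.0,
    dropatt=0.0)

# XLNet tensors are time-major: ids of shape [seq_len, batch];
# input_mask is float32 with 0 for real tokens and 1 for padding.
input_ids = tf.placeholder(tf.int32, shape=[None, None])
seg_ids = tf.placeholder(tf.int32, shape=[None, None])
input_mask = tf.placeholder(tf.float32, shape=[None, None])

xlnet_model = xlnet.XLNetModel(
    xlnet_config=xlnet_config,
    run_config=run_config,
    input_ids=input_ids,
    seg_ids=seg_ids,
    input_mask=input_mask)

# [seq_len, batch, hidden] token-level outputs.
seq_out = xlnet_model.get_sequence_output()
```

With `dropout=0.0` and `dropatt=0.0` the vectors are at least deterministic for me; the `0.1` from the linked example is a training-time value and reproduces the random-embeddings behaviour.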