The model saves a list of all the tokens in the vocabulary in save_dir/words.txt. If there's a case mismatch between the character model and the token model (that is, if you want the character model to be cased and the word vocabulary to be caseless), the model instead reads through the training set to build up the character vocabulary. This is a problem when you only want to parse and the training set isn't available.
Solution: modify the code to save cased and caseless vocabularies in save_dir/words-cased.txt and save_dir/words-caseless.txt, and at parse time load whichever one is dictated by the cased configuration setting.
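A minimal sketch of the proposed change, to make the idea concrete. The function names save_vocabs and load_vocab and the tab-separated "token, count" file layout are assumptions made for illustration; the project's actual vocabulary code and file format may differ.

```python
import os
from collections import Counter


def save_vocabs(tokens, save_dir):
    """Write both a cased and a caseless token-count file (hypothetical helper)."""
    cased = Counter(tokens)
    caseless = Counter(tok.lower() for tok in tokens)
    os.makedirs(save_dir, exist_ok=True)
    for name, counts in (("words-cased.txt", cased),
                         ("words-caseless.txt", caseless)):
        with open(os.path.join(save_dir, name), "w", encoding="utf-8") as f:
            # Assumed layout: one "token<TAB>count" pair per line.
            for tok, n in counts.most_common():
                f.write(f"{tok}\t{n}\n")


def load_vocab(save_dir, cased):
    """Load whichever file the cased setting selects; no training set needed."""
    name = "words-cased.txt" if cased else "words-caseless.txt"
    vocab = {}
    with open(os.path.join(save_dir, name), encoding="utf-8") as f:
        for line in f:
            tok, n = line.rstrip("\n").rsplit("\t", 1)
            vocab[tok] = int(n)
    return vocab
```

With both files written at training time, a cased character model can be built from words-cased.txt while the word vocabulary is loaded from words-caseless.txt, so parsing only needs save_dir.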