GameOfThrow closed this issue 8 years ago
You should save vocab_processor along with the model as well. I'm going to add explicit save/restore methods per #130, but for now you can just pickle it and unpickle it when running at inference time.
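A minimal sketch of the pickle round trip, in case it helps. ToyVocabProcessor is a hypothetical stand-in so the snippet runs without skflow installed; it is not the real VocabularyProcessor, and the in-memory buffer stands in for a file saved next to the model. The pickle.dump / pickle.load calls would be used the same way on the fitted vocab_processor.

```python
import io
import pickle


class ToyVocabProcessor:
    """Hypothetical stand-in for a fitted vocabulary processor (not skflow's class)."""

    def __init__(self):
        self.word_to_id = {}

    def fit(self, documents):
        for doc in documents:
            for word in doc.split():
                # Ids start at 1 in order of first appearance; 0 is kept for unknown words.
                self.word_to_id.setdefault(word, len(self.word_to_id) + 1)
        return self

    def transform(self, documents):
        return [[self.word_to_id.get(w, 0) for w in doc.split()] for doc in documents]


train_docs = ["the cat sat", "the dog ran"]
test_docs = ["the dog sat"]

# Training time: fit once and encode the test data.
vocab_processor = ToyVocabProcessor().fit(train_docs)
ids_at_training = vocab_processor.transform(test_docs)

# Save the fitted processor (a file opened in "wb" mode works the same way as this buffer).
buffer = io.BytesIO()
pickle.dump(vocab_processor, buffer)

# Inference time: restore the processor instead of fitting a new one.
buffer.seek(0)
restored_processor = pickle.load(buffer)
ids_at_inference = restored_processor.transform(test_docs)
```

Because the restored object carries the same word-to-id mapping, the test data is encoded exactly as it was during training.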
pickle works well - thanks for the response.
I'm new to skflow, so this might be me being stupid. I've trained an RNN using the example data and saved it using:
classifier.save(model_path)
I also dumped out the prediction results using:
pandas.DataFrame(classifier.predict(X_test)).to_csv
This all works fine, and I get an accuracy of roughly 80%.
Next, I load the existing model using:
classifier = skflow.TensorFlowEstimator.restore(model_path)
I also load the same test file, pass it through the VocabularyProcessor, and generate the NumPy array:
X_test = np.array(list(vocab_processor.transform(X_test)))
I then run the prediction again:
classifier.predict(X_test)
but the accuracy is now only around 35%, which is quite different from the result I got when training the model. Can anyone help me with what's going on here?
EDIT--
After exploring, I found that the cause is the VocabularyProcessor: when I rerun my data, the vocabulary is re-labelled from 1 to N instead of keeping the same labels it had when I first ran the model. Is there a way to label my vocabulary correctly when reloading a model file?
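To illustrate the relabelling described above, here is a toy sketch. It assumes ids are handed out in order of first appearance, which matches the "1 to N" behaviour but is not taken from skflow's source; build_vocab and encode are hypothetical helpers, not library functions.

```python
def build_vocab(documents):
    """Assign ids 1..N to words in order of first appearance (assumed behaviour)."""
    vocab = {}
    for doc in documents:
        for word in doc.split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode(documents, vocab):
    """Map each word to its id; words missing from the vocabulary become 0."""
    return [[vocab.get(word, 0) for word in doc.split()] for doc in documents]


train_docs = ["good movie", "bad movie"]
test_docs = ["bad movie good"]

# Vocabulary built at training time: good=1, movie=2, bad=3.
train_vocab = build_vocab(train_docs)

# Vocabulary rebuilt from the test data alone: bad=1, movie=2, good=3.
refit_vocab = build_vocab(test_docs)

ids_expected = encode(test_docs, train_vocab)  # what the trained model expects
ids_refit = encode(test_docs, refit_vocab)     # what it receives after a re-fit
```

The same sentence ends up with different id sequences, so the restored model sees inputs that no longer line up with the word ids it was trained on, which would be consistent with the accuracy drop.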