roamanalytics / mittens

A fast implementation of GloVe, with optional retrofitting
Apache License 2.0

Save mittens object in the tensorflow implementation. #8

Closed saroufimc1 closed 5 years ago

saroufimc1 commented 6 years ago

If I try to save the trained model (the GloVe object) with pickle, it fails because I used the TensorFlow implementation. How should I save it?

glove = GloVe(max_iter=self.max_iter, n=self.embedding_dim, learning_rate=self.eta)
G = glove.fit(data)
trained_model = glove

with open(model_path, "w") as f:
    pickle.dump(self.trained_model, f)

File "glove_vectorizer.py", line 107, in save_model
    with open(model_path, "w") as f:
_pickle.PicklingError: Can't pickle <class 'module'>: attribute lookup module on builtins failed

ndingwall commented 6 years ago

Sorry for the slow reply - GitHub isn't sending notifications for some reason.

There isn't really a model to speak of - it's just matrix factorization. You can access the underlying W, C, bw and bc using e.g. glove_model.sess.run(glove_model.W). Unfortunately there's no way to initialize .fit with those matrices in the TensorFlow version, but you can in the NumPy version. The only things missing are the momentums used in the Adagrad update, but I'm not sure whether they'd be preserved when saving models anyway.
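A minimal sketch of that approach, assuming the trained object exposes a `sess` attribute and the tensors `W`, `C`, `bw` and `bc` as described above. The helper names (`extract_params`, `save_params`, `load_params`) are hypothetical and not part of mittens. The idea is to pickle the evaluated arrays rather than the model object itself, since the object holds references that pickle cannot serialize:

```python
import pickle

# Parameter names taken from the reply above; adjust if your model differs.
PARAM_NAMES = ("W", "C", "bw", "bc")


def extract_params(glove_model, names=PARAM_NAMES):
    """Evaluate each parameter tensor in the model's session.

    Assumes glove_model.sess.run(tensor) returns a plain array,
    as in glove_model.sess.run(glove_model.W).
    """
    return {name: glove_model.sess.run(getattr(glove_model, name))
            for name in names}


def save_params(glove_model, path):
    """Pickle the evaluated parameters (not the model object)."""
    params = extract_params(glove_model)
    with open(path, "wb") as f:  # pickle needs binary mode
        pickle.dump(params, f)
    return params


def load_params(path):
    """Load a parameter dict written by save_params."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Calling `save_params(glove, model_path)` after `glove.fit(data)` would then write a dictionary keyed by parameter name. The Adagrad state is not included.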

Alternatively, you can just save the embeddings generated on your first run and initialize your second run with those same embeddings. That gets you almost the same thing: W and C will be approximately the element-wise mean of the original W and C, you lose the biases bw and bc, and again you restart with no momentum.
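As a rough sketch of this second option, the embeddings returned by the first run (`G` in the question) can be stored as a word-to-vector mapping and reloaded later. The helper names and the `vocab` list (one word per row of the embedding matrix, in the same order) are assumptions for illustration. How the loaded mapping is passed into the second run depends on the implementation you use, so that step is left out:

```python
import pickle


def embeddings_to_dict(vocab, embeddings):
    """Pair each vocabulary word with its embedding row.

    Assumes vocab[i] corresponds to embeddings[i].
    """
    if len(vocab) != len(embeddings):
        raise ValueError("vocab and embeddings must have the same length")
    return dict(zip(vocab, embeddings))


def save_embeddings(vocab, embeddings, path):
    """Pickle a word -> vector mapping built from the first run."""
    mapping = embeddings_to_dict(vocab, embeddings)
    with open(path, "wb") as f:
        pickle.dump(mapping, f)
    return mapping


def load_embeddings(path):
    """Load the mapping to use as the starting point of a second run."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

Only the combined embeddings survive this round trip, which is why the biases and optimizer state are lost.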