changun / CollMetric

A TensorFlow implementation of Collaborative Metric Learning (CML)
GNU General Public License v3.0

How to save the embedding matrix on CPU instead of GPU? #11

Closed LiuxyEric closed 6 years ago

LiuxyEric commented 6 years ago

The embedding matrix is too large, and there are two embedding matrices. Every time I run the scripts, they return an OOM error. How can I keep the embedding matrices on the CPU instead of the GPU?

changun commented 6 years ago

Hi LiuxyEric,

Thank you for submitting the issue. I am wondering which dataset you are using, because the default dataset is pretty small and should fit in memory on most GPUs.

You can follow the instructions here to force TensorFlow to run on the CPU: https://github.com/tensorflow/tensorflow/issues/754
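As a minimal sketch of what that linked issue suggests (not code from CollMetric itself): hiding the GPU via the `CUDA_VISIBLE_DEVICES` environment variable before TensorFlow is imported forces everything onto the CPU, and `tf.device('/cpu:0')` can pin just the large variables to host memory.

```python
import os

# Hide all GPUs from TensorFlow; it must be set before TensorFlow is
# imported, after which every op (including the embedding variables)
# is placed on the CPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

# Alternatively, pin only the large embedding variables to host memory
# inside the model-building code (variable names here are illustrative):
#   with tf.device('/cpu:0'):
#       user_embedding = tf.Variable(...)
#       item_embedding = tf.Variable(...)
```

Pinning only the embeddings keeps the per-batch computation on the GPU while the full matrices live in (much larger) host RAM.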


LiuxyEric commented 6 years ago

@changun Thank you for your advice. I am using my own dataset to train the model. It is a very large dataset with 10M users. Any advice for a dataset this large?

changun commented 6 years ago

Hi LiuxyEric,

For a dataset that large (from my experience with a 13M-user dataset, and assuming the number of items is much smaller), I found it more efficient to set each user vector to the average of the item vectors the user liked, instead of giving each user a randomly initialized vector.

This will require some modification to the CollMetric code. In particular, you will probably need to create a SparseTensor (batch_size x # items) that contains all the positive items of the users in the user-item pairs, excluding the positive sample item used for gradient descent, and compute the user vectors with tf.sparse_tensor_dense_matmul(sparse_tensor, item_embedding).

Such an approach converges much faster, and requires only enough GPU memory to store the item embeddings and the batch under computation.
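The computation above can be sketched with NumPy (a dense stand-in for the SparseTensor, with toy sizes; the variable names are illustrative, not from CollMetric). Each row of the interaction matrix is normalized so the matmul against the item embeddings yields the average of the liked items' vectors, exactly what tf.sparse_tensor_dense_matmul(sparse_tensor, item_embedding) would produce with weighted sparse entries.

```python
import numpy as np

# Toy sizes; in practice n_items is the catalogue size, emb_dim ~ 100.
n_users, n_items, emb_dim = 3, 5, 4
rng = np.random.default_rng(0)
item_embedding = rng.normal(size=(n_items, emb_dim))

# Dense stand-in for the (batch_size x n_items) SparseTensor: row u has a
# nonzero at every item user u liked (excluding the positive sample used
# for the gradient step), normalized so the matmul yields an average.
likes = np.zeros((n_users, n_items))
likes[0, [1, 3]] = 1.0
likes[1, [0, 2, 4]] = 1.0
likes[2, [2]] = 1.0
likes /= likes.sum(axis=1, keepdims=True)

# Equivalent of tf.sparse_tensor_dense_matmul(sparse_tensor, item_embedding):
user_vectors = likes @ item_embedding          # (n_users, emb_dim)

# User 0's vector is the mean of item vectors 1 and 3.
assert np.allclose(user_vectors[0], item_embedding[[1, 3]].mean(axis=0))
```

Because no per-user parameters are stored, memory scales with the number of items rather than the 10M users, which is where the savings come from.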


LiuxyEric commented 6 years ago

@changun Thank you so much! I'll try it.