I would not recommend doing that. In hindsight, a service with some models in memory and responding to requests is a much simpler solution.
In my case latency is an issue so I make predictions for each user/lot, store the top_n recommendations for each user, and serve those results from an endpoint. The endpoint is essentially a dictionary lookup.
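A minimal sketch of that precompute-and-serve approach, assuming a trained LightFM model called model and an interactions matrix from training (both hypothetical names); the endpoint then only does a lookup into the precomputed dictionary:

import numpy as np

# hypothetical: model is a trained LightFM instance, interactions is the training matrix
n_users, n_items = interactions.shape
item_ids = np.arange(n_items)
top_n = 10

recommendations = {}
for user_id in range(n_users):
    scores = model.predict(user_id, item_ids)  # score every item for this user
    recommendations[user_id] = np.argsort(-scores)[:top_n].tolist()

# the endpoint then simply returns recommendations[user_id]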
Thanks for the answers. I'm sorry for not stating my point more clearly before. I've used lightfm to do prediction and it is fast enough. I have an approximately 3 GB model and serve it with uWSGI, since some Python web frameworks don't support multi-threaded requests. If I spawn 4 uWSGI processes, they consume approximately 12 GB of RAM. I've tried taking the model's parameters out and putting them in Postgres, but prediction time increases. Are there any strategies to overcome the memory issue while still getting an acceptable prediction time? I'm sorry this is so demanding 😂
Yeah I missed that. I wouldn't know how to address that but would be interested to see what you come up with.
@ahayamb you can use the get_user_representations / get_item_representations methods to dump user and item embeddings and biases as numpy arrays. Once you have those, the score for (user i, item j) is given by
score = np.dot(user_embeddings[i], item_embeddings[j]) + item_biases[j]
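For example (a sketch, assuming model is a trained LightFM instance and i, j are placeholder ids; pass user_features/item_features to the two methods if the model was trained with them):

import numpy as np

# each call returns (biases, embeddings) as numpy arrays
user_biases, user_embeddings = model.get_user_representations()
item_biases, item_embeddings = model.get_item_representations()

# single (user i, item j) score, as above
score = np.dot(user_embeddings[i], item_embeddings[j]) + item_biases[j]

# or score every item for user i in one shot
scores = item_embeddings.dot(user_embeddings[i]) + item_biases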
Since these are plain numpy arrays, you can keep them on disk and memmap them into the memory of each of your individual processes.
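A rough sketch of the memmap approach (file names are just placeholders):

import numpy as np

# one-off: dump the arrays to disk
np.save("user_embeddings.npy", user_embeddings)
np.save("item_embeddings.npy", item_embeddings)
np.save("item_biases.npy", item_biases)

# in each uWSGI worker: memory-map instead of loading a private copy,
# so the OS page cache shares the data across processes
user_embeddings = np.load("user_embeddings.npy", mmap_mode="r")
item_embeddings = np.load("item_embeddings.npy", mmap_mode="r")
item_biases = np.load("item_biases.npy", mmap_mode="r")

scores = item_embeddings.dot(user_embeddings[i]) + item_biases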
Great!!! Sounds interesting, I will try. Thanks for the suggestion 🎊
I've watched your talk, and serving the model via Postgres sounds interesting. If it's permissible to ask, what are your use cases, and how fast can predictions be made? I've tried Apache MADlib to do array operations in Postgres and want to see whether I did something wrong. Please close this if you find it irrelevant, thank you