Closed ruivieira closed 6 years ago
@elmiko ptal, thank you!
thanks @ruivieira! this looks like a nice fix for now. it would be ideal if we could return an error on that initial request, but i realize that may not be feasible given our current code paths.
maybe in the future we should upgrade the responses to have some sort of error reporting mechanism through the call to get the prediction. like, instead of simply returning an empty list, we could just return an error object or something. i dunno, we probably need to think more about that.
closes #27
When requesting top-k predictions for an invalid user `id` (see #27), `MatrixFactorizationModel` crashes and Spark ALS stops providing predictions.

Since Spark's ALS model has no built-in facility to catch this, a safeguard must be built at the application logic level. The solution proposed in this PR consists of:
- An `RDD` of user `id`s and another of product `id`s
- Having the `Model` class implement two methods (`valid_user(id)` and `valid_product(id)`) to check if the provided `id`s are within the model
- `id`s are checked for validity and, if they are not valid, the predicted top-k product list will be empty, bypassing prediction.

This prevents crashing the ALS model at the cost of one call to an `RDD` of `id`s.

Example (after patching):
Using user `id=900000` (invalid) we issue
we get the server response
the log alerts us:
Fetching the prediction with
Returning the response:
but without crashing the Spark model.
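
For readers who want to see the shape of the safeguard, here is a minimal sketch of the validity check described in this PR. It is not the code from the patch: plain Python sets stand in for the two `RDD`s of `id`s, `fake_recommend` is a hypothetical stub that fails on unknown users the way the ALS model does, and the `top_k` method name is an assumption. Only `Model`, `valid_user(id)` and `valid_product(id)` are names taken from the description above. In PySpark the `id` collections could plausibly be derived from `MatrixFactorizationModel.userFeatures()` and `productFeatures()`, but that is also an assumption.

```python
from typing import Callable, Iterable, List, Tuple

# (product_id, score) pairs; a stand-in for Spark's rating rows.
Recommendation = Tuple[int, float]


class Model:
    """Wraps a recommender and refuses to query it with unknown ids."""

    def __init__(
        self,
        recommend: Callable[[int, int], List[Recommendation]],
        user_ids: Iterable[int],
        product_ids: Iterable[int],
    ) -> None:
        self._recommend = recommend
        # Assumption: plain sets replace the two RDDs of ids from the PR.
        self._user_ids = set(user_ids)
        self._product_ids = set(product_ids)

    def valid_user(self, id: int) -> bool:
        # Parameter is named `id` to mirror the signature in the PR text.
        return id in self._user_ids

    def valid_product(self, id: int) -> bool:
        return id in self._product_ids

    def top_k(self, user_id: int, k: int) -> List[Recommendation]:
        # Hypothetical method name. Invalid users get an empty list,
        # so the underlying recommender is never called for them.
        if not self.valid_user(user_id):
            return []
        return self._recommend(user_id, k)


def fake_recommend(user_id: int, k: int) -> List[Recommendation]:
    # Hypothetical stub: raises KeyError for unknown users, mimicking
    # a model that fails when asked about an id it was not trained on.
    known = {1: [(10, 4.5), (11, 3.9), (12, 3.1)]}
    return known[user_id][:k]


model = Model(fake_recommend, user_ids=[1], product_ids=[10, 11, 12])
print(model.top_k(1, 2))       # [(10, 4.5), (11, 3.9)]
print(model.top_k(900000, 2))  # []
```

The empty list for user `900000` matches the behaviour described above: the prediction is bypassed rather than attempted, so the stub never raises.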