IBM / product-recommendation-with-watson-ml

Build a recommendation engine with Spark and Watson Machine Learning
https://developer.ibm.com/patterns/build-a-product-recommendation-engine-with-watson-machine-learning/
Apache License 2.0

modified variables in get_product_counts_for_customer() function #26

Closed IamShivamJaiswal closed 5 years ago

IamShivamJaiswal commented 5 years ago

In the function get_product_counts_for_customer()

def get_product_counts_for_customer(cust_id):
    cust = df_customer_products.filter('CUST_ID = {}'.format(cust_id)).take(1)
    fields = []
    values = []
    for row in customer:
        for product_col in product_cols:
            field = 'sum({})'.format(product_col)
            value = row[field]
            fields.append(field)
            values.append(value)
    return (fields, values)

The loop iterates over the variable customer, which is not defined inside this function. It resolves to a global variable defined in the cell just above, so the function returns the same result for every customer, whatever cust_id is passed in. I changed the loop to iterate over the local variable cust, and it now returns the correct counts for each customer:

def get_product_counts_for_customer(cust_id):
    cust = df_customer_products.filter('CUST_ID = {}'.format(cust_id)).collect()
    fields = []
    values = []
    for row in cust:
        for product_col in product_cols:
            field = 'sum({})'.format(product_col)
            value = row[field]
            fields.append(field)
            values.append(value)
    return (fields, values)

ptitzler commented 5 years ago

Thank you for reporting the issue, @IamShivamJaiswal! You are right: as is, the code won't return the expected result. Unfortunately, I couldn't merge your PR directly because it introduced too many changes to the notebook source code. I therefore created (and merged) a PR based on your suggested fix, https://github.com/IBM/product-recommendation-with-watson-ml/pull/28/files, which changes only the relevant line of code.
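
For readers who want to see the scoping problem in isolation, here is a minimal sketch that reproduces it without Spark. Everything in it is a hypothetical stand-in rather than the notebook's code: ROWS replaces the df_customer_products lookup, the column names A and B are made up, and counts_buggy and counts_fixed are illustrative names. Only the pattern (looping over a leftover global instead of the local result) mirrors the issue.

```python
# Hypothetical stand-in for the Spark DataFrame: customer id -> one row of sums.
ROWS = {
    1: {"sum(A)": 3, "sum(B)": 0},
    2: {"sum(A)": 1, "sum(B)": 5},
}
product_cols = ["A", "B"]  # assumed column names, for illustration only

# Global left over from an earlier notebook cell (it holds customer 1's row).
customer = [ROWS[1]]


def counts_buggy(cust_id):
    cust = [ROWS[cust_id]]  # looked up correctly, but never used
    values = []
    for row in customer:  # bug: reads the global, not the local `cust`
        for col in product_cols:
            values.append(row["sum({})".format(col)])
    return values


def counts_fixed(cust_id):
    cust = [ROWS[cust_id]]
    values = []
    for row in cust:  # fix: iterate over the row fetched for this cust_id
        for col in product_cols:
            values.append(row["sum({})".format(col)])
    return values


print(counts_buggy(2))  # [3, 0]  (customer 1's counts, regardless of the id)
print(counts_fixed(2))  # [1, 5]
```

Python raises no error in the buggy version because a name that is not assigned inside a function is looked up in the enclosing module (here, notebook) scope. In a fresh kernel where the earlier cell had not been run, the same function would fail with a NameError instead of silently returning the wrong customer's data.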