aws-samples / retail-demo-store

AWS Retail Demo Store is a sample retail web application and workshop platform demonstrating how AWS infrastructure and services can be used to build compelling customer experiences for eCommerce, retail, and digital marketing use-cases
MIT No Attribution

Curious why some `featured products` are not present in `all products`. #639

Closed MustaphaU closed 1 month ago

MustaphaU commented 2 months ago

Hi,

I am curious why some of the featured products are not available in all products.

Specifically, 5 featured items appear to be missing from all products.

When I run this in Lab 4 of the Personalization workshop:

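import pandas as pd
import requests

# products_service_instance is assumed to be defined earlier in the workshop notebook
all_products_resp = requests.get('http://{}/products/all'.format(products_service_instance))
featured_products_resp = requests.get('http://{}/products/featured'.format(products_service_instance))

all_products = all_products_resp.json()
featured_products = featured_products_resp.json()

# IDs that are featured but absent from the "all products" response
print(set(pd.DataFrame(featured_products).id) - set(pd.DataFrame(all_products).id))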

It outputs the following IDs, implying these items are featured but not in all products:

{'2ad09e8e-fd41-4d29-953e-546b924d7cb8',
 '4bb66b8a-cf13-4959-87ce-ca506fa568a2',
 '6bd74f2d-90c0-4ca6-9663-f3bbe9bf405b',
 '6f04daee-7387-442f-bc99-a9b0072b29ce',
 'b87da3f8-9a3e-417d-abd7-16329c5be1ba'}
BastLeblanc commented 1 month ago

Hi,

You are right: the "all products" API doesn't actually return every product, because of a limit in the DynamoDB scan operation.

all_products_resp = requests.get('http://{}/products/all'.format(products_service_instance))
featured_products_resp = requests.get('http://{}/products/featured'.format(products_service_instance))

all_products = all_products_resp.json()
featured_products = featured_products_resp.json()
print(len(all_products))

This prints 2028, but the DynamoDB table currently holds 2,466 items (that number may change).

The fix would require retrieving all the data when doing the scan operation, as described in the documentation on paginating results: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html#Scan.Pagination

https://github.com/aws-samples/retail-demo-store/blob/1079f9e56d934a90dc135c609e57fd7dd31c5631/src/products/src/products_service/db.py#L114
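For illustration, a minimal sketch of what a paginated scan could look like. This is not the code from `db.py` or from any PR; `scan_all_items` is a hypothetical helper, and `table` is assumed to behave like a boto3 DynamoDB `Table` resource, whose `scan()` returns at most 1 MB of data per call and includes a `LastEvaluatedKey` in the response whenever more pages remain.

```python
from typing import Any, Dict, List


def scan_all_items(table: Any, **scan_kwargs: Any) -> List[Dict[str, Any]]:
    """Collect every item from a table by following scan pagination.

    Hypothetical helper: `table` is assumed to expose `scan(**kwargs)` like a
    boto3 DynamoDB Table resource, returning a dict with "Items" and, when
    more pages remain, "LastEvaluatedKey".
    """
    items: List[Dict[str, Any]] = []
    kwargs = dict(scan_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means the final page has been read.
            return items
        # Resume the next scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With boto3 this would be called as something like `scan_all_items(boto3.resource('dynamodb').Table(table_name))`, where `table_name` stands for whatever name the products table has in your deployment.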

Can you let us know the impact on the work you are doing?

You are welcome to contribute with a PR for this.

MustaphaU commented 1 month ago

Thank you. The impact was that side-by-side comparisons of the reranked and unranked lists could not be done effectively, since the reranked list was shorter than the unranked list. See PR #642.

BastLeblanc commented 1 month ago

PR #642 merged