bartmachielsen / SupermarktConnector

Collecting product information from Dutch supermarkets: Albert Heijn and Jumbo using the Mobile API
http://www.bartmachielsen.nl
MIT License

size > 30 for JumboConnector causes error 500 #12

Closed Aypac closed 3 years ago

Aypac commented 3 years ago

Thanks for the lib, very useful! It appears that Jumbo has limited the page size to 30. As a result, this command throws a 500 error: list(connector.search_all_products(query='Olijfolie')). I tried passing size=30 to the function (against all warnings), but that doesn't seem to work. For now I wrote my own workaround:

import random
import time

def get_all(query, connector):
    products = []  # renamed from `all` to avoid shadowing the built-in
    total = None   # unknown until the first response arrives
    page = 0
    while total is None or len(products) < total:
        # Jumbo rejects page sizes above 30, so request 30 at a time
        res = connector.search_products(query=query, size=30, page=page)
        total = int(res['products']['total'])
        data = res['products']['data']
        if not data:
            # an empty page means nothing more to fetch; avoids looping forever
            break
        products += data
        if page == 0:
            print('')
        print(f"\rFound {total:d} products. Retrieved {len(products):d}", end="")
        page += 1
        time.sleep(random.random() * 1.2)  # short random pause between requests
    print('. Done.')
    return products

But I think it'd be nice for it to be fixed in the code ;)

Error: requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://mobileapi.jumbo.com/v15/search?offset=0&limit=1000&q=Olijfolie
The same happens for any request with size > 30.

bartmachielsen commented 3 years ago

Hi @Aypac, I just updated the library to set the default page size to 30. Update to 0.5 to get the new version.
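
For anyone who cannot upgrade yet, here is a rough sketch of paginating with the page size capped at 30. It is not the library's actual code: iter_all_products, fetch_page and MAX_PAGE_SIZE are hypothetical names, the limit of 30 is only what was observed in this issue, and the response shape ('products' with 'total' and 'data') is taken from the workaround above.

```python
from typing import Any, Callable, Dict, Iterator

# Limit observed in this issue; an assumption, not a documented Jumbo value.
MAX_PAGE_SIZE = 30


def iter_all_products(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    size: int = MAX_PAGE_SIZE,
) -> Iterator[Dict[str, Any]]:
    """Yield every product, never asking for more than MAX_PAGE_SIZE per page.

    fetch_page(page, size) is a hypothetical callable returning a response
    shaped like {'products': {'total': ..., 'data': [...]}}.
    """
    # Clamp the requested size so an oversized value never reaches the server.
    size = max(1, min(size, MAX_PAGE_SIZE))
    page = 0
    retrieved = 0
    while True:
        products = fetch_page(page, size)['products']
        data = products['data']
        if not data:
            # Empty page: stop even if 'total' claims there is more.
            break
        yield from data
        retrieved += len(data)
        if retrieved >= int(products['total']):
            break
        page += 1
```

With the connector from this thread it could be called as list(iter_all_products(lambda page, size: connector.search_products(query='Olijfolie', page=page, size=size))). Taking the fetch function as a parameter keeps the loop independent of the connector, so it can be tried against a fake response without hitting the API.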