Closed igorsimb closed 9 months ago
Currently here's the flow:
1. ScrapeForm
2. scrape_items creates a string of SKUs
3. scrape_items_from_skus loops over SKUs, sending each SKU to
4. scrape_item, that returns a dict with item info back to
5. scrape_items_from_skus, that creates a big list of dicts with all the scraped items and returns to
6. scrape_items, which updates db with info from this big list.

We need an additional step between 3 and 4:
3.5. util: check_for_valid_sku within scrape_item util. If SKU is valid, keep going to 4; if SKU is not valid, return this: "invalid: {sku}".
scrape_items_from_skus should have an invalid_skus list and check whether scrape_item returned "invalid: {sku}"; if so, add the SKU to that list and DO NOT add it to the item_data list.
Then we display the invalid_skus list to the user somehow.
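A minimal sketch of that bookkeeping, not the project's actual code. It assumes scrape_item signals a bad SKU with a string starting with "invalid: ", that SKUs arrive as one string separated by commas and/or whitespace, and that scrape_item can be passed in as a callable. INVALID_PREFIX and fake_scrape_item are hypothetical names used only for illustration.

```python
from __future__ import annotations

from typing import Callable, Union

# Hypothetical marker mirroring the "invalid: {sku}" return value described above.
INVALID_PREFIX = "invalid: "


def scrape_items_from_skus(
    skus: str,
    scrape_item: Callable[[str], Union[dict, str]],
) -> tuple[list[dict], list[str]]:
    """Scrape every SKU and return (item_data, invalid_skus)."""
    item_data: list[dict] = []
    invalid_skus: list[str] = []
    # Assumption: SKUs are separated by commas and/or whitespace.
    for sku in skus.replace(",", " ").split():
        result = scrape_item(sku)
        if isinstance(result, str) and result.startswith(INVALID_PREFIX):
            # Remember the bad SKU and keep it out of item_data.
            invalid_skus.append(sku)
            continue
        item_data.append(result)
    return item_data, invalid_skus


def fake_scrape_item(sku: str) -> Union[dict, str]:
    """Stand-in for the real scrape_item, for demonstration only."""
    return {"sku": sku} if sku.isdigit() else f"{INVALID_PREFIX}{sku}"


items, invalid = scrape_items_from_skus("12345, abc 67890", fake_scrape_item)
```

Returning both lists as a tuple lets the caller update the db from item_data and decide separately how to present invalid_skus.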
There are 2 ways of solving this:
1. Use regex to check for a valid SKU before making the API request. That way we don't have to make API requests at all for malformed SKUs. Create a function (see below) and call it at the very beginning of the scrape_item util as a guard rail.
import re

def is_sku_valid(sku: str) -> bool:
    """Check if the SKU is valid.

    Args:
        sku (str): The SKU to check.

    Returns:
        bool: True if the SKU is valid, False otherwise.
    """
    # digits only, between 5 and 12 characters long
    return bool(re.match(r"^[0-9]{5,12}$", sku))
At the start of the scrape_item util:
if not is_sku_valid(sku):
    logger.error("Invalid SKU: %s", sku)
    return {}
2. Implement a check for an empty list before this line: item = data.get("data", {}).get("products")[0]
# non-existing SKU requests return an empty products list
logger.info("Checking if SKU '%s' is valid...", sku)
if not data.get("data", {}).get("products"):
    logger.error("SKU '%s' is invalid!", sku)
    return {}
item = data.get("data", {}).get("products")[0]
I think we should do either number 1, or both checks.
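For illustration, here is a rough sketch of what both checks could look like together in one function. The fetch parameter and fake_fetch are hypothetical stand-ins for the real API call, and the response shape is assumed from the data.get("data", {}).get("products") line above.

```python
import logging
import re
from typing import Callable

logger = logging.getLogger(__name__)

# Same rule as the regex above: digits only, 5 to 12 characters.
SKU_PATTERN = re.compile(r"[0-9]{5,12}")


def scrape_item(sku: str, fetch: Callable[[str], dict]) -> dict:
    """Return the first product for `sku`, or {} if the SKU is invalid."""
    # Check 1: reject malformed SKUs before any request is made.
    if not SKU_PATTERN.fullmatch(sku):
        logger.error("Invalid SKU: %s", sku)
        return {}

    data = fetch(sku)  # hypothetical stand-in for the real API request

    # Check 2: a well-formed but non-existing SKU yields an empty products list.
    products = data.get("data", {}).get("products")
    if not products:
        logger.error("SKU '%s' is invalid!", sku)
        return {}
    return products[0]


fetch_calls: list = []


def fake_fetch(sku: str) -> dict:
    """Pretend API (demo only): only SKU 12345 exists."""
    fetch_calls.append(sku)
    products = [{"id": sku}] if sku == "12345" else []
    return {"data": {"products": products}}
```

With this shape, a malformed SKU such as "abc" never reaches fake_fetch, while a well-formed but unknown SKU such as "99999" is caught by the empty-list check instead of raising IndexError.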
Both checks need to be implemented:
IndexError: list index out of range
Next step: how do we handle the incorrect SKUs and messaging to user?
Somewhat related to https://github.com/igorsimb/mp-monitor/issues/18
If we try to scrape a non-existing SKU, we get
IndexError: list index out of range
and, as a result, none of the entered SKUs are scraped. We need to implement a check for each SKU, scrape the existing ones, and then show an error like "The following SKUs do not exist:". So there should be a way to store the non-existing SKUs to show on screen.
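One possible way to turn the stored SKUs into that on-screen error, as a sketch only. format_invalid_skus_message is a hypothetical helper name, and how the string actually reaches the user (flash message, template context, etc.) is left open.

```python
from __future__ import annotations


def format_invalid_skus_message(invalid_skus: list[str]) -> str:
    """Build the user-facing error text; return "" when there is nothing to report."""
    if not invalid_skus:
        return ""
    return "The following SKUs do not exist: " + ", ".join(invalid_skus)


message = format_invalid_skus_message(["111", "abc"])
```

Returning an empty string when every SKU was scraped lets the caller skip showing the error with a simple truthiness check.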
TestScrapeItemsFromSKUs::test_with_valid_skus, TestScrapeItemsFromSKUs::test_with_skus_separated_by_commas_and_spaces)
Reason for failures: scrape_items_from_skus now returns a tuple, adjust assertion accordingly.
Add tests for:
is_sku_format_valid
show_invalid_skus_message
scrape_items_from_skus appends to invalid_skus if InvalidSKUException is raised
InvalidSKUException in scrape_item, specifically