Closed simone-trubian closed 8 years ago
After implementing error printing, the following errors were logged:
failed to fetch eu.banggood.com/Wholesale-Warehouse-2Pcs-Black-Union-Jack-Flag-Vinyl-Mirrors-Stickers-For-Mini-Cooper-wp-Eu-985030.html because of InvalidUrlException "eu.banggood.com/Wholesale-Warehouse-2Pcs-Black-Union-Jack-Flag-Vinyl-Mirrors-Stickers-For-Mini-Cooper-wp-Eu-985030.html" "Invalid URL"
failed to fetch eu.banggood.com/Wholesale-Warehouse-300pcs-M3-Nylon-White-Hex-Screw-Nut-Spacer-Stand-off-Varied-Length-Assortment-Kit-Box-wp-Eu-984548.html because of InvalidUrlException "eu.banggood.com/Wholesale-Warehouse-300pcs-M3-Nylon-White-Hex-Screw-Nut-Spacer-Stand-off-Varied-Length-Assortment-Kit-Box-wp-Eu-984548.html" "Invalid URL"
failed to fetch eu.banggood.com/Wholesale-Warehouse-A4-30X20cm-Grid-Self-Healing-Cutting-Craft-Mat-Engraving-Board-Double-Sided-wp-Eu-986712.html because of InvalidUrlException "eu.banggood.com/Wholesale-Warehouse-A4-30X20cm-Grid-Self-Healing-Cutting-Craft-Mat-Engraving-Board-Double-Sided-wp-Eu-986712.html" "Invalid URL"
failed to fetch eu.banggood.com/Wholesale-Warehouse-6-Inch-150mm-Electronic-Mini-Digital-Caliper-Micrometer-Guage-Ruler-wp-Eu-41970.html because of InvalidUrlException "eu.banggood.com/Wholesale-Warehouse-6-Inch-150mm-Electronic-Mini-Digital-Caliper-Micrometer-Guage-Ruler-wp-Eu-41970.html" "Invalid URL"
Going through the latest JSON file, it turned out that some item objects contain bad URLs (the source_url has no scheme such as http://), for example:
{
"item_name": "6 Inch 150mm Electronic Mini Digital Caliper Micrometer Guage Ruler",
"source_url": "eu.banggood.com/Wholesale-Warehouse-6-Inch-150mm-Electronic-Mini-Digital-Caliper-Micrometer-Guage-Ruler-wp-Eu-41970.html",
"ebay_url": "http://www.ebay.it/itm/Calibro-Digitale-Elettronico-0-150mm-6-alta-precisione-strumenti-misura-/152097499307?ssPageName=STRK:MESE:IT"
},
{
"item_name": "6 Inch 150mm Electronic Mini Digital Caliper Micrometer Guage Ruler",
"source_url": "eu.banggood.com/Wholesale-Warehouse-6-Inch-150mm-Electronic-Mini-Digital-Caliper-Micrometer-Guage-Ruler-wp-Eu-41970.html",
"ebay_url": "http://www.ebay.it/itm/Calibro-Digitale-Elettronico-0-150mm-6-alta-precisione-strumenti-misura-/152097499307?ssPageName=STRK:MESE:IT"
}
Incidentally, those items are the ones whose availability cannot be updated.
Refer to ticket #63. Mark resolved when all items with an existing page can be updated.
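One possible stopgap, sketched below, is to normalise each source_url before it reaches the fetcher. The helper name ensureScheme is hypothetical and not part of the existing code base, and defaulting to http:// is an assumption. It only covers the missing-scheme case visible in the example above.

```haskell
import Data.List (isPrefixOf)

-- | Hypothetical helper: prepend "http://" when a URL carries no scheme.
-- URLs that already start with "http://" or "https://" are left untouched.
ensureScheme :: String -> String
ensureScheme url
  | any (`isPrefixOf` url) ["http://", "https://"] = url
  | otherwise = "http://" ++ url

main :: IO ()
main = mapM_ (putStrLn . ensureScheme)
  [ "eu.banggood.com/Wholesale-Warehouse-6-Inch-150mm-Electronic-Mini-Digital-Caliper-Micrometer-Guage-Ruler-wp-Eu-41970.html"
  , "http://www.ebay.it/itm/Calibro-Digitale-Elettronico-0-150mm-6-alta-precisione-strumenti-misura-/152097499307?ssPageName=STRK:MESE:IT"
  ]
```

Fixing the entries in the JSON file itself is probably the cleaner long-term option. A helper like this would just keep the updater working while bad entries still exist.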
Resolved by ticket #63
Issue
The new fetching and scraping functions do not seem to work all the time: some BangGood items produce a Nothing at the end of the processing pipeline, but if fetched from the REPL the updating process works correctly.
Possible causes
The processing pipeline can fail at a few points:
Suggested strategy
First enable very simple error printing in the "catch" branch of the catch function. If that isn't sufficient to shed any light on the issue, overhaul the entire Http and HTML modules to use Either instead of Maybe and log the left branch as well as any HttpException.
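The two steps could look roughly like the sketch below. It uses only base so it runs on its own: the real fetch is replaced by an arbitrary IO String action, SomeException stands in for HttpException, and every name (fetchMaybe, fetchEither, scrapeEither, updateItem, PipelineError) is hypothetical rather than taken from the existing modules. The scraper is a toy that only rejects an empty body.

```haskell
import Control.Exception (ErrorCall (..), SomeException, catch, throwIO, try)
import System.IO (hPutStrLn, stderr)

-- Step 1: keep the Maybe result, but print the exception before dropping it.
fetchMaybe :: String -> IO String -> IO (Maybe String)
fetchMaybe url action =
  (Just <$> action) `catch` \e -> do
    hPutStrLn stderr
      ("failed to fetch " ++ url ++ " because of " ++ show (e :: SomeException))
    pure Nothing

-- Step 2: a hypothetical error type naming the stage that failed.
data PipelineError
  = FetchFailed String String -- URL and the text of the exception
  | ScrapeFailed String       -- URL whose page yielded nothing to scrape
  deriving (Eq, Show)

-- Turn any exception thrown by the fetch action into a Left value.
fetchEither :: String -> IO String -> IO (Either PipelineError String)
fetchEither url action = do
  result <- try action
  pure $ case result of
    Left e     -> Left (FetchFailed url (show (e :: SomeException)))
    Right body -> Right body

-- Toy scraper (assumption): an empty body counts as a scraping failure.
scrapeEither :: String -> String -> Either PipelineError String
scrapeEither url body
  | null body = Left (ScrapeFailed url)
  | otherwise = Right body

-- Run both stages, log the Left branch, and return the result unchanged.
updateItem :: String -> IO String -> IO (Either PipelineError String)
updateItem url action = do
  fetched <- fetchEither url action
  let result = fetched >>= scrapeEither url
  either (hPutStrLn stderr . show) (const (pure ())) result
  pure result

main :: IO ()
main = do
  _  <- fetchMaybe "eu.banggood.com/bad" (throwIO (ErrorCall "Invalid URL"))
  ok <- updateItem "http://eu.banggood.com/ok" (pure "In stock")
  ko <- updateItem "eu.banggood.com/bad" (throwIO (ErrorCall "Invalid URL"))
  print (ok, ko)
```

With Either, the caller can tell a fetch failure from a scraping failure instead of seeing the same Nothing for both, which is the information the pipeline currently loses.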