Closed: sohailoo closed 3 years ago
These are the links I'm trying to scrape:
---
refresh_interval: 2 # seconds
urls:
  -
  - https://www.amazon.com/ASUS-Graphics-DisplayPort-Military-Grade-Certification/dp/B08HH5WF97
  - https://www.amazon.com/GeForce-Gaming-Graphics-Technology-Backplate/dp/B08NWBG14N
  - https://www.amazon.com/MSI-GeForce-RTX-3080-10G/dp/B08HR7SV3M
  - https://www.amazon.com/ASUS-Graphics-DisplayPort-Axial-tech-2-9-Slot/dp/B08J6F174Z
...
D2020-12-03 02:21:49,882 using parser: html.parser
D2020-12-03 02:21:49,884 registering custom scraper for domain: amazon
D2020-12-03 02:21:49,890 registering custom scraper for domain: bestbuy
D2020-12-03 02:21:49,896 registering custom scraper for domain: bhphotovideo
D2020-12-03 02:21:49,902 registering custom scraper for domain: microcenter
D2020-12-03 02:21:49,910 registering custom scraper for domain: newegg
E2020-12-03 02:21:50,006 caught exception
Traceback (most recent call last):
  File "/src/run.py", line 41, in main
    config = parse_config(args.config)
  File "/src/config.py", line 37, in parse_config
    return Config(refresh_interval, max_price, data['urls'])
  File "/src/config.py", line 23, in __init__
    self.urls = [URL(url) for url in sorted(set(urls))]
TypeError: '<' not supported between instances of 'NoneType' and 'str'
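The line that follows urls: is a blank entry (a lone "-" with no URL after it). A blank entry is read as an empty value, which is the NoneType the TypeError complains about when the URLs are sorted. I would delete that line and try again.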
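To illustrate why the blank entry breaks things, here is a minimal sketch in plain Python. It assumes the blank "-" item reaches the program as None, which is how YAML treats an empty list item. load_urls_like_config only imitates the sorted(set(urls)) call from the traceback, and drop_blank_urls is a hypothetical helper, not something the project provides.

```python
def load_urls_like_config(urls):
    """Imitates the failing expression from the traceback: sorted(set(urls))."""
    return sorted(set(urls))


def drop_blank_urls(urls):
    """Hypothetical helper: discard None/empty entries, then de-duplicate and sort."""
    return sorted({u for u in urls if u})


# What the config above is assumed to look like once parsed:
# the blank "-" item becomes None, the rest are strings.
parsed = [
    None,
    "https://www.amazon.com/MSI-GeForce-RTX-3080-10G/dp/B08HR7SV3M",
    "https://www.amazon.com/GeForce-Gaming-Graphics-Technology-Backplate/dp/B08NWBG14N",
]

try:
    load_urls_like_config(parsed)
except TypeError as exc:
    # None cannot be ordered against str, so sorting raises TypeError.
    print(f"sorting failed: {exc}")

# With the blank entry filtered out, sorting works.
print(drop_blank_urls(parsed))
```

Deleting the blank line from the config has the same effect as the filter here: the list then contains only strings, so the sort succeeds.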