-
I tried to run the sample script from the documentation but got this error: module 'scrapy' has no attribute 'Spider'
```
import scrapy

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_u…
```
-
I believe it would be useful for new users if you provided more complete examples of spiders in the documentation, e.g. a working crawler. The documentation is detailed but with the various files that need t…
-
https://github.com/AICoE/prometheus-api-client-python/blob/v0.0.2/prometheus_api_client/metric.py#L146-L148
Here, dateparser sometimes parses the date in yyyy-dd-mm format, which is incorrect.
We c…
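To illustrate the kind of ambiguity described in the report above, here is a minimal sketch using only the standard library `datetime` module rather than dateparser itself. The sample string is hypothetical and not taken from the issue; it only shows how the same text yields two different dates depending on whether the day or the month is read first.

```python
from datetime import datetime

# Hypothetical timestamp string; both the day and month fields are <= 12,
# so either ordering parses without error.
raw = "2019-03-04"

# Interpreted as year-month-day: 4 March 2019.
as_ymd = datetime.strptime(raw, "%Y-%m-%d")

# Interpreted as year-day-month: 3 April 2019.
as_ydm = datetime.strptime(raw, "%Y-%d-%m")

print(as_ymd.date())
print(as_ydm.date())
```

Passing an explicit format, as done here, removes the guesswork whenever the input format is known in advance.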
-
Today I stumbled on a bug that results from somewhat unusual behavior when redirecting to URLs containing the "#" hash sign.
I have a URL http://groceries.asda.com/asda-webstore/landing/home.shtml#se…
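As background only, here is a small sketch of how the part after `#` (the fragment) is separated from the rest of a URL, using the standard library. The URL below is a hypothetical stand-in, not the one from the report, and whether this split is the cause of the reported behavior is an assumption.

```python
from urllib.parse import urldefrag

# Hypothetical URL with a fragment; not the URL from the report.
url = "http://example.com/landing/home.shtml#section"

# urldefrag returns the URL without the fragment, plus the fragment itself.
base, fragment = urldefrag(url)

print(base)
print(fragment)
```

HTTP clients generally send only the part before `#` to the server, so code that handles redirects has to decide separately what to do with the fragment.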
-
### What is your Test Scenario?
I am trying to use TestCafe with proxies, specifically proxies coming from the service scrapinghub.com and their product 'crawlera'. It's supposed to produce a br…
-
It seems that `extruct` incorrectly interprets microdata descriptions that contain HTML tags.
See the description below, extracted from the URL https://www.monsterpetsupplies.co.uk/cat/cat-flea-tick/j…
-
The validation monitors `check_missing_required_fields`, `check_missing_required_fields_percent`, `check_fields_errors` and `check_fields_errors_percent` only raise an error for the first effect…
-
`JsonSchemaItem` is a subclass of `scrapy.item.DictItem` while a recent enough `HubstorageExtension` checks whether an item is a `scrapy.Item` (which is a subclass of `DictItem` too): https://github.c…
-
I run scrapy-splash with Docker on Ubuntu 14.04.
I receive many "QHttpNetworkConnectionPrivate::_q_hostLookupFinished could not dequeu request" messages in the console,
and memory usage increases without stopp…