-
Hi
I want to crawl http://www.kuaidaili.com/free/, but it fails with the error message "http521". However, if I crawl a normal URL such as http://www.baidu.com, I get the correct result.…
-
Examples:
* https://python-patterns.guide/
* https://danluu.com/ (for #239)
It should be relatively easy to have a retriever/parser pair that handles URLs like (newlines added for clarity):
…
-
This issue refers to the documentation [here](https://github.com/scrapinghub/portia/blob/master/docs/installation.rst)
```
You can run Portia with the command below:
docker run -i -t --rm -v :/ap…
```
-
The first parameter of `artoo.ajaxSpider` is,
> urlList array | function : the list of urls to request through ajax or, alternatively, a function taking as arguments the index of the iteration a…
-
shub's codebase contains a lot of deprecated commands/functions, and code that only serves backwards compatibility. I would like to aggressively remove this code to tidy up the codebase and improve th…
-
I'm currently trying to parse this string: **'23.18.05'**
My expectation would be that results are only valid for a setting like 'DATE_ORDER': 'YDM' and especially never try to parse the middle number…
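To make the ambiguity concrete, here is a minimal sketch that checks which date orders accept the string under strict parsing. It uses only the standard library's `datetime.strptime`, not the parser discussed in this issue, so it says nothing about that library's actual behaviour. The helper name `valid_orders` and the two-digit-year assumption are hypothetical, introduced only for illustration.

```python
from datetime import date, datetime
from itertools import permutations
from typing import Dict

# Map each date-order letter to a strptime directive.
# Assumption: the year is written with two digits.
DIRECTIVES = {"Y": "%y", "M": "%m", "D": "%d"}


def valid_orders(text: str, sep: str = ".") -> Dict[str, date]:
    """Return every Y/M/D ordering under which `text` parses strictly."""
    results = {}
    for order in permutations("YMD"):
        fmt = sep.join(DIRECTIVES[letter] for letter in order)
        try:
            results["".join(order)] = datetime.strptime(text, fmt).date()
        except ValueError:
            # This ordering puts an out-of-range value in the month or day slot.
            continue
    return results


if __name__ == "__main__":
    for order, parsed in valid_orders("23.18.05").items():
        print(order, parsed.isoformat())
```

Any ordering that would place 18 or 23 in the month position is rejected, which is the kind of strictness the expectation above describes.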
-
[root@XXXXXX]# docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001
I guess the root cause is this: **nginx: unrecognized service**
Hoping for more help, thx
+ action=
+ shif…
-
Would parsing things like "yesterday" and "previous month" be something that's in the scope of this project?
LeonB updated
3 years ago
-
Hi there,
First of all, thanks for developing this code.
I'm having trouble with Scrapy and the JSON items. I got it to scrape the pages I wanted, and when I open the CSV file it only comes with …
-
Hello,
In Swedish there are 2 specific ways that dates are written very commonly that I have not seen support for. Specifically dates can be written like "den 24:e December" and "201226" (2020-12-2…
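As a rough illustration of what handling these two forms could look like, here is a standard-library sketch. It is not the library's implementation. The function name `parse_swedish`, the `default_year` parameter (needed because the ordinal form carries no year), and the reading of the six-digit form as YYMMDD are all assumptions made for this example.

```python
import re
from datetime import date, datetime
from typing import Optional

# Swedish month names, mapped to month numbers 1..12.
SWEDISH_MONTHS = {
    name: number
    for number, name in enumerate(
        ["januari", "februari", "mars", "april", "maj", "juni", "juli",
         "augusti", "september", "oktober", "november", "december"],
        start=1,
    )
}

# Matches e.g. "den 24:e December" or "1:a maj" (ordinal suffix ":a" or ":e").
ORDINAL_RE = re.compile(r"^(?:den\s+)?(\d{1,2}):[ae]\s+([a-zåäö]+)$", re.IGNORECASE)


def parse_swedish(text: str, default_year: int) -> Optional[date]:
    """Parse a compact six-digit date or an ordinal day plus month name."""
    text = text.strip()
    if re.fullmatch(r"\d{6}", text):
        # Assumption: six digits mean two-digit year, month, day.
        try:
            return datetime.strptime(text, "%y%m%d").date()
        except ValueError:
            return None
    match = ORDINAL_RE.match(text)
    if match is None:
        return None
    month = SWEDISH_MONTHS.get(match.group(2).lower())
    if month is None:
        return None
    try:
        return date(default_year, month, int(match.group(1)))
    except ValueError:
        # Day out of range for the month, e.g. "31:a april".
        return None


if __name__ == "__main__":
    print(parse_swedish("201226", default_year=2020))
    print(parse_swedish("den 24:e December", default_year=2020))
```

A six-digit string is inherently ambiguous with other compact formats, so a real parser would likely need a language or format hint before applying this rule.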