Closed Lucas-Armand closed 2 years ago
If I try to access 'http://doweb.rio.rj.gov.br/ler_pdf.php?download=ok&edi_id=3864' (for example), I get only the "first page" of the Diário Oficial.
I get a very similar result for the other URLs.
I do not know if I'm doing something wrong or if something changed on the DO server, but I would appreciate some guidance.
Thanks!
Hi @Lucas-Armand,
From your messages I got the impression that you're reading the Scrapy output as if it were the crawler's results. Please correct me if I'm wrong.
The Scrapy output is just logging, meant to give you an idea of what's going on. The actual results are stored in PostgreSQL. Have you checked the database too?
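In case it helps, here is a minimal sketch of one way to look inside that database from Python. It only lists the tables in the `public` schema, since the table names are not given in this thread. The connection settings in `DB_SETTINGS` are hypothetical placeholders, not the project's real configuration, and the sketch assumes the third-party `psycopg2` package is installed.

```python
# Hypothetical connection settings (assumptions, not taken from the project);
# replace them with the values from your local setup.
DB_SETTINGS = {
    "dbname": "postgres",
    "user": "postgres",
    "password": "postgres",
    "host": "localhost",
    "port": 5432,
}

# Standard information_schema query: lists every table in the public schema.
LIST_TABLES_SQL = """
    SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = 'public'
    ORDER BY table_name
"""


def list_tables(connection):
    """Return the names of all tables in the public schema."""
    with connection.cursor() as cursor:
        cursor.execute(LIST_TABLES_SQL)
        return [row[0] for row in cursor.fetchall()]


def main():
    # Imported here so the helper above can be used without psycopg2.
    import psycopg2

    connection = psycopg2.connect(**DB_SETTINGS)
    try:
        for name in list_tables(connection):
            print(name)
    finally:
        connection.close()
```

Calling `main()` after a spider run prints the table names; from there you can `SELECT` from whichever table holds the scraped items.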
Considering that we changed how the project is structured and how to run spiders locally, this can be closed.
Hello guys.
I'm having trouble understanding the crawler results for Rio de Janeiro.
When I test the Rio de Janeiro crawler (following the instructions in CONTRIBUTING.md):
The result seems to be wrong:
[...]
When I run the Porto Alegre crawler (for comparison), I get an intelligible result:
result is: [...]