-
## Context
Currently, once a crawl is stopped, that's it! Users cannot pick up where they left off, which results in a few points of friction:
### Crawling Large Websites
When crawling a large…
-
### Description
Don't know if this would be more appropriate as a bug report or feature request, but I noticed that I am unable to fire a gun while crawling using gsit. Would be nice if this can someho…
ADM8 updated
4 months ago
-
Hello! Is it possible to crawl titles, not just URLs?
-
404s, mainly on French pages: https://www.google.com/webmasters/tools/crawl-errors?siteUrl=http://www.clarat.org/&utm_source=wnc_655201&utm_medium=gamma&utm_campaign=wnc_655201&utm_content=ms…
Twiek updated
6 years ago
-
Ideally, crawling would be as easy as grabbing an RSS feed or reading semantic HTML. Sadly, a lot of websites have neither. Sometimes there is an RSS feed but it is not linked in the meta headers.
…
-
I should use Twisted or gevent for async crawling
-
https://developers.google.com/webmasters/ajax-crawling/
http://help.yandex.ru/webmaster/robot-workings/ajax-indexing.xml
http://habrahabr.ru/post/144197/
-
Hello,
Lantern is a very useful tool for developers, and it helped me to identify many errors.
It always worked perfectly, until recently: it crawls only about 150 pages now (compared to thousands b…
ghost updated
4 years ago
-
Have DHT crawling as a source,
and/or possibly support this https://github.com/FlyersWeb/dhtbay
-
Hello,
It appears that crawling failed when trying to download the DOI. However, SciHub.st was able to download the DOI without any problem. It is likely that the issue is related to the craw…