-
This issue is for discussion/documentation of crawling site maps for story URL discovery.
I'm creating it in the rss-fetcher repo, since I think:
1. We will likely want the result to feed into the…
-
**Describe the bug**
The 404 page should not be indexed by search engines. Per https://developers.google.com/search/docs/crawling-indexing/block-indexing, this can be done by adding a `noindex` meta…
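For illustration, here is a minimal Python sketch of a 404 response that opts out of indexing. The `render_not_found` helper and its return shape are hypothetical and not taken from the project in question. The `noindex` robots meta tag and the `X-Robots-Tag` response header are the two mechanisms described in the linked Google documentation.

```python
from html import escape

# Robots meta tag that asks crawlers not to index the page.
NOINDEX_META = '<meta name="robots" content="noindex">'


def render_not_found(path: str):
    """Return (status, headers, body) for a 404 page marked noindex.

    Hypothetical helper: a real framework would have its own response type.
    """
    body = (
        "<!DOCTYPE html>\n"
        "<html><head>"
        f"{NOINDEX_META}"
        "<title>Page not found</title></head>"
        # Escape the path so it cannot inject markup into the error page.
        f"<body><h1>404</h1><p>No page at {escape(path)}</p></body></html>"
    )
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Header equivalent of the meta tag; also works for non-HTML responses.
        "X-Robots-Tag": "noindex",
    }
    return 404, headers, body
```

Sending both the meta tag and the header is redundant but harmless. Either one alone should be enough for crawlers that honor `noindex`.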
-
This is an open issue where you can comment and add resources that might come in handy for NaNoGenMo.
There are already a ton of resources on [the old resources thread](https://github.com/dariusk/NaN…
-
Hi,
The CSV (pasted below) is handled incorrectly in some places.
It seems that when a number like 2013 or 2012 appears in a cell value, it is treated as a date.
It is a bit aggressive an…
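As a rough sketch of a less aggressive approach (not the tool's actual detection logic, which is not shown here), a cell could be treated as a date only when it matches a complete date format, so that a bare number such as 2013 stays a number. The `infer_cell` function and the `DATE_FORMATS` list are assumptions made for illustration.

```python
from datetime import date, datetime
from typing import Union

# Assumed set of accepted date formats; a real importer would make this configurable.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def infer_cell(value: str) -> Union[int, date, str]:
    """Guess the type of a CSV cell without treating bare years as dates."""
    text = value.strip()
    # A run of ASCII digits is kept as an integer, even if it looks like a year.
    if text.isascii() and text.isdigit():
        return int(text)
    # Only a full match against one of the date formats counts as a date.
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    # Anything else is left as plain text.
    return text
```

With this rule, `infer_cell("2013")` gives an integer, while `infer_cell("2013-05-01")` gives a `date`.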
-
Hi,
I think it would be interesting to have blocklist support in butter. (It is supported by webtorrent.)
If you don't already know, in short, a blocklist is a file containing lines of IP ranges…
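To make the idea concrete, here is a small Python sketch of loading and checking a blocklist. The line format is an assumption (one `start-end` address range per line, with `#` comments), since the exact format is not spelled out here, and `parse_blocklist` and `is_blocked` are hypothetical names, not butter or webtorrent APIs.

```python
import ipaddress


def parse_blocklist(text: str):
    """Parse lines of the assumed form 'start-end' into (start, end) address pairs."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        start_s, _, end_s = line.partition("-")
        try:
            start = ipaddress.ip_address(start_s.strip())
            end = ipaddress.ip_address(end_s.strip())
        except ValueError:
            continue  # ignore lines that are not valid address ranges
        if start.version == end.version and start <= end:
            ranges.append((start, end))
    return ranges


def is_blocked(addr: str, ranges) -> bool:
    """Return True if addr falls inside any parsed range."""
    ip = ipaddress.ip_address(addr)
    # Compare only addresses of the same IP version; mixed comparisons raise TypeError.
    return any(s.version == ip.version and s <= ip <= e for s, e in ranges)
```

A client would call `is_blocked` on each peer address before connecting. The linear scan is fine for small lists, while large lists would likely want sorted ranges and a binary search.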
-
I use some hacky custom versions of Selenium that don't play well with the latest versions of Chrome/ChromeDriver. Is there any way to specify exact older versions?
-
Hi, I am running multiple spiders concurrently, all of them scraping the same domain. I would like to be able to limit the download rate to this domain using the DOWNLOAD_DELAY scrapy setting.
The …
-
This is currently possible with the classic editor through the `tiny_mce_before_init` filter.
I know a decision was made to de-emphasize H1, which was great. However, it would be even better if ther…
-
I am running into an issue when trying to render HTTPS-based URLs. HTTP works fine. Also, Chrome in headless mode can deal with HTTPS and spits out the DOM, for example when I run: `chrome --headless --disable-gp…
-
I have this trackerless magnet. When I add it in Transmission, after a couple of minutes the DHT search succeeds, the client connects with the other peers and begins to download.
When I load the same…