crawling-sites Search Results

1000+ results for crawling-sites

mediacloud/rss-fetcher #41

Investigate crawling sitemaps

This issue is for discussion/documentation of crawling site maps for story URL discovery. I'm creating it in the rss-fetcher repo, since I think: 1. We will likely want the result to feed into the…

philbudne updated 2 months ago
8

1bl4z3r/hermit-V2 #66

[BUG] - 404 page should include a `noindex` meta

**Describe the bug** 404 page should not be indexed by the search engine. Per https://developers.google.com/search/docs/crawling-indexing/block-indexing, this can be done by adding a `noindex` meta…

xuhdev updated 5 months ago
4

dariusk/NaNoGenMo-2014 #1

Resources

This is an open issue where you can comment and add resources that might come in handy for NaNoGenMo. There are already a ton of resources on [the old resources thread](https://github.com/dariusk/NaN…

dariusk updated 9 years ago
55

FrancoisMentec/OpenCompare2 #3

Aggressive conversion of strings into dates with CSV

Hi, The CSV (pasted below) is wrongly handled at some points. It seems that when a number like 2013 or 2012 appears in the cell value, then it is considered as a date. It is a bit aggressive an…

FAMILIAR-project updated 7 years ago
1

butterproject/butter-desktop #584

[feature] blocklist support

Hi, I think it will be interesting to have blocklist supported in butter. (It is supported by webtorrent) If you don't already know, In short, a blocklist is a file containing lines of IP ranges…

Persei08 updated 7 years ago
18

heroku/heroku-buildpack-chrome-for-testing #13

Anyway to specify exact versions?

I use some hacky custom versions of Selenium that don't play well with latest versions of chrome/chrome driver. Anyway to specify exact older versions?

jstoxrocky updated 3 months ago
11

scrapy/scrapyd #221

scrapy's download concurrency limits do not apply to paralle…

Hi, I am running multiple spiders concurrently, all of them scraping the same domain. I would like to be able to limit the download rate to this domain using the DOWNLOAD_DELAY scrapy setting. The …

joaqo updated 3 months ago
6

WordPress/gutenberg #15160

Add option to remove h1 from heading block

This is currently possible with the classic editor through the `tiny_mce_before_init` filter. I know a decision was made to deemphasize H1 which was great. However it would be even better if ther…

myleshyson updated 1 month ago
22

bosondata/chrome-prerender #32

https problem on FreeBSD

I am running into issue when trying to render https based URL's. http works fine. Also chrome in headless can deal with https and spits out DOM, for example when I run: `chrome --headless --disable-gp…

beyondcreed updated 6 years ago
53

rakshasa/rtorrent #759

DHT returns nonexistent peers

I have this trackerless magnet. When I add it in Transmission, after a couple of minutes the DHT search succeeds, the client connects with the other peers and begins to download. When I load the same…

ghost updated 5 years ago
39
