-
If the option `parseScriptTags` is set to `false`, simplecrawler crawls only the first page and then stops. I noticed this behavior with the content management system [Contao](https://contao.org/de/). It c…
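For reference, a minimal sketch of the setup that triggers this (requires the simplecrawler npm package; `http://example.com` stands in for the affected Contao site):

```javascript
// Minimal reproduction sketch; needs the simplecrawler package installed.
var Crawler = require("simplecrawler");

var crawler = new Crawler("http://example.com"); // placeholder for the Contao site
crawler.parseScriptTags = false; // with this set, only the first page is fetched

crawler.on("fetchcomplete", function (queueItem) {
  console.log("Fetched:", queueItem.url);
});

crawler.start();
```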
-
The current implementation assumes that a URL is legal only if the sitemap URL is a substring of the URL. This doesn't hold for some websites, such as nytimes.com, where the sitemaps are actually on …
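To illustrate the problem, here is a sketch of the substring rule against hypothetical nytimes-style URLs, contrasted with a domain comparison that would accept them:

```javascript
// Substring rule described above: a crawled URL is "legal" only if the
// sitemap URL is a substring of it. URLs below are illustrative, not real.
const sitemapUrl = "https://archive.nytimes.com/sitemaps/"; // sitemap on a subdomain
const pageUrl = "https://www.nytimes.com/section/world";    // page listed in that sitemap

const substringLegal = pageUrl.includes(sitemapUrl);
console.log(substringLegal); // false: the page is wrongly rejected

// Comparing the registrable domain instead accepts it. (Naive two-label
// split for the sketch; a real fix would consult a public-suffix list.)
const parentDomain = (u) => new URL(u).hostname.split(".").slice(-2).join(".");
const domainLegal = parentDomain(pageUrl) === parentDomain(sitemapUrl);
console.log(domainLegal); // true
```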
-
After running a crawler with `3` and just one URL, I analysed the log and noticed that several URLs are processed multiple times via the events: `DOCUMENT_FETCHED, CREATED_ROBOTS_META, URLS_EXTRAC…
-
I'm not sure whether this can be done with the crawler. The requirement itself is simple: I need to process several URLs (say 100 or more), each with its own filter.
If I've understood your crawler's behaviour, the fol…
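One way to sketch the requirement (hypothetical names, not the crawler's real API): pair each seed URL with its own filter function and apply that filter to whatever the crawl of that URL discovers.

```javascript
// Hypothetical sketch: each seed URL carries its own filter.
const jobs = [
  { url: "http://example.com/news", filter: (link) => link.includes("/news/") },
  { url: "http://example.org/blog", filter: (link) => link.endsWith(".html") },
];

// Stand-in for a real crawl; returns the links "discovered" on the page.
function crawl(url) {
  return [
    "http://example.com/news/world",
    "http://example.com/contact",
    "http://example.org/blog/post.html",
  ];
}

const results = jobs.map(({ url, filter }) => ({
  url,
  kept: crawl(url).filter(filter),
}));
console.log(results);
```

Each seed here keeps only the links its own filter admits, which is the per-URL behaviour the request describes.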
-
`ucss -h http://www.host.com -c http://www.host.com/path/to.css` crawls the page and outputs the URLs as expected. However, there are links to subdomains and to YouTube, so this option is not suitable.
`u…
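As a workaround sketch (not a ucss option), the emitted URLs could be post-filtered to a single exact host; `www.host.com` here is the placeholder host from the commands above:

```javascript
// Post-filter crawler output to one exact host, dropping subdomains and
// third-party links such as YouTube. Node's global URL class does the parsing.
const found = [
  "http://www.host.com/page",
  "http://sub.host.com/other",
  "https://www.youtube.com/watch?v=abc",
];

const sameHost = found.filter((u) => new URL(u).host === "www.host.com");
console.log(sameHost); // keeps only http://www.host.com/page
```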
-
-
It seems that every time Varnish is restarted, the sessions are lost.
Customer shopping carts are lost as a result of this.
Is this normal behaviour for Varnish?
Petce updated
9 years ago
-
If we work with several people we need an additional staging site to test commits, and to allow other to see whether the proposed change solves the issue before pushing it to the production site.
ifrik updated
9 years ago
-
Function `_isInternalDecisionMaker` falsely detects that a link is external
```
protected Func<Uri, Uri, bool> _isInternalDecisionMaker = (uriInQuestion, rootUri) => uriInQuestion.Authority == rootUri.Authority;
```…
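The failure mode is easy to reproduce outside the library: `Uri.Authority` is host plus port, so any host difference (for example `www.` vs the bare domain, or an explicit port) makes the link look external. A JavaScript sketch of the same comparison, using `URL.host` (also host + port):

```javascript
// Mirror of the Authority comparison above, using URL.host.
const isInternal = (uriInQuestion, rootUri) =>
  new URL(uriInQuestion).host === new URL(rootUri).host;

console.log(isInternal("http://example.com/a", "http://www.example.com/"));      // false: flagged external
console.log(isInternal("http://www.example.com:8080/a", "http://www.example.com/")); // false: port differs
console.log(isInternal("http://www.example.com/a", "http://www.example.com/"));  // true
```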
-
Crawler: Add support for storing crawled docs into MongoDB.
The collection data format should look something like this, based on what we crawl and what the UI needs:
- title
- text
- articleDate…
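A minimal sketch of one such document, using only the fields named above (the list is truncated, so this is not exhaustive, and the field values are placeholders):

```javascript
// Sketch of the crawled-document shape; fields taken from the list above.
const doc = {
  title: "Example article title",              // placeholder
  text: "Full extracted article text…",        // placeholder
  articleDate: new Date("2016-01-01T00:00:00Z"), // placeholder date
};

// With the official Node MongoDB driver this would be stored roughly via:
//   db.collection("articles").insertOne(doc);
console.log(Object.keys(doc));
```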