custom-crawler Search Results

1000+ results
for custom-crawler

magicpanda/crawler4j #164

setURL can crash and burn in the case of malformed URLs or w…

What steps will reproduce the problem? 1. Create a web page with a malformed URL (or a protocol like mailto:). 2. Run the crawler on said website. 3. Crash and burn at line 89 in WebURL.java - this…

GoogleCodeExporter updated 8 years ago
1
tosdr/tosback2 #19

Conflicting requirements for mime-types gem

I installed master/45995736 on OS X 10.8.5 today. I'm using ruby 1.9.3-p429 via rbenv, and when I try to run the crawler from the "rubycode" directory, I get a gem specification failure: `$ ruby m…`

irons updated 10 years ago
3
uniprot/enzymeportal #203

Results unrelated to query

This seems to happen randomly. Today I searched for _alzheimer_ and got results for _ehlers-danlos syndrome_. Last week I made a chemical structure search and got results for _cadasil_ (look at the lo…

rafael-alcantara updated 10 years ago
3
scrapy/scrapy #4043

Scrapy download middleware should pause on Connection loss …

## Summary We should make Scrapy downloader middleware pause when the Internet connection is lost, and wait until it is back to resume the downloader middleware. ## Motivation Currently, on a con…

royahsan updated 4 years ago
5
B3ST/B3 #12

Implement a theme fallback strategy

What happens when Google tries to crawl a site that uses a B3-based theme? What if a user's browser doesn't support JavaScript? The theme should be able to detect these situations and serve the user…

goblindegook updated 9 years ago
6
dillbyrne/random-agent-spoofer #369

Addition of Googlebot profile to circumnavigate internet pay…

Details on accessing web content behind paywalls... http://www.ghacks.net/2016/02/26/read-articles-behind-paywalls-by-masquerading-as-googlebot/ It references two addons, RefControl and User Agent S…

sueridgepipe updated 8 years ago
4
algolia/docsearch #1437

v3 Configuration problems

How do I configure it to jump to the local site? At present, my link is like this ![my](https://user-images.githubusercontent.com/23733037/179547780-624b0d92-4224-44a7-8ca1-c04b34c8f6c9.…

RayShineHub updated 2 years ago
6
Navaminavu/enavami #1

Norconex HTTP collector

Hi, I am trying to set up a new Norconex connector for a page. I am having trouble reading URLs in the page, like those in the index bar ![image](https://user-images.githubusercontent.com/29…

Navaminavu updated 7 years ago
1
openzim/zimit #414

flibusta.is: some pages are not downloaded

Hi, I downloaded https://flibusta.is using your Docker examples from the README, around 90 GB. And I see that some links of the same type are not fetched - they have absolute URLs, open Firefox on cli…

vitaly-zdanevich updated 3 weeks ago
12
dotnet/roslyn #22214

Document how to turn on diagnostic analyzers for a custom wo…

**Version Used**: 8f02e04893 **Steps to Reproduce**: 1. Have a custom workspace of type "Foo" 2. Try to enable diagnostic tagger for the documents in the workspace **Expected Behavior**: …

KirillOsenkov updated 6 years ago
1
