-
In the Nutch Indexer, we detected that the Tokenizer gives invalid tokens when the word appears at the end of the document.
-
I created a crawl, Nutch Crawl, added a seed file, gave it a name, selected Nutch. Then I tried to delete the crawl and clicked Yes and got:
```
TypeError
TypeError: get_crawl() takes exactly 2 argum…
-
In commit 2b83b7ecebf55f27146036e62aaa5e97a59c8f33, the nextToken method was refactored. This was tested on the Nutch Indexer, but the same work was not done on the D&G branch.
-
```
What steps will reproduce the problem?
Running the crawler sometimes crashes the JVM. I crawl around 10 web sites
regularly with between 1K and 50K pages. This happens randomly but happens very …
-
```
Good evening, I am writing about that help with the gateway. It concerns the poslatSMS gateway. It has
already been written about here, and it is really reliable now.
When sending POST data, a redirect occurs (code 302). The SMS is always sen…
-
```
Method call returns "false":
DEBUG - SiteMapParser - 169.
url="http://www.classmates.com/sitemaps/publicprofile/publicprofile-sitemap-19990716-0000.xml.gz",lastMod=2009-10-21T00:…
-
I thought these files (schema.xml, solrconfig.xml) were located in home/nutch/conf. After editing the files, I reloaded collection1 in the Core Admin tab of the Solr admin panel, but I don't see ch…