-
Some OPeNDAP services require credentials. Users shouldn't have to place these in the code. Instead, they should be able to put them in a file, and the code should extract the username and password from th…
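
As a minimal sketch of the idea, assuming the credentials live in a netrc-style file (the issue does not say which format is intended), Python's standard `netrc` module can look up the username and password by host name. The function name `load_credentials` and the example host are hypothetical.

```python
import netrc
from urllib.parse import urlparse


def load_credentials(service_url, netrc_path):
    """Return (username, password) for the host in service_url, or None.

    Hypothetical helper: assumes a netrc-style credentials file.
    """
    host = urlparse(service_url).hostname
    # authenticators() returns (login, account, password), or None when the
    # file has neither an entry for this host nor a default entry.
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password
```

A file containing `machine opendap.example.org login alice password s3cret` would then supply the credentials for any URL on that host, keeping them out of the code.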
-
Hi,
I've activated the offsite middleware in my project.
It works when I add the allowed_domains property to my spider.
But I have transitioned to loading my start URLs from a seed list using the Fi…
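
One possible approach, sketched here under the assumption that the seed list is available as plain URL strings, is to derive the allowed domains from the seeds themselves. The helper name `domains_from_seeds` is hypothetical and uses only the standard library; where the result gets assigned to `allowed_domains` depends on how the seed list is loaded.

```python
from urllib.parse import urlparse


def domains_from_seeds(seed_urls):
    """Collect the distinct host names found in a list of seed URLs."""
    domains = set()
    for url in seed_urls:
        host = urlparse(url.strip()).hostname
        # Lines without a scheme or host (blank lines, comments) yield None.
        if host:
            domains.add(host)
    return sorted(domains)
```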
-
http://extras.evolution-cms.com/packages/core/security-fix.html
A quick solution for the fix.
-
In researching a bug in Bixo, I realized that the SimpleHttpFetcher needs to be serializable so that we can easily use it with Hadoop jobs. But that's an odd dependency, and in researching how we use …
-
Why are the URL, parsed text, and content not indexed in Solr when the crawldb is updated, as the Sparkler flow diagram shows happening on update?
-
**Elasticsearch version**: 2.3.2
**JVM version**: 1.7
**OS version**: Windows 8
**Description of the problem including expected versus actual behavior**: This was working well in Elasticsearch 2…
-
```
(memex-explorer)cdoig@066-cdoig:~$ crawl ~/work/memex/memex/court_docs/raw_data_no_comments/seed_dir/ ~/work/memex/memex/court_docs/crawl_test 4
JAVA_HOME is set to '/Library/Java/JavaVirtualMach…
-
```
We have loads of fine-grained methods available to us via FetchedResult.
I think it would be really cool, however, if we were able to print a report of
the FetchedResult including some timing statis…
-
with some normalisation e.g. to remove charset or variants
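
A minimal sketch of such normalisation, assuming the input is a raw `Content-Type` header value: it drops parameters such as `charset` and lower-cases the media type. The function name `normalise_mime_type` is hypothetical, and collapsing "variants" is not attempted because the issue does not define them.

```python
def normalise_mime_type(content_type):
    """Lower-case a Content-Type value and drop parameters such as charset."""
    # Everything after the first ';' is a parameter (charset, boundary, ...).
    media_type = content_type.split(";", 1)[0]
    return media_type.strip().lower()
```

With this, `Text/HTML; charset=UTF-8` and `text/html` compare as equal.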
-
compile:
[echo] Compiling plugin: parse-s2jh
[javac] Compiling 7 source files to /home/lq/nutch-ajax/apache-nutch-2.3/build/parse-s2jh/classes
[javac] warning: [options] bootstrap class p…