-
Based on this page, http://en.wikipedia.org/wiki/URL_normalization, I was wondering whether we should update BasicURLNormalizer to implement the first two sections.
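
To make the proposal a bit more concrete, here is a minimal sketch of the kind of semantics-preserving steps RFC 3986 describes: lowercasing the scheme and host, dropping the default port, removing dot segments, treating an empty path as `/`, uppercasing percent-escapes, and decoding escaped unreserved characters. It is not BasicURLNormalizer's code. The class and method names (`UrlNormalizationSketch`, `normalize`, `fixEscapes`) are hypothetical, and it assumes an absolute `http`/`https` URL that `java.net.URI` can parse.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical sketch, not part of Nutch.
class UrlNormalizationSketch {

    static String normalize(String url) {
        // URI.normalize() removes "." segments and "segment/.." pairs from the path.
        URI uri = URI.create(url).normalize();
        if (uri.getScheme() == null || uri.getHost() == null) {
            throw new IllegalArgumentException("expected an absolute URL with a host: " + url);
        }

        // Scheme and host are case-insensitive, so lowercase them.
        String scheme = uri.getScheme().toLowerCase(Locale.ROOT);
        String host = uri.getHost().toLowerCase(Locale.ROOT);

        // Drop the port when it is the default for the scheme.
        int port = uri.getPort();
        boolean defaultPort = (scheme.equals("http") && port == 80)
                || (scheme.equals("https") && port == 443);

        // An empty path is equivalent to "/".
        String path = uri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }

        StringBuilder sb = new StringBuilder(scheme).append("://");
        if (uri.getRawUserInfo() != null) {
            sb.append(uri.getRawUserInfo()).append('@');
        }
        sb.append(host);
        if (port != -1 && !defaultPort) {
            sb.append(':').append(port);
        }
        sb.append(fixEscapes(path));
        if (uri.getRawQuery() != null) {
            sb.append('?').append(fixEscapes(uri.getRawQuery()));
        }
        if (uri.getRawFragment() != null) {
            sb.append('#').append(uri.getRawFragment());
        }
        return sb.toString();
    }

    // Uppercases the hex digits of percent-escapes and decodes escapes
    // that stand for unreserved characters (letters, digits, "-", ".", "_", "~").
    private static String fixEscapes(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                String hex = s.substring(i + 1, i + 3);
                char decoded = (char) Integer.parseInt(hex, 16);
                if (isUnreserved(decoded)) {
                    out.append(decoded);
                } else {
                    out.append('%').append(hex.toUpperCase(Locale.ROOT));
                }
                i += 2;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    private static boolean isUnreserved(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }
}
```

With this sketch, `HTTP://Example.COM:80/a/./b/../c%7euser?q=%3a` would come out as `http://example.com/a/c~user?q=%3A`. Steps that can change what a URL points to (sorting query parameters, stripping `www`, and so on) are deliberately left out.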
-
- [ ] hdfs
- [ ] hbase
- [ ] cassandra
- [ ] build a second index based on a different backend (I guess it's only necessary for hdfs)
- [ ] test query speed
-
Analyze several completed projects that use Docker, and take them as inspiration and examples for building my own.
-
I'm trying to use the plugin to build an image but I found a "bug" when using this kind of configuration:
```
${project.basedir}/src/app/
src/app/
```
The task fails with the following error…
-
There are some Nutch-based projects that search for spatial information, such as [BCube](https://github.com/b-cube/nutch-crawler). We need to find a few of them and see exactly which plugins they are using…
-
```
See "Order of precedence for group-member records" section at the end of
https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
```
Original issue reported on code.google…
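
As a rough illustration of the rule in that section (among the matching allow/disallow lines, the one with the longest path wins), here is a minimal sketch. The names `RobotsPrecedenceSketch`, `Rule`, and `isAllowed` are hypothetical. It assumes plain prefix matching with no wildcards, and that a tie in length goes to `allow`; both are simplifications made for this example.

```java
import java.util.List;

// Hypothetical sketch of longest-match precedence within one robots.txt group.
class RobotsPrecedenceSketch {

    // One allow/disallow line of a group.
    static final class Rule {
        final boolean allow;
        final String path;

        Rule(boolean allow, String path) {
            this.allow = allow;
            this.path = path;
        }
    }

    // Returns true when urlPath may be fetched under the given rules.
    static boolean isAllowed(List<Rule> rules, String urlPath) {
        Rule best = null;
        for (Rule r : rules) {
            // An empty path matches nothing; skip rules that are not a prefix of the URL path.
            if (r.path.isEmpty() || !urlPath.startsWith(r.path)) {
                continue;
            }
            boolean longer = best == null || r.path.length() > best.path.length();
            // Assumption for this sketch: equal-length ties go to the allow rule.
            boolean tieForAllow = best != null
                    && r.path.length() == best.path.length() && r.allow;
            if (longer || tieForAllow) {
                best = r;
            }
        }
        // No matching rule means the path is allowed.
        return best == null || best.allow;
    }
}
```

For a group containing `Disallow: /folder` and `Allow: /folder/page`, this treats `/folder/page.html` as allowed and `/folder/other` as disallowed, regardless of the order in which the two lines appear.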
-
```
2015-10-29T15:38:07.623+0000 c.d.s.c.b.SimpleFetcherBolt [ERROR] Exception while fetching http://www.quanjing.com/search.aspx?q=top-651451||1|60|1|2||||&Fr=4
java.lang.IllegalArgumentException: Il…
```
-
Hi Folks,
I am using Jest 2.0.0 within my application. The code can be seen in the [following patch](https://github.com/lewismc/memex_es/blob/master/memex_cdr_plugin.patch) particularly within the [fo…
-
We should start integrating Nutch's new REST API:
https://wiki.apache.org/nutch/Nutch_1.X_RESTAPI
This can be done the same way I did it with [Tika-Python](http://github.com/chrismattmann/tika-python/).
-
=== This issue was migrated from JIRA ===
Type: Bug
Priority: Major
Status: Open
Resolution: UNRESOLVED
Reported by: buckett
Assigned to: ROME Jira Lead
Created: Mon Apr 20 10:47:11 CEST 2009
Updated:…