Open GoogleCodeExporter opened 8 years ago
Same result with the correct Wikipedia page:
http://ja.wikipedia.org/wiki/%E3%83%A1%E3%82%A4%E3%83%B3%E3%83%9A%E3%83%BC%E3%82%B8
Original comment by esync...@googlemail.com
on 15 Feb 2012 at 4:58
Okay, some steps further. The plugin contains this method:

    public void addIndexBackendOptions(Configuration conf)
    {
        LuceneWriter.addFieldOptions("lang", LuceneWriter.STORE.YES,
                LuceneWriter.INDEX.UNTOKENIZED, conf);
    }

There is no LuceneWriter any more in Nutch 1.4, so this does not compile.
Original comment by esync...@googlemail.com
on 16 Feb 2012 at 12:15
Okay, I am reworking the plugin at the moment. But now I get a "no features in text"
exception. Do you provide a build.xml for your Java files?
Original comment by esync...@googlemail.com
on 16 Feb 2012 at 4:46
Thanks, I hadn't checked Nutch's news.
From reading some of the Nutch 1.4 documentation, it seems the current Nutch leaves
indexing to Solr.
http://wiki.apache.org/nutch/FrontPage#Tutorials
http://wiki.apache.org/nutch/bin/nutch%20solrindex
And Solr 3.5 already bundles our library as its language identifier, so I'm
afraid my plugin is no longer necessary...
> okay, iam reworking the plugin atm. But now i get a "no features in text"
> exception.
The exception is thrown when the input text has no usable features for the specified
profiles (i.e. letters of an alphabet, kanji and so on).
Are there some pages without a body?
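
One way to avoid the exception for such pages might be to check the text before handing it to the detector. The following is only a rough sketch: LangGuard and hasDetectableText are hypothetical names, not part of the library, and "contains at least one letter" is an assumed approximation of "has usable features", not the library's actual feature extraction.

```java
public class LangGuard {

    // Hypothetical pre-check (an assumption, not the library's own logic):
    // treat text as detectable only if it contains at least one Unicode
    // letter, which covers alphabets, kana and kanji alike.
    public static boolean hasDetectableText(String text) {
        if (text == null) {
            return false;
        }
        return text.codePoints().anyMatch(Character::isLetter);
    }

    public static void main(String[] args) {
        String[] samples = {
            "",                     // page without a body
            "   \n",                // whitespace only
            "12345 -- !!",          // digits and punctuation only
            "Hello world",          // Latin letters
            "\u30e1\u30a4\u30f3"    // katakana, written as Unicode escapes
        };
        for (String s : samples) {
            // Only texts that pass the check would be given to the detector;
            // the others could be skipped or indexed without a "lang" field.
            System.out.println(hasDetectableText(s));
        }
    }
}
```

A plugin could call such a check in its indexing filter and skip language detection when it returns false. Catching the detector's exception and continuing without a language value would be an alternative.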
Original comment by nakatani.shuyo
on 17 Feb 2012 at 10:26
Are you still reworking the plugin? We're interested in using the plugin in
Nutch as opposed to Solr.
Thanks.
Original comment by jamescch...@gmail.com
on 18 May 2012 at 5:29
Original issue reported on code.google.com by
esync...@googlemail.com
on 15 Feb 2012 at 4:39