Open lukas-vlcek opened 7 years ago
The problem is that the README gives a wrong (or possibly outdated) settings example, so the settings and mappings are parsed incorrectly and have no effect on ES. If you follow the author's example, you will find that your ES index is configured incorrectly, as shown below:
curl -XGET 'http://localhost:9200/_all/_settings?pretty'
...
"wiki" : {
  "settings" : {
    "index" : {
      "settings" : {
        "index" : {
          "analysis" : {
            "analyzer" : {
...
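That is wrong: the "settings" and "index" wrappers appear twice. The JSON passed to --settings should contain the index settings directly, without those outer wrappers, like this: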
java -DentityExpansionLimit=2147480000 -DtotalEntitySizeLimit=2147480000 -Djdk.xml.totalEntitySizeLimit=2147480000 -Xmx2g -jar stream2es wiki --log debug --source 'enwiki-20170401-pages-articles.xml.bz2' --settings '
{
  "number_of_shards" : 1,
  "analysis" : {
    "analyzer" : {
      "default" : {
        "type" : "snowball",
        "language" : "English"
      }
    }
  }
}'
The unusual JVM options work around a separate problem (issue 65). Hope this helps.
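To check whether an index ended up with the doubled wrapper, the parsed _settings response can be inspected for a stray "settings" key under "index". This is only a minimal sketch: find_double_nesting is a hypothetical helper, and the two sample dictionaries are assumptions modelled on the output quoted above (elided parts left empty), not real server responses.

```python
import json


def find_double_nesting(all_settings):
    """Return the names of indices whose "index" block contains a nested "settings" key.

    `all_settings` is the parsed JSON body of GET /_all/_settings.
    """
    bad = []
    for name, body in all_settings.items():
        index_block = body.get("settings", {}).get("index", {})
        # A correctly applied configuration has "analysis" etc. directly here;
        # a second "settings" level means the wrapper was stored as data.
        if "settings" in index_block:
            bad.append(name)
    return sorted(bad)


# Assumed shape of the misconfigured index (elided parts left empty).
broken = json.loads(
    '{"wiki": {"settings": {"index": {"settings": {"index": '
    '{"analysis": {"analyzer": {}}}}}}}}'
)

# Assumed shape after passing the unwrapped settings.
fixed = json.loads(
    '{"wiki": {"settings": {"index": {"number_of_shards": "1", '
    '"analysis": {"analyzer": {"default": {"type": "snowball", '
    '"language": "English"}}}}}}}'
)

print(find_double_nesting(broken))
print(find_double_nesting(fixed))
```

In practice the input would come from json.loads applied to the output of the curl command shown earlier.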
It seems that the --settings option is not applied. The following is a repro script for the wiki use case.
Relevant server log: