Closed: rustyx closed this issue 6 years ago
Elasticsearch treats a dot in a field name as an object path, so you have to make sure you are not sending non-object fields with dots in their names. You can do so using the RenameTagger in the Importer configuration section or, even simpler, you can tell the Elasticsearch Committer to replace the dots in field names with a value of your choice before sending your documents:
<dotReplacement>_</dotReplacement>
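As a rough illustration of the effect that setting has on field names, here is a minimal Python sketch. It is not the Committer's actual code; the `replace_dots` helper and the sample values are made up for this example.

```python
def replace_dots(fields, replacement="_"):
    """Return a copy of `fields` with every dot in a field name replaced."""
    return {name.replace(".", replacement): value for name, value in fields.items()}


# Hypothetical document fields, similar to what keepProbabilities produces.
doc = {
    "document.language": "en",
    "document.language.1.tag": "en",
    "document.language.1.probability": "0.99",
}

flat = replace_dots(doc)
for name in flat:
    print(name)
```

With no dots left in the names, Elasticsearch no longer has any object path to interpret, so the text field and the probability fields can coexist.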
I suggest you have a look at the Committer configuration page for more options. You may be interested in other settings such as jsonFieldsPattern and fixBadIds.
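To see why the dotted names collide, the sketch below expands dotted field names into nested objects, roughly the way an object path is interpreted. It is a simplified model, not Elasticsearch's mapping logic; `expand_dotted` and the sample values are hypothetical.

```python
def expand_dotted(fields):
    """Expand dotted field names into nested dicts, failing on conflicts."""
    root = {}
    for name, value in fields.items():
        *parents, leaf = name.split(".")
        node = root
        for part in parents:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                # A plain value already sits where an object is now required.
                raise ValueError(f"[{part}] holds a value, cannot be an object for [{name}]")
            node = child
        if isinstance(node.get(leaf), dict):
            # An object already sits where a plain value is now required.
            raise ValueError(f"[{name}] is already an object, cannot hold a value")
        node[leaf] = value
    return root


try:
    expand_dotted({
        "document.language": "en",
        "document.language.1.probability": "0.99",
    })
    conflict = False
except ValueError:
    conflict = True
```

Here `document.language` is first stored as a plain value, so it cannot also act as the parent object of `1.probability`, which is the same kind of clash the mapping error reports.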
Yes, there are various workarounds, but wouldn't it be easier to just use a different field name for probabilities? For example, document.languages.
It is also desirable to have the probabilities in a single field.
My current workaround looks like this:
<tagger class="${handler}.tagger.impl.MergeTagger">
  <merge toField="document.languages" singleValue="true" singleValueSeparator=",">
    <fromFields>document.language.1.tag,document.language.1.probability,
      document.language.2.tag,document.language.2.probability,
      document.language.3.tag,document.language.3.probability</fromFields>
  </merge>
</tagger>
<tagger class="${handler}.tagger.impl.DeleteTagger">
  <fromFieldsRegex>document\.language\..*</fromFieldsRegex>
</tagger>
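For readers who want to see what this pair of taggers ends up doing to a document's fields, here is an approximate Python sketch. It only mimics the intent of the configuration (join the listed fields into one value, then drop the fields matching the regex); it is not the Importer's implementation, and `merge_then_delete` and the sample values are made up for the example.

```python
import re


def merge_then_delete(fields, from_fields, to_field, delete_regex, separator=","):
    """Join the values of `from_fields` into `to_field`, then drop fields matching `delete_regex`."""
    merged = separator.join(fields[name] for name in from_fields if name in fields)
    result = dict(fields)
    result[to_field] = merged
    pattern = re.compile(delete_regex)
    # fullmatch: the whole field name has to match, as with a fields regex.
    return {name: value for name, value in result.items() if not pattern.fullmatch(name)}


# Hypothetical fields for a document with two detected languages.
doc = {
    "document.language": "en",
    "document.language.1.tag": "en",
    "document.language.1.probability": "0.9",
    "document.language.2.tag": "fr",
    "document.language.2.probability": "0.1",
}
sources = [
    "document.language.1.tag", "document.language.1.probability",
    "document.language.2.tag", "document.language.2.probability",
    "document.language.3.tag", "document.language.3.probability",
]

cleaned = merge_then_delete(doc, sources, "document.languages", r"document\.language\..*")
```

Note that the regex requires a dot after `language`, so neither `document.language` nor the new `document.languages` field is deleted.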
I will close this for now since I have a workaround.
With LanguageTagger's keepProbabilities="true" I'm unable to successfully index documents into Elasticsearch. Is there a way to do it? The mapping for document.language can be either text or object, not both. How can I configure the ES index to accept text values for both document.language and document.language.1.probability? Right now I'm getting:
"error": { "type": "mapper_parsing_exception", "reason": "Could not dynamically add mapping for field [document.language.1.probability]. Existing mapping for [document.language] must be of type object but found [text]." }