EUDAT-B2SHARE / b2share

B2SHARE software for the EUDAT CDI services.
GNU General Public License v2.0

Elasticsearch field `country` in `file-download` mapping overlaps with other type #1840

Open defnull opened 3 years ago

defnull commented 3 years ago

For a freshly installed instance (following the docker-compose-based deployment documentation), Elasticsearch logs the following error messages multiple times a second:

elasticsearch_1  | [2021-03-03 14:47:26,337][WARN ][cluster.action.shard     ] [Holly] [events-stats-file-download-2021-03][4] received shard failed for target shard [[events-stats-file-download-2021-03][4], node[bqEa69qYTwCwxaNwyyihOA], [P], v[1040], s[STARTED], a[id=FTNHGGHDRbWo81cnWYWr3A]], indexUUID [7L3TMWTRSNm5bxWsKpdBZQ], message [master [{Holly}{bqEa69qYTwCwxaNwyyihOA}{}{}] marked shard as started, but shard has not been created, mark shard as failed]
elasticsearch_1  | [2021-03-03 14:47:26,344][WARN ][indices.cluster          ] [Holly] [events-stats-file-download-2021-02] failed to add mapping [file-download], source [{"file-download":{"_all":{"enabled":false},"_source":{"enabled":false},"dynamic_templates":[{"date_fields":{"mapping":{"format":"strict_date_hour_minute_second","type":"date"},"match_mapping_type":"date"}}],"date_detection":false,"properties":{"bucket_id":{"type":"string","index":"not_analyzed"},"collection":{"type":"string","index":"not_analyzed"},"country":{"type":"string","index":"not_analyzed"},"file_id":{"type":"string","index":"not_analyzed"},"file_key":{"type":"string","index":"not_analyzed"},"timestamp":{"type":"date","format":"strict_date_hour_minute_second"},"unique_id":{"type":"string","index":"not_analyzed"},"visitor_id":{"type":"string","index":"not_analyzed"}}}}]
elasticsearch_1  | java.lang.IllegalArgumentException: Field [country] is defined as a field in mapping [file-download] but this name is already used for an object in other types

I'm not sure where this comes from (b2share or invenio), but I guess it is triggered by the invenio_stats.tasks.process_events Celery task. Any ideas?

defnull commented 3 years ago

Hmm strange. The events-stats-file-download-* index contains a stats-file-download type with a complex country field mapping (containing geoname_id, iso_code and names sub-fields). I cannot find where this mapping is created or defined.

This conflicts with an index template called file-download-v1, which invenio-stats installs into the ES cluster. That template defines country as a simple string. Two types with conflicting mappings for the same field cannot live in the same index, hence the error from ES.
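To make the conflict easier to spot, here is a rough sketch of a check that can be run against the `mappings` section of an index (for example, the JSON returned by the index's `_mapping` endpoint). It is not part of b2share or invenio-stats; the helper names `field_kind` and `find_type_conflicts` are made up for illustration. The sample mappings are abbreviated and partly guessed: the sub-field types of `country` and the `en` key under `names` are assumptions, since only the sub-field names are known.

```python
def field_kind(definition):
    """Classify a mapping entry as 'object' (has sub-fields) or 'field' (leaf)."""
    if "properties" in definition or definition.get("type") == "object":
        return "object"
    return "field"


def find_type_conflicts(index_mappings):
    """Return {field: {type_name: kind}} for fields whose kind differs across types.

    index_mappings is the per-index "mappings" dict:
    {type_name: {"properties": {field_name: definition}}}.
    Only top-level fields are compared.
    """
    seen = {}
    for type_name, type_mapping in index_mappings.items():
        for field, definition in type_mapping.get("properties", {}).items():
            seen.setdefault(field, {})[type_name] = field_kind(definition)
    return {
        field: kinds
        for field, kinds in seen.items()
        if len(set(kinds.values())) > 1
    }


# Hypothetical, abbreviated mappings modelled on the two types described above.
mappings = {
    # From the file-download-v1 template: country is a plain string.
    "file-download": {
        "properties": {
            "country": {"type": "string", "index": "not_analyzed"},
            "file_id": {"type": "string", "index": "not_analyzed"},
        }
    },
    # Found in the existing index: country is an object with sub-fields.
    # Sub-field types below are assumed, not taken from the real index.
    "stats-file-download": {
        "properties": {
            "country": {
                "properties": {
                    "geoname_id": {"type": "long"},
                    "iso_code": {"type": "string"},
                    "names": {"properties": {"en": {"type": "string"}}},
                }
            },
            "file_id": {"type": "string", "index": "not_analyzed"},
        }
    },
}

conflicts = find_type_conflicts(mappings)
print(conflicts)
```

With the sample data, only `country` is reported, because it is a leaf field in one type and an object in the other, while `file_id` is a string in both.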