datacleaner / extension_elasticsearch

DataCleaner extension for ElasticSearch
GNU Lesser General Public License v3.0

index name and document name error handling #3

Open Qwin opened 10 years ago

Qwin commented 10 years ago

Using capital letters in the index name gives the following exception:

org.elasticsearch.indices.InvalidIndexNameException: [MYCAPITALINDEXNAME] Invalid index name [MYCAPITALINDEXNAME], must be lowercase
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validateIndexName(MetaDataCreateIndexService.java:167)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validate(MetaDataCreateIndexService.java:465)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.access$100(MetaDataCreateIndexService.java:85)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:215)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:308)
    at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:680)

The same goes for special characters:

[2014-04-02 12:58:16,146][DEBUG][action.admin.indices.create] [Acrobat] [miauw@#$] failed to create
org.elasticsearch.indices.InvalidIndexNameException: [miauw@#$] Invalid index name [miauw@#$], must not contain '#'
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validateIndexName(MetaDataCreateIndexService.java:161)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validate(MetaDataCreateIndexService.java:465)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.access$100(MetaDataCreateIndexService.java:85)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$2.execute(MetaDataCreateIndexService.java:215)
    at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:308)

This is a small issue, but it could be very confusing for the user, because the error only shows up in the Elasticsearch console and not in DataCleaner. DataCleaner should at least handle this error and report it to the user.

Kind regards, Robert

kaspersorensen commented 10 years ago

Agreed. I was not aware of these index name restrictions.

We can probably just add a @Validate method to the indexing analyzer.
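
A minimal sketch of what such a check could look like, written as plain Java so it does not depend on any DataCleaner or Elasticsearch class. The class name `IndexNameValidator` and its methods are hypothetical. Only the lowercase rule and the '#' rule are confirmed by the exceptions above; the other forbidden characters are an assumption and should be checked against the Elasticsearch version in use.

```java
import java.util.Locale;

/** Hypothetical helper; not part of DataCleaner or Elasticsearch. */
public class IndexNameValidator {

    // '#' is confirmed by the exception above. The remaining characters are
    // an ASSUMPTION about what Elasticsearch rejects; verify for your version.
    private static final String FORBIDDEN_CHARS = "#\\/*?\"<>| ,";

    /**
     * Returns null if the name passes the checks, otherwise a message
     * that can be shown to the user.
     */
    public static String findProblem(String indexName) {
        if (indexName == null || indexName.isEmpty()) {
            return "Index name must not be empty";
        }
        if (!indexName.equals(indexName.toLowerCase(Locale.ROOT))) {
            return "Index name must be lowercase: " + indexName;
        }
        for (int i = 0; i < indexName.length(); i++) {
            char c = indexName.charAt(i);
            if (FORBIDDEN_CHARS.indexOf(c) >= 0) {
                return "Index name must not contain '" + c + "': " + indexName;
            }
        }
        return null;
    }

    /** Throws if the name is invalid, so the error surfaces before indexing starts. */
    public static void validate(String indexName) {
        String problem = findProblem(indexName);
        if (problem != null) {
            throw new IllegalStateException(problem);
        }
    }
}
```

A validation method in the indexing analyzer could then call `IndexNameValidator.validate(indexName)`, so that names like `MYCAPITALINDEXNAME` or `miauw@#$` are rejected inside DataCleaner instead of only showing up in the Elasticsearch log.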