mobz / elasticsearch-head

A web front end for an elastic search cluster
http://mobz.github.io/elasticsearch-head/

[Index] docs meaning? #425

freesinger closed this issue 5 years ago

freesinger commented 5 years ago

I am using Logstash to export data into ES, and I set document_id to avoid producing duplicate data. But I am confused by the docs number, which shows docs: 3,697 (4,377) in the head plugin. Since I set a unique key for document_id, shouldn't the two docs numbers be the same? Any help would be appreciated.

philipskokoh commented 5 years ago

That is possible if you have deleted docs that haven't been merged yet. The numbers come from 'docs' in the index stats API: https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-stats.html
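As a rough illustration of how the two numbers could relate, here is a minimal sketch that reads `docs.count` and `docs.deleted` from an index stats response. It assumes head shows the live count followed by count plus deleted in parentheses, which this thread does not confirm. The helper name `head_docs_label`, the index name `my-index`, and the `deleted` value are hypothetical, chosen only to reproduce the numbers from the question.

```python
def head_docs_label(stats, index_name):
    """Return (count, count + deleted) for the primaries of one index.

    `stats` is a dict in the shape of an index stats API response.
    """
    docs = stats["indices"][index_name]["primaries"]["docs"]
    live = docs["count"]        # documents currently visible to searches
    deleted = docs["deleted"]   # deleted/superseded copies not yet merged away
    return live, live + deleted


# Hand-written sample, not real cluster output. The index name and the
# "deleted" value are illustrative assumptions matching the question.
sample_stats = {
    "indices": {
        "my-index": {
            "primaries": {"docs": {"count": 3697, "deleted": 680}},
        }
    }
}

live, total = head_docs_label(sample_stats, "my-index")
print(f"docs: {live:,} ({total:,})")  # docs: 3,697 (4,377)
```

Under that assumption, the gap of 680 would be copies that are marked deleted but still present on disk.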

freesinger commented 5 years ago

When I didn't set document_id, the two docs numbers always stayed the same. However, Logstash would keep exporting duplicated data until I stopped it manually. Since there is no delete operation, I guess the numbers differ because some docs haven't been merged yet. I will check the export result when it's done, thank you!

philipskokoh commented 5 years ago

I think that if you replace an existing doc (index new data with an existing doc_id), Elasticsearch keeps both docs internally until the merge process runs. That explains why the numbers are different.
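The behaviour described here can be mimicked with a toy in-memory model. This is only a sketch of the counting logic, not how Elasticsearch or Lucene segments are actually implemented; the class `ToyIndex` and its methods are hypothetical names invented for the illustration.

```python
class ToyIndex:
    """Toy model of live vs. superseded documents (illustration only)."""

    def __init__(self):
        self.live = {}     # doc_id -> current document
        self.deleted = 0   # superseded copies that have not been merged away

    def index(self, doc_id, doc):
        # Re-indexing an existing id replaces the document; the old copy
        # is only marked as deleted until a merge removes it.
        if doc_id in self.live:
            self.deleted += 1
        self.live[doc_id] = doc

    def merge(self):
        # A merge discards the copies that were marked as deleted.
        self.deleted = 0

    def docs_label(self):
        count = len(self.live)
        return f"docs: {count} ({count + self.deleted})"


idx = ToyIndex()
idx.index("a", {"v": 1})
idx.index("b", {"v": 1})
idx.index("a", {"v": 2})   # same id: replaces "a", old copy counted as deleted
print(idx.docs_label())    # docs: 2 (3)
idx.merge()
print(idx.docs_label())    # docs: 2 (2)
```

In this model no explicit delete is ever issued, yet the two numbers diverge as soon as an id is reused, and converge again after a merge.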

freesinger commented 5 years ago

I found that I had set a non-unique key as doc_id, which may explain why the numbers are different. But after I set a fingerprint as the doc_id for every document, the docs number stays at docs: 1 (xxxxx), which really confuses me. I've opened an issue in the logstash repo here, which shows more detail.
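The thread does not say what caused `docs: 1 (xxxxx)`. One possibility, offered purely as an assumption, is that every event produced the same fingerprint, for example because the fingerprint was computed from a field that the events do not contain. The sketch below shows that effect with a plain SHA-1 hash. The function `fingerprint`, the field names, and the sample events are all hypothetical, and this is not the Logstash fingerprint filter itself.

```python
import hashlib
import json


def fingerprint(event, fields):
    """Hash the selected fields of an event into a hex id.

    Missing fields contribute None, so hashing only a field that no
    event has gives every event the same id.
    """
    key = json.dumps([event.get(f) for f in fields])
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


events = [
    {"user": "alice", "ts": 1},
    {"user": "bob", "ts": 2},
    {"user": "alice", "ts": 1},   # true duplicate of the first event
]

# Hashing fields that exist: duplicates collapse, distinct events stay apart.
good_ids = {fingerprint(e, ["user", "ts"]) for e in events}

# Hashing a field that no event has: every event collapses to one id.
bad_ids = {fingerprint(e, ["no_such_field"]) for e in events}

print(len(good_ids), len(bad_ids))  # 2 1
```

If every event maps to one id, each new event overwrites the previous one, which would leave a single live document and a growing number of superseded copies.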