leveryd closed 1 year ago
https://discuss.elastic.co/t/add-creation-not-update-time-of-doc-using-ingest/217515
Does the ingest pipeline run before the data is written, or after?
You can try solving this with a Painless script:
output {
  elasticsearch {
    hosts => ["elasticsearch-master:9200"]
    index => "web-service-index"
    document_id => "%{input}"
    scripted_upsert => true
    action => "update"
    script_lang => "painless"
    script_type => "inline"
    script => "if(ctx.op == 'create') ctx._source.first_time = params.event.get('timestamp');"
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch-master:9200"]
    index => "tls"
    document_id => "%{host}_%{ip}_%{port}"
    scripted_upsert => true
    action => "update"
    script_lang => "painless"
    script_type => "inline"
    script => "
      if(ctx.op == 'create') {
        ctx._source = params.event;
        ctx._source.first_create_time = params.event.get('timestamp');
      } else {
        ctx._source = params.event;
        ctx._source.last_update_time = params.event.get('timestamp');
      }
    "
  }
}
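With the config above, first_create_time ends up empty when a record is updated: the else branch replaces the whole stored document with params.event, and the incoming event does not carry first_create_time. The version below saves the old value before replacing the document: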
output {
  elasticsearch {
    hosts => ["elasticsearch-master:9200"]
    index => "tls"
    document_id => "%{host}_%{ip}_%{port}"
    scripted_upsert => true
    action => "update"
    script_lang => "painless"
    script_type => "inline"
    script => "
      if(ctx.op == 'create') {
        ctx._source = params.event;
        ctx._source.first_create_time = params.event.get('timestamp');
      } else {
        String old = ctx._source.get('first_create_time');
        ctx._source = params.event;
        ctx._source.last_update_time = params.event.get('timestamp');
        ctx._source.first_create_time = old;
      }
    "
  }
}
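The difference between the two `tls` update scripts can be modelled in plain Python. This is only a rough sketch of the update semantics, not Elasticsearch itself: the `scripted_upsert` helper, the in-memory `index` dict, and the function names are hypothetical, and it assumes `ctx.op` is `'create'` for a new document id and `'index'` otherwise.

```python
def naive_script(ctx, event):
    """Model of the script that replaces the whole source on every write."""
    if ctx["op"] == "create":
        ctx["_source"] = dict(event)
        ctx["_source"]["first_create_time"] = event.get("timestamp")
    else:
        # The stored first_create_time is discarded by this assignment.
        ctx["_source"] = dict(event)
        ctx["_source"]["last_update_time"] = event.get("timestamp")


def preserving_script(ctx, event):
    """Model of the script that saves first_create_time before replacing."""
    if ctx["op"] == "create":
        ctx["_source"] = dict(event)
        ctx["_source"]["first_create_time"] = event.get("timestamp")
    else:
        old = ctx["_source"].get("first_create_time")
        ctx["_source"] = dict(event)
        ctx["_source"]["last_update_time"] = event.get("timestamp")
        ctx["_source"]["first_create_time"] = old


def scripted_upsert(index, doc_id, event, script):
    """Toy stand-in for action => "update" with scripted_upsert => true."""
    op = "index" if doc_id in index else "create"
    ctx = {"op": op, "_source": index.get(doc_id, {})}
    script(ctx, event)
    index[doc_id] = ctx["_source"]
    return index[doc_id]


if __name__ == "__main__":
    for script in (naive_script, preserving_script):
        index = {}
        scripted_upsert(index, "h_1.2.3.4_443", {"timestamp": "t1"}, script)
        doc = scripted_upsert(index, "h_1.2.3.4_443", {"timestamp": "t2"}, script)
        print(script.__name__, doc)
```

In this model the first script loses `first_create_time` on the second write, while the second keeps the value from the first write and only refreshes `last_update_time`.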
For the meaning of ctx.op, ctx._source, and so on, see https://www.elastic.co/guide/en/elasticsearch/painless/master/painless-update-context.html
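What is the problem?
Documents in the web-service index need a "first creation time" field, but there is currently the following problem.
When Logstash writes to Elasticsearch and the document_id already exists, the existing record is updated, and its timestamp field is overwritten as well.
Why is a first creation time needed?
The first creation time shows which assets are newly added (that is, new attack surface).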
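As a rough illustration of how the first creation time could be used to list new assets, the sketch below builds an Elasticsearch range query body in Python. It assumes that `first_create_time` is mapped as a date field (so date math such as `now-1d` works) and that the body would be sent to the `_search` endpoint of the relevant index. The helper name `new_assets_query` and its default values are hypothetical.

```python
import json


def new_assets_query(since="now-1d", field="first_create_time"):
    """Build a search body matching documents first created at or after `since`."""
    return {"query": {"range": {field: {"gte": since}}}}


if __name__ == "__main__":
    # Print the JSON body; sending it to Elasticsearch is left to the caller.
    print(json.dumps(new_assets_query(), indent=2))
```

Calling `new_assets_query("now-7d")` would instead target assets first seen in the last seven days, under the same mapping assumption.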