elastic / elasticsearch

Free and Open, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch

ESQL calculate distance and speed based on two geo points #108332

Open tylerperk opened 1 month ago

tylerperk commented 1 month ago

Description

For security use cases it is common to calculate the distance between two points (typically derived from source IP addresses) and the speed required to travel from one to the other. If the movement is "impossible", that is a factor used to raise suspicion of malicious activity such as IP spoofing. In other query languages this can be calculated using multiple complex statements involving several math functions and magic numbers. At minimum we should make that possible, but ideally we should encapsulate the math in a function that calculates this for you.
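For illustration, the kind of math being described can be sketched in Python as below. This is only a sketch under stated assumptions: it uses the haversine great-circle formula on a spherical Earth with a mean radius of 6371.0088 km (the "magic number"), and the function names `haversine_km` and `required_speed_kmh` are hypothetical, not part of ES|QL or any Elasticsearch API.

```python
import math

# Assumption: spherical Earth with the commonly used mean radius, in kilometres.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def required_speed_kmh(distance_km, elapsed_seconds):
    """Speed in km/h needed to cover distance_km in elapsed_seconds."""
    if elapsed_seconds <= 0:
        # Two different places at the same instant: treat as infinitely fast.
        return math.inf if distance_km > 0 else 0.0
    return distance_km / (elapsed_seconds / 3600.0)
```

A caller would then compare the result of `required_speed_kmh` against a threshold of their own choosing to decide whether the travel counts as "impossible".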

elasticsearchmachine commented 1 month ago

Pinging @elastic/es-analytical-engine (Team:Analytics)

craigtaverner commented 1 month ago

This is related to the scheduled work for ST_DISTANCE, which covers at least the distance calculation part. Calculating speed, however, is a separate concern. At its simplest, speed is just distance/duration, which does not require a new function, so this could be considered complete once ST_DISTANCE is done. However, there are two further considerations:

craigtaverner commented 1 month ago

To get this to work in ES|QL we would need to support inline stats. It would be even more efficient, though, to use a time-ordering or event-ordering approach and look at windowing functions. @alex-spies pointed out the SQL functions LEAD and LAG as a good approach to this. They also seem generally useful for event data, log data and the security use cases.
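As a rough illustration of what a LAG-style window would provide, here is a minimal Python sketch that pairs each event with the previous event in the same partition, ordered by time. It is not ES|QL and does not reflect any planned syntax; the helper `lag_pairs` and the field names `user` and `ts` are hypothetical.

```python
from collections import defaultdict
from operator import itemgetter


def lag_pairs(events, partition_key, order_key):
    """Yield (previous, current) event pairs per partition, ordered by order_key.

    Roughly what SQL's LAG(..., 1) OVER (PARTITION BY ... ORDER BY ...) gives,
    except that the first row of each partition (where LAG would return NULL)
    is skipped rather than paired with a null.
    """
    partitions = defaultdict(list)
    for event in events:
        partitions[event[partition_key]].append(event)
    for rows in partitions.values():
        rows.sort(key=itemgetter(order_key))
        for previous, current in zip(rows, rows[1:]):
            yield previous, current


# Hypothetical events: one user with two logins, another with a single login.
events = [
    {"user": "a", "ts": 200},
    {"user": "a", "ts": 100},
    {"user": "b", "ts": 500},
]
pairs = list(lag_pairs(events, "user", "ts"))
```

With each pair in hand, the distance between the two locations and the elapsed time between the two timestamps are both available on a single row, which is what the speed calculation needs.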