opensearch-project / data-prepper

OpenSearch Data Prepper is a component of the OpenSearch project that accepts, filters, transforms, enriches, and routes data at scale.
https://opensearch.org/docs/latest/clients/data-prepper/index/
Apache License 2.0

Concatenation for geo fields in Data Prepper? #4916

Closed Grumpyfish1200 closed 1 month ago

Grumpyfish1200 commented 1 month ago

Is your feature request related to a problem? Please describe.

I implemented a geoip processor in my AWS OpenSearch Domain Data Prepper Pipeline for VPC flow logs as seen here:

and it works great, but the latitude and longitude fields are separated and not together in one field like [lat, long].

[Screenshot: 2024-09-05 at 1:12:58 PM]

I looked for a way to concatenate these into one field and then convert that field to the geo_point data type so I can use it in map visualizations, but I have not found anything. I also tried using an index template to map the latitude and longitude fields as geo_point, but when I do this the logs containing that information disappear and I cannot find them anywhere. I am not sure whether I did something wrong or whether, because the values are in separate fields, the conversion to geo_point fails and the data is dropped. For the record, index templates work fine when I use them to, say, convert a field to a string; the problem only occurs with the geo_point data type.

I saw that some other users were having this issue too. The links concern Logstash rather than Data Prepper, but the issue is similar:
https://stackoverflow.com/questions/72539041/opensearch-on-aws-does-not-recognise-geoips-location-as-geojson-type
https://github.com/opensearch-project/OpenSearch/issues/3546

Describe the solution you'd like

The ability to concatenate fields would be cool, either with a processor in Data Prepper or in OpenSearch itself. I want just one location field, something like location: [lat, long], that I can convert to the geo_point format so I can use it in map visualizations.

Describe alternatives you've considered (Optional)

I tried using an index template to convert the lat and long fields to geo_point, hoping to add them to a map visualization incrementally and get something to work with, but the log data disappeared after the index template was applied. I looked for another Data Prepper processor that could do this and could not find anything. I also tried to chain the Data Prepper pipeline to an ingest pipeline and use one of its processors, but I couldn't figure out how to do this.

Additional context

Any help is appreciated. I am a novice and new to OpenSearch, so if there is a solution I am not seeing here, sorry, and thank you in advance!

dlvenable commented 1 month ago

@Grumpyfish1200, thank you for opening this issue.

The location field is ready to use as an OpenSearch geo_point type.
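
For reference, OpenSearch accepts a geo_point value in several shapes, including an object with lat and lon keys, a "lat,lon" string, and a [lon, lat] array (note that the array form is longitude first). The following is a minimal Python sketch of what those shapes look like if you ever did need to build the value yourself from separate latitude and longitude fields. The function name and the sample coordinates are hypothetical and are not part of Data Prepper.

```python
def to_geo_point(latitude: float, longitude: float) -> dict:
    """Combine separate coordinates into the object form of a geo_point."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    return {"lat": latitude, "lon": longitude}


# Hypothetical sample coordinates, for illustration only.
point = to_geo_point(47.6, -122.3)

# The same point in the string form ("lat,lon").
as_string = f"{point['lat']},{point['lon']}"

# The same point in the array form, which is longitude first.
as_array = [point["lon"], point["lat"]]
```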

A mapping such as the following one should work:

{
  "template": {
    "mappings": {
      "properties": {
        "dstlocation": {
          "properties": {
            "location": {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}
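
As a rough sketch of how such a mapping fits into a complete index template request, the Python below builds the request body and prints it alongside the request line. The template name and index pattern are hypothetical placeholders, and the parent field name dstlocation is assumed to match the geoip target in your pipeline. Field definitions are declared under a properties key at each level.

```python
import json

# Hypothetical names: adjust these to match your own indices.
TEMPLATE_NAME = "vpc-flow-logs-geo"
INDEX_PATTERN = "vpc-flow-logs-*"


def build_index_template(parent_field: str, index_pattern: str) -> dict:
    """Build an index template body that maps <parent_field>.location as geo_point."""
    return {
        "index_patterns": [index_pattern],
        "template": {
            "mappings": {
                "properties": {
                    parent_field: {
                        "properties": {
                            "location": {"type": "geo_point"}
                        }
                    }
                }
            }
        },
    }


body = build_index_template("dstlocation", INDEX_PATTERN)

# Print the request so it can be pasted into a REST client or Dev Tools.
print(f"PUT _index_template/{TEMPLATE_NAME}")
print(json.dumps(body, indent=2))
```

Index templates are applied only when an index is created, so an index that already exists keeps its old mapping until it is recreated or reindexed.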

Also, I have created a sample you can use as a demo of how the geoip works with OpenSearch maps.

https://github.com/dlvenable/data-prepper-samples/blob/master/samples/geoip-sample/README.md

In it, I configure the mapping to use geo_point. First, though, I copy the location field to the root of the event, which is slightly different from your model.
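
To illustrate the difference between the two models, here is a plain-Python sketch of what copying the nested location to the root of an event amounts to. It is not Data Prepper configuration (the linked sample shows the real pipeline), and the event shape, the field names dstlocation and dst_geo, and the coordinates are all assumptions made for illustration.

```python
import copy


def copy_location_to_root(event: dict, parent_field: str, target_field: str) -> dict:
    """Return a copy of the event with <parent_field>.location also stored at the root."""
    result = copy.deepcopy(event)
    location = result.get(parent_field, {}).get("location")
    if location is not None:
        result[target_field] = location
    return result


# Assumed event shape after enrichment; all values are made up.
event = {
    "dstaddr": "203.0.113.10",
    "dstlocation": {"location": {"lat": 47.6, "lon": -122.3}},
}

flattened = copy_location_to_root(event, "dstlocation", "dst_geo")

# With the location at the root, the mapping needs only one level of properties.
root_mapping = {"properties": {"dst_geo": {"type": "geo_point"}}}
```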

dlvenable commented 1 month ago

@Grumpyfish1200, I want to check that the above solution worked for you. Would you be good with closing this issue?

Grumpyfish1200 commented 1 month ago

Yes, it worked perfectly, thank you!