elastic / connectors

Source code for all Elastic connectors, developed by the Search team at Elastic, and home of our Python connector development framework
https://www.elastic.co/guide/en/enterprise-search/master/index.html

Enhance PostgreSQL connector service to dynamically map PostGIS geography(Point) to Elasticsearch "geo_point" type #2183

Open chantzlarge opened 5 months ago

chantzlarge commented 5 months ago

Problem Description

The PostgreSQL connector does not automatically map the PostGIS geography(Point) type to the Elasticsearch geo_point field type. The column is indexed as a text field instead (see the default index mapping below), so geo queries cannot be run against it, which is a problem for applications that rely on spatial search.

Proposed Solution

Enhance the PostgreSQL connector to automatically detect PostGIS geography(Point) columns and map them to Elasticsearch's geo_point field type. This behavior should be configurable and allow custom mapping overrides for flexibility.
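
To make the proposal concrete, here is a minimal sketch of the detection step, assuming the connector can see the column's declared type as a string such as the `geography(Point,4326)` shown in the \d output. The function name `es_mapping_for_pg_type` is hypothetical and is not part of the connectors framework; it only illustrates the idea.

```python
import re
from typing import Optional

# Matches "geography(Point)" or "geography(Point,4326)", case-insensitively.
_GEOGRAPHY_POINT = re.compile(
    r"^geography\(\s*point\s*(?:,\s*(\d+)\s*)?\)$", re.IGNORECASE
)


def es_mapping_for_pg_type(pg_type: str) -> Optional[dict]:
    """Hypothetical helper: return an Elasticsearch field mapping for a
    PostgreSQL column type, or None to fall back to the default mapping."""
    match = _GEOGRAPHY_POINT.match(pg_type.strip())
    if match is None:
        return None
    srid = match.group(1)
    # geo_point expects WGS84 coordinates; geography defaults to SRID 4326.
    if srid is not None and int(srid) != 4326:
        return None
    return {"type": "geo_point"}
```

Returning None for anything that is not a WGS84 point keeps the existing default mapping in place for every other column type.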

Alternatives

  1. Develop a preprocessing script to convert data types before ingestion.
  2. Use third-party data integration tools for more complex mappings, at the cost of additional expense and setup complexity.
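
As an illustration of the first alternative, the sketch below converts a hex-encoded EWKB point (the text form PostgreSQL returns when a geography column is selected without a conversion function) into the `{"lat": ..., "lon": ...}` object that a geo_point field accepts. It assumes the value is a single point in SRID 4326; the function name `ewkb_point_to_geo_point` is made up for this example.

```python
import struct

# Flag bits PostGIS sets on the EWKB geometry type word.
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000
WKB_POINT = 1


def ewkb_point_to_geo_point(hex_ewkb: str) -> dict:
    """Decode a hex EWKB point into a geo_point-style lat/lon object."""
    raw = bytes.fromhex(hex_ewkb)
    # First byte is the byte-order marker: 1 = little endian, 0 = big endian.
    order = "<" if raw[0] == 1 else ">"
    (type_word,) = struct.unpack_from(order + "I", raw, 1)
    if type_word & ~(EWKB_Z | EWKB_M | EWKB_SRID) != WKB_POINT:
        raise ValueError("not a point geometry")
    offset = 5
    if type_word & EWKB_SRID:
        (srid,) = struct.unpack_from(order + "I", raw, offset)
        if srid != 4326:
            raise ValueError(f"unsupported SRID {srid}; expected 4326")
        offset += 4
    # X is longitude and Y is latitude; any Z/M values follow and are ignored.
    lon, lat = struct.unpack_from(order + "dd", raw, offset)
    return {"lat": lat, "lon": lon}
```

The target field still has to be mapped as geo_point in the index; otherwise Elasticsearch will not treat the object as a point.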

Additional Context

Improving this mapping capability would streamline the integration of geospatial data, benefiting applications that depend on spatial analysis and search functionalities.

default index mapping:

"public_search_indices_coordinates": {
  "type": "text",
  "fields": {
    "delimiter": {
      "type": "text",
      "index_options": "freqs",
      "analyzer": "iq_text_delimiter"
    },
    "enum": {
      "type": "keyword",
      "ignore_above": 2048
    },
    "joined": {
      "type": "text",
      "index_options": "freqs",
      "analyzer": "i_text_bigram",
      "search_analyzer": "q_text_bigram"
    },
    "prefix": {
      "type": "text",
      "index_options": "docs",
      "analyzer": "i_prefix",
      "search_analyzer": "q_prefix"
    },
    "stem": {
      "type": "text",
      "analyzer": "iq_text_stem"
    }
  },
  "index_options": "freqs",
  "analyzer": "iq_text_base"
},

\d output for column:

 coordinates | geography(Point,4326) |           | not null |

chantzlarge commented 5 months ago

I was able to work around this by using the Elasticsearch Connector API to update the connector's filtering options with an advanced query. The advanced query uses SELECT st_geohash(...) to convert the column from geography(point, ...) to a geohash before indexing.
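
For readers who want to try the same workaround, here is a rough sketch of what such an advanced query rule might look like. The table name `search_indices` and the `id` column are assumptions (guessed from the field name in the mapping above), and the rule shape, a list of objects with `tables` and `query` keys, reflects my understanding of the PostgreSQL connector's advanced sync rules, so check it against the connector documentation. The explicit `::geometry` cast is included on the assumption that ST_GeoHash expects a geometry argument.

```python
import json


def build_geohash_rule(table: str, id_column: str, geog_column: str) -> list:
    """Build an advanced-sync-rule style payload (assumed shape) that
    replaces a geography column with its geohash string."""
    query = (
        f"SELECT {id_column}, "
        f"ST_GeoHash({geog_column}::geometry) AS {geog_column} "
        f"FROM {table}"
    )
    return [{"tables": [table], "query": query}]


# Hypothetical table and column names, for illustration only.
rules = build_geohash_rule("search_indices", "id", "coordinates")
print(json.dumps(rules, indent=2))
```

A geohash string is only interpreted as a point if the target field is mapped as geo_point, so the index mapping may need to be set explicitly before the first sync.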

seanstory commented 5 months ago

Thanks for filing! I'm glad you found a workaround, but I'm going to re-open because I think this is a valuable enhancement request for us to keep track of.