matthewfranglen / postgres-elasticsearch-fdw

Postgres to Elastic Search Foreign Data Wrapper
MIT License
111 stars 32 forks source link

Could it support the 'geo field'? #16

Open cmontxgt opened 3 years ago

cmontxgt commented 3 years ago

geometry field don‘t support?

matthewfranglen commented 3 years ago

Could you provide an elasticsearch schema and postgres table definition that you would expect to work together please?

cmontxgt commented 3 years ago

elastic version: 7.8.1, postgres version:12

es mapping:

{
    "mappings": {
        "properties": {
            "id": {
                "type": "keyword"
            },

            "wkt": {
                "type": "geo_shape "
            }
        }
    }
},

postgresql table:

CREATE FOREIGN TABLE public.es_farmland_test1 (
    id text NULL,
    wkt geometry NULL
)
SERVER multicorn_es
OPTIONS (
    host '192.168.0.25',
    port '9200',
    index 'farmland_index_test',
    rowid_column 'id',
    query_column '',
    score_column '',
    timeout '20',
    username 'elastic',
    password '123456'
);
INSERT INTO public.es_farmland_test1
    (wkt)
    (
        SELECT
            ffl.map_points
        FROM
            f_farm_land ffl
            where ffl.farm_id='4028809e76f03ff80176feadf85d0050'
    )
matthewfranglen commented 3 years ago

Thank you. I'll have a look.

matthewfranglen commented 2 years ago

I've now looked into this some 10 months later and I'm able to create a table that stores geometry data in Elastic Search and compares as equal to a table directly stored in Postgres. When using the inferred schema in Elastic Search the data is stored as text.

Schema and extension configuration:

CREATE EXTENSION postgis;
CREATE EXTENSION multicorn;

CREATE SERVER multicorn_es FOREIGN DATA WRAPPER multicorn
OPTIONS (
  wrapper 'pg_es_fdw.ElasticsearchFDW'
);

CREATE TABLE geometries
    (
        id BIGINT,
        name TEXT,
        geom GEOMETRY
    )
;

CREATE FOREIGN TABLE geometries_es
    (
        id BIGINT,
        name TEXT,
        geom GEOMETRY,
        query TEXT,
        sort TEXT,
        score NUMERIC
    )
SERVER multicorn_es
OPTIONS
    (
        host 'elasticsearch',
        port '9200',
        index 'geometry-index',
        type 'geometry',
        rowid_column 'id',
        query_column 'query',
        sort_column 'sort',
        score_column 'score',
        timeout '20'
    )
;

Geometry data (these are the example entries in the PostGIS geometries page):

--
-- PostgreSQL database dump
--

--
-- Data for Name: geometries; Type: TABLE DATA; Schema: public; Owner: postgres
--

COPY public.geometries (id, name, geom) FROM stdin;
1   Point   010100000000000000000000000000000000000000
2   Linestring  01020000000400000000000000000000000000000000000000000000000000F03F000000000000F03F0000000000000040000000000000F03F00000000000000400000000000000040
3   Polygon 0103000000010000000500000000000000000000000000000000000000000000000000F03F0000000000000000000000000000F03F000000000000F03F0000000000000000000000000000F03F00000000000000000000000000000000
4   PolygonWithHole 01030000000200000005000000000000000000000000000000000000000000000000002440000000000000000000000000000024400000000000002440000000000000000000000000000024400000000000000000000000000000000005000000000000000000F03F000000000000F03F000000000000F03F0000000000000040000000000000004000000000000000400000000000000040000000000000F03F000000000000F03F000000000000F03F
5   Collection  0107000000020000000101000000000000000000004000000000000000000103000000010000000500000000000000000000000000000000000000000000000000F03F0000000000000000000000000000F03F000000000000F03F0000000000000000000000000000F03F00000000000000000000000000000000
\.

--
-- PostgreSQL database dump complete
--

Copy to Elastic Search:

INSERT INTO geometries_es (id, name, geom) (SELECT id, name, geom FROM geometries);

Establish equivalence:

SELECT geometries.geom = geometries_es.geom FROM geometries JOIN geometries_es ON geometries.id = geometries_es.id;

This produces

 ?column?
──────────
 t
 t
 t
 t
 t
(5 rows)

The associated Elastic Search table has the content (queried with curl localhost:9200/geometry-index/geometry/_schema):

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 5,
    "max_score": 1,
    "hits": [
      {
        "_index": "geometry-index",
        "_type": "geometry",
        "_id": "5",
        "_score": 1,
        "_source": {
          "geom": "0107000000020000000101000000000000000000004000000000000000000103000000010000000500000000000000000000000000000000000000000000000000F03F0000000000000000000000000000F03F000000000000F03F0000000000000000000000000000F03F00000000000000000000000000000000",
          "name": "Collection"
        }
      },
      {
        "_index": "geometry-index",
        "_type": "geometry",
        "_id": "2",
        "_score": 1,
        "_source": {
          "geom": "01020000000400000000000000000000000000000000000000000000000000F03F000000000000F03F0000000000000040000000000000F03F00000000000000400000000000000040",
          "name": "Linestring"
        }
      },
      {
        "_index": "geometry-index",
        "_type": "geometry",
        "_id": "4",
        "_score": 1,
        "_source": {
          "geom": "01030000000200000005000000000000000000000000000000000000000000000000002440000000000000000000000000000024400000000000002440000000000000000000000000000024400000000000000000000000000000000005000000000000000000F03F000000000000F03F000000000000F03F0000000000000040000000000000004000000000000000400000000000000040000000000000F03F000000000000F03F000000000000F03F",
          "name": "PolygonWithHole"
        }
      },
      {
        "_index": "geometry-index",
        "_type": "geometry",
        "_id": "1",
        "_score": 1,
        "_source": {
          "geom": "010100000000000000000000000000000000000000",
          "name": "Point"
        }
      },
      {
        "_index": "geometry-index",
        "_type": "geometry",
        "_id": "3",
        "_score": 1,
        "_source": {
          "geom": "0103000000010000000500000000000000000000000000000000000000000000000000F03F0000000000000000000000000000F03F000000000000F03F0000000000000000000000000000F03F00000000000000000000000000000000",
          "name": "Polygon"
        }
      }
    ]
  }
}

and the schema (queried with curl localhost:9200/geometry-index/geometry/_mapping):

{
  "geometry-index": {
    "mappings": {
      "geometry": {
        "properties": {
          "geom": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

I tested this with Postgres 9.4, PostGIS 2.5 and Elastic Search 5.6.16. The inferred schema is not geo_shape though so I'll have to look into that more. I expect that there needs to be a mapping between the different formats.

matthewfranglen commented 2 years ago

Trying again with Elastic Search 7, using the following schema:

{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "geom": {
        "type": "geo_shape"
      }
    }
  }
}

When the Elastic Search table is empty querying it:

SELECT * FROM geometries_es;

Returns the error:

ERROR:  SEARCH for /geometry-index failed: RequestError(400, u'search_phase_execution_exception', u'No mapping found for [] in order to sort on') 

When trying to copy the results across:

INSERT INTO geometries_es (id, name, geom) (SELECT id, name, geom FROM geometries);

Returns the error:

ERROR:  INDEX for /geometry-index/1 and document {'sort': None, 'name': u'Point', 'score': None, 'geom': u'010100000000000000000000000000000000000000', 'query': None} failed: RequestError(400, u'mapper_parsing_exception', u'failed to parse field [geom] of type [geo_shape]')

This looks quite tricky to fix.