iDigBio / idb-backend

iDigBio server and backend code for data ingestion, media processing, record indexing, and data API.
GNU General Public License v3.0

Indexing compatibility with Elasticsearch 7.17.0 #210

Open danstoner opened 2 years ago

danstoner commented 2 years ago

With Elasticsearch started via:

docker run -p 9200:9200 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.17.0

When running pytest:

>       raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
E       RequestError: TransportError(400, u'illegal_argument_exception', u'Types cannot be provided in get mapping requests, unless include_type_name is set to true.')

../../.pyenv/versions/2.7.17/envs/venv/lib/python2.7/site-packages/elasticsearch/connection/base.py:125: RequestError

See: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/removal-of-types.html

danstoner commented 2 years ago

Maybe: https://techoverflow.net/2019/05/02/how-to-fix-elasticsearch-types-cannot-be-provided-in-put-mapping-requests-unless-the-include_type_name-parameter-is-set-to-true/

danstoner commented 2 years ago

Setting include_type_name=True in all put_mapping and get_mapping function calls does get past the above error.
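
As a rough illustration of that workaround, the sketch below wraps the two calls so the flag is always passed. The helper names `get_mapping_with_types` and `put_mapping_with_types` are hypothetical and not part of idb-backend. `es` is assumed to be an elasticsearch-py client recent enough to accept `include_type_name` on `indices.get_mapping` and `indices.put_mapping` (the 7.x clients do).

```python
def get_mapping_with_types(es, index, doc_type):
    """Fetch the mapping for one type, opting in to the legacy typed response."""
    # `es` is assumed to be an elasticsearch-py client (hypothetical wrapper).
    return es.indices.get_mapping(
        index=index, doc_type=doc_type, include_type_name=True
    )


def put_mapping_with_types(es, index, doc_type, body):
    """Upload a mapping for one type, opting in to the legacy typed request."""
    return es.indices.put_mapping(
        index=index, doc_type=doc_type, body=body, include_type_name=True
    )
```

Funnelling every call through two helpers like these would keep the flag in one place, which makes it easier to remove once the code no longer relies on types.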

We have to create the test index and PUT the mappings to run the test suite against an empty ES.

http PUT localhost:9200/idigbio-test
idb index full

Fixed some minor compatibility issues (like changing string to text in the mappings).
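
For reference, the string change can be sketched as a small rewrite over a mapping dict. This is only an illustration, not the actual change made here: the function name `upgrade_string_fields` is hypothetical, and it assumes the legacy mappings use `"type": "string"`, optionally with `"index": "not_analyzed"`, which corresponds to `keyword` rather than `text` in current Elasticsearch.

```python
import copy


def upgrade_string_fields(mapping):
    """Return a copy of `mapping` with legacy `string` fields rewritten."""
    upgraded = copy.deepcopy(mapping)
    _upgrade_in_place(upgraded)
    return upgraded


def _upgrade_in_place(node):
    if not isinstance(node, dict):
        return
    if node.get("type") == "string":
        if node.get("index") == "not_analyzed":
            # Exact-match strings correspond to the keyword type.
            node["type"] = "keyword"
            del node["index"]
        else:
            # Analyzed (full-text) strings correspond to the text type.
            node["type"] = "text"
            if node.get("index") == "analyzed":
                del node["index"]
    # Recurse into nested objects such as "properties" and "fields".
    for child in list(node.values()):
        _upgrade_in_place(child)
```

The input is left untouched, so the old and new mappings can be diffed before anything is sent to the cluster.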

But finally we run headlong into the type issue trying to post the mappings:

RequestError: RequestError(400, u'illegal_argument_exception', u'Rejecting mapping update to [idigbio-test] as the final mapping would have more than 1 type: [publishers, recordsets]')
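
One option described in the removal-of-types guide linked above is to give each former type its own index. The sketch below shows what that could look like for the request bodies only. It is not something idb-backend does today: the `<index>-<type>` naming scheme, the function name `split_mappings_by_type`, and the example field are all assumptions.

```python
def split_mappings_by_type(index, typed_mappings):
    """Map each legacy type to its own index name and typeless mapping body."""
    per_index = {}
    for doc_type, mapping in sorted(typed_mappings.items()):
        # Hypothetical naming scheme: "<index>-<type>".
        name = "{0}-{1}".format(index, doc_type)
        # In Elasticsearch 7 the create-index body takes the mapping
        # directly under "mappings", with no type name in between.
        per_index[name] = {"mappings": mapping}
    return per_index


# Example with the two types named in the error; the field is a placeholder.
bodies = split_mappings_by_type(
    "idigbio-test",
    {
        "publishers": {"properties": {"uuid": {"type": "keyword"}}},
        "recordsets": {"properties": {"uuid": {"type": "keyword"}}},
    },
)
```

Each entry in `bodies` could then be used as the body of a create-index request, at the cost of every query and indexing call having to target the per-type index name.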

Without the mappings we end up with the following failure while running the test suite:

NotFoundError(404, u'type_missing_exception', u'type[[records]] missing')

danstoner commented 2 years ago

Started some work to test compatibility in PR #212, but making large-scale changes in the Python 2 codebase seems like a bad idea. Maybe this needs to happen after the Python 3 conversion, or we need a different strategy than having everything talk to the same Elasticsearch cluster and version.