CDLUC3 / ezid


Fix Object type (resource type) search in OpenSearch #633

Open sfisher opened 1 month ago

sfisher commented 1 month ago

OpenSearch UI doesn't search for "object type" correctly.

It's hard to verify with our subset of data, since many records don't have an object type, but you can add items manually after looking them up in development (which runs off the database).

Add items you know should show up in OpenSearch, like this:

python manage.py shell

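# Build the OpenSearch document wrapper for one identifier.
import impl.open_search_doc as open_search_doc
from ezidapp.models.identifier import Identifier
open_s = open_search_doc.OpenSearchDoc(identifier=Identifier.objects.get(identifier='ark:/28722/k2kh0pq17'))
# Inspect the dict built for this identifier.
my_dict = open_s.dict_for_identifier()
# Write the document to the index.
open_s.index_document()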

Some IDs that should have resource types are these:

ark:/38305/f100003k
ark:/38305/f1028phz
ark:/38305/f10c4sqg
ark:/38305/f10p0x8w
ark:/38305/f11834m5
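
To index all five IDs in one go, something like the sketch below might help. The helper `index_identifiers` and its `load` / `make_doc` parameters are hypothetical names for illustration, not part of EZID. It only assumes the `OpenSearchDoc(identifier=...)` and `index_document()` calls shown above, and it takes them as arguments so it can be tried outside the Django shell.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# The IDs listed above that should have resource types.
IDS = [
    "ark:/38305/f100003k",
    "ark:/38305/f1028phz",
    "ark:/38305/f10c4sqg",
    "ark:/38305/f10p0x8w",
    "ark:/38305/f11834m5",
]


def index_identifiers(
    ids: Iterable[str],
    load: Callable[[str], object],
    make_doc: Callable[[object], object],
) -> Tuple[List[str], Dict[str, str]]:
    """Index each ID; return (indexed ids, {failed id: error message}).

    Hypothetical helper: `load` fetches the identifier record and
    `make_doc` wraps it in an object that has index_document().
    """
    indexed: List[str] = []
    failed: Dict[str, str] = {}
    for id_str in ids:
        try:
            doc = make_doc(load(id_str))
            doc.index_document()
            indexed.append(id_str)
        except Exception as exc:
            # Keep going so one missing record doesn't stop the batch.
            failed[id_str] = f"{type(exc).__name__}: {exc}"
    return indexed, failed


# In `python manage.py shell`, the wiring would presumably be:
#   import impl.open_search_doc as open_search_doc
#   from ezidapp.models.identifier import Identifier
#   indexed, failed = index_identifiers(
#       IDS,
#       load=lambda s: Identifier.objects.get(identifier=s),
#       make_doc=lambda i: open_search_doc.OpenSearchDoc(identifier=i),
#   )
```

Anything that ends up in `failed` is probably an ID that isn't in the local subset of data.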

I see resource.type as something like "image", but I think the search may need `searchable_resource_type`, which appears in the search table.

sfisher commented 1 month ago

Jing and I looked into the code and confirmed that I need to add searchable_resource_type. It is some character code that I believe can be derived from the identifier and its kernel metadata (metadata), a field parsed from JSON that may be datacite.xml in JSON or something else. There are methods to handle it all in there somewhere.
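
A rough sketch of what the derivation might look like, until the existing methods are tracked down. Everything here is an assumption: the function name, the codes in `RESOURCE_TYPE_CODES`, and the idea that the value may come as "General/Specific" are placeholders for illustration, and the real mapping should come from the EZID code.

```python
from typing import Optional

# HYPOTHETICAL codes for illustration only; replace with the real
# mapping from the EZID codebase.
RESOURCE_TYPE_CODES = {
    "Dataset": "D",
    "Image": "Im",
    "Text": "T",
}

# Lower-cased lookup so "image" and "Image" resolve to the same code.
_CODES_BY_LOWER = {name.lower(): code for name, code in RESOURCE_TYPE_CODES.items()}


def searchable_resource_type(resource_type: Optional[str]) -> str:
    """Return a short code for a resource type string, or "" if unknown.

    Assumes the value may look like "Image" or "Image/Photograph";
    only the part before the first "/" is used.
    """
    if not resource_type:
        return ""
    general = resource_type.split("/", 1)[0].strip()
    return _CODES_BY_LOWER.get(general.lower(), "")
```

With this, a resource.type of "image" would map to the placeholder code "Im", and records with no resource type would get an empty string.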