CDLUC3 / ezid

MIT License

"Identifier issues" report at bottom of Dashboard doesn't work from OpenSearch #635

Open sfisher opened 1 month ago

sfisher commented 1 month ago

Both this page and the code behind it are confusing, and the data we have in OpenSearch made it hard to tell whether the report was working.

Items with "issues" were originally determined by the "hasIssues" and "linkIsBroken" fields in the search table.

If you select "ALL" from the list, the "issues" section of the page always tells you that you have no issues. You have to select one of the groups or owners for issues to show up. This is apparently how the page is designed, and it is how the "working" version behaves against the database.

I eventually found some issues in the normal database-backed development version by selecting an item such as "Bulyon1" from the list.

To make sure the data that should show these errors is in our OpenSearch index, rebuild and reindex an identifier from the Django shell:

python manage.py shell

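# Build the OpenSearch document for one identifier and reindex it
import impl.open_search_doc as open_search_doc
from ezidapp.models.identifier import Identifier
open_s = open_search_doc.OpenSearchDoc(identifier=Identifier.objects.get(identifier='ark:/28722/k2kh0pq17'))
my_dict = open_s.dict_for_identifier()  # inspect the fields that will be indexed
open_s.index_document()  # write the document to the index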

I put in some "error" items manually from the bulyon1 group and they still don't show up.

The "crossref" errors may be similar. I put in these error items for crossref:

doi:10.5070/P20W2R
doi:10.5070/P2PG6B
doi:10.5070/D6110000
doi:10.5070/L2319067
doi:10.21425/F5FBG12712
doi:10.21425/F5FBG12711

Need to troubleshoot why errors are still not showing up.

sfisher commented 1 month ago

This was much easier to fix when I came back to it today. The problem was a combination of a few things:

  1. Not using the keyword property for exact matches.
  2. Some missing logic in the "reasons" listing since it wasn't clear if we needed that logic when I first started.
  3. Incorrect method call that was missing the parameter (hit) but wasn't raising an error. I guess the logic was just checking that the method existed rather than calling it.
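
For item 1, here is a minimal sketch of what an exact-match filter can look like. It assumes the string fields use OpenSearch's default dynamic mapping, where a string is indexed as an analyzed `text` field with a `.keyword` sub-field. A `term` query against the analyzed field is compared with the stored tokens (lowercased by the standard analyzer), so a mixed-case value such as "Bulyon1" would not match; the `.keyword` sub-field keeps the original string. The field names `owner` and `has_issues` and both helper functions are hypothetical, not the names used in the ezid code.

```python
def exact_term(field: str, value: str) -> dict:
    """Return a term filter on the un-analyzed `.keyword` sub-field."""
    return {"term": {f"{field}.keyword": value}}


def issues_query(owner: str) -> dict:
    """Build a query body for one owner's identifiers that are flagged with issues.

    `owner` and `has_issues` are placeholder field names for illustration.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    exact_term("owner", owner),      # exact, case-sensitive match
                    {"term": {"has_issues": True}},  # boolean field, no .keyword needed
                ]
            }
        }
    }


query = issues_query("Bulyon1")
```

The resulting dict can be passed as the request body of a search call; only the `.keyword` suffix differs from the version that silently returns no hits.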

On Friday I spent hours meticulously tracing through 3 or 4 files of convoluted code, but it turned out to be small differences in how search was handled and Python not raising errors.
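
One way the silent failure in item 3 can arise in Python is sketched below. This is an illustration of the general pattern, not the actual ezid code: the `Report` class, its `has_issues` method, and the shape of `hit` are all hypothetical. A bound method object is always truthy, so testing it without calling it never raises the `TypeError` that a call with a missing argument would.

```python
class Report:
    def has_issues(self, hit: dict) -> bool:
        """Return True when the search hit is flagged with issues."""
        return bool(hit.get("has_issues"))


report = Report()
hit = {"has_issues": False}

# Bug pattern: the method is referenced, not called. The bound method object
# is truthy, so this condition passes for every hit and nothing is raised.
buggy = bool(report.has_issues)

# Fix: actually call the method and pass the hit.
fixed = report.has_issues(hit)

# Calling without the argument is what would have surfaced the mistake.
try:
    report.has_issues()
    raised = False
except TypeError:
    raised = True
```

Here `buggy` ends up `True` even though the hit has no issues, while `fixed` is `False`.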