hbz / lobid-gnd

UI and API to the Integrated Authority File (Gemeinsame Normdatei, GND)
http://lobid.org/gnd
Eclipse Public License 2.0

Missing EntityFacts index entries #334

Closed (fsteeg closed 1 year ago)

fsteeg commented 1 year ago

When creating a new entityfacts index, we get only 5 million entries [1] (previously 8 million [2]).

This resulted in missing depictions for about 200k entries, see e.g. #332. The current index has been rebuilt from the latest GND data, but using the last complete EntityFacts data (index entityfacts_20221107, now also aliased as entityfacts). Before switching to the current EntityFacts dump, we'll have to investigate why it yields fewer entries.

Internal links:

[1] http://weywot3.hbz-nrw.de:9200/entityfacts_20221201/_search
[2] http://weywot3.hbz-nrw.de:9200/entityfacts_20221107/_search
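
For future rebuilds, a check along these lines could catch an incomplete index before it gets aliased. This is only a minimal sketch, not part of lobid-gnd: it assumes the server exposes the standard Elasticsearch `_count` endpoint, and the helper names (`count_url`, `fetch_count`, `relative_drop`, `is_complete`) as well as the 5% tolerance are made up for illustration.

```python
import json
from urllib.request import urlopen


def count_url(base_url, index):
    """Build the URL of the Elasticsearch _count endpoint for an index."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def fetch_count(base_url, index, opener=urlopen):
    """Return the number of documents in an index.

    `opener` is injectable so the function can be tried without a server.
    """
    with opener(count_url(base_url, index)) as response:
        return json.load(response)["count"]


def relative_drop(old_count, new_count):
    """Fraction of entries lost between the old and the new index."""
    if old_count <= 0:
        raise ValueError("old_count must be positive")
    return (old_count - new_count) / old_count


def is_complete(old_count, new_count, tolerance=0.05):
    """Treat the new index as complete if it lost at most `tolerance`.

    The 5% default is an arbitrary assumption, not a project setting.
    """
    return relative_drop(old_count, new_count) <= tolerance
```

With the rough numbers from this issue, `relative_drop(8_000_000, 5_000_000)` is 0.375, so `is_complete` would reject the new index. Against the internal server, the counts would come from something like `fetch_count("http://weywot3.hbz-nrw.de:9200", "entityfacts_20221107")`.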

fsteeg commented 1 year ago

We had a config issue on the server: the indexing was still configured to use the old entityfacts filename (though I'm not sure how using the old file resulted in fewer entries, since the old index was complete). With the correct filename, and using today's entityfacts dump, the new index looks good, and I've aliased it as entityfacts-test [1].

Assigning @acka47 for review. Since we only use that index during transformation, this has no immediate effect, but using the latest entityfacts should now work on the next full GND reindex. There is no commit, since the checked-in config points to a test file and the faulty config was only on the servers. So you can close this if this all makes (enough) sense.

Internal link:

[1] http://weywot3.hbz-nrw.de:9200/entityfacts-test/_search
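
For reference, aliasing as described above can be done through the standard Elasticsearch `_aliases` endpoint. The following is a rough sketch of the request body only, not the commands actually run on the server: the function name `alias_switch_actions` and the index name `entityfacts_NEW` are hypothetical placeholders, since the name of the new index is not given in this thread.

```python
import json


def alias_switch_actions(alias, new_index, old_index=None):
    """Build a body for POST /_aliases that points `alias` at `new_index`.

    If `old_index` is given, the alias is removed from it in the same
    request, so the switch happens in one step.
    """
    actions = []
    if old_index is not None:
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


if __name__ == "__main__":
    # "entityfacts_NEW" is a placeholder, not a real index name.
    body = alias_switch_actions("entityfacts-test", "entityfacts_NEW")
    print(json.dumps(body, indent=2))
```

Passing `old_index="entityfacts_20221107"` with `alias="entityfacts"` would produce the remove-and-add pair needed to move the main alias once the new index has been reviewed.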

acka47 commented 1 year ago

+1 Closing