fsteeg closed this issue 4 weeks ago
Might be related to https://github.com/hbz/lobid-gnd/issues/363.
Hi, we link to lobid-gnd from a search interface (swisscollections.ch), and I received a question about why the links for two GND records don't return results on lobid-gnd. As far as I know, both were added to the GND on 14.12.2023: https://services.dnb.de/oai/repository?verb=GetRecord&metadataPrefix=RDFxml&identifier=oai:dnb.de/authorities/1312495189 https://services.dnb.de/oai/repository?verb=GetRecord&metadataPrefix=RDFxml&identifier=oai:dnb.de/authorities/1312496002
Reading this issue, I assume that the records weren't added to lobid-gnd because of these update issues. I'm adding this feedback here in case more examples help you fix the issue.
Best regards, Silvia
I got a question why the links for two GND records don't return results on lobid-gnd
I reindexed those two (underlying issue is still unresolved):
https://lobid.org/gnd/1312495189 https://lobid.org/gnd/1312496002
Couldn't find the underlying problem. Especially strange is that the automatic updates are sometimes smaller than the later, manually invoked one. As @fsteeg mentioned, this could be a temporary network issue. There might also be a problem on the provider's side. Because this problem is hard to debug, and also to cope with possible problems on the provider's side, I suggest running a daily update in addition to the hourly updates. This way we should be safer in getting all the data. If agreed, I will configure an additional daily update. Or are there better ideas?
Because this problem is hard to debug, and also to cope with possible problems on the provider's side, I suggest running a daily update in addition to the hourly updates. This way we should be safer in getting all the data. If agreed, I will configure an additional daily update.
+1 This sounds like a good approach to me. Hasn't the number of reports risen since we switched to hourly updates in November (#350)? The question is whether hourly updates are a good idea in the first place if people cannot rely on them being carried out reliably.
Why we sometimes (as in "seldom") have trouble getting all the data hourly remains a puzzle. It could be interesting to ask DNB whether they notice issues on their side regarding the OAI-PMH service and data syncing.
(+1 for additional daily updates, I've approved #378)
Why we sometimes (as in "seldom") have trouble getting all the data hourly remains a puzzle. It could be interesting to ask DNB whether they notice issues on their side regarding the OAI-PMH service and data syncing.
Could this in some way be related to the fact that the OAI-PMH interface expects UTC times, while the modification times in the data and the server use local time (see mail from J.R. on 2023-12-22)?
From German Wikipedia:
Adding one hour to UTC gives Central European Time (CET, German: MEZ), which applies for part of the year in Germany, Austria, Switzerland and other Central European countries. For Central European Summer Time (CEST, German: MESZ), which applies in summer, two hours must be added.
So indeed: when we query what we think is the span from the last hour until now (in MEZ, i.e. CET), we are in fact querying from now until the next hour (in UTC). I wonder why there was any data at all. Going to fix it.
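To illustrate the offset problem described above, here is a minimal sketch (in Python, not the actual lobid-gnd code) of how an hourly window could be converted to UTC before it is sent to the OAI-PMH interface. The function name `hourly_window_utc` and the fixed UTC+1 offset standing in for MEZ are assumptions made for illustration only; a real setup would also have to handle MESZ (UTC+2).

```python
from datetime import datetime, timedelta, timezone

# OAI-PMH UTCdatetime with seconds granularity, e.g. 2023-12-27T10:12:51Z
OAI_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def hourly_window_utc(now, hours=1):
    """Return (from, until) strings in UTC covering the last `hours` hours.

    `now` must be timezone-aware so that it can be converted to UTC
    instead of being sent as local time with a misleading "Z" suffix.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    until = now.astimezone(timezone.utc)
    start = until - timedelta(hours=hours)
    return start.strftime(OAI_FORMAT), until.strftime(OAI_FORMAT)


# Hypothetical stand-in for MEZ/CET: a fixed UTC+1 offset, no summer time.
CET = timezone(timedelta(hours=1), "CET")

local_now = datetime(2023, 12, 27, 11, 12, 51, tzinfo=CET)
print(hourly_window_utc(local_now))
# ('2023-12-27T09:12:51Z', '2023-12-27T10:12:51Z')
```

Sending the local wall-clock values unchanged (11:12:51 with a "Z" appended) would instead ask for a span that lies one hour ahead in UTC, which matches the behaviour described above.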
Should be fixed with #379 "from now on". That is, I assume a complete reindexing is needed to catch up on all the possibly missing data, @fsteeg?
A new dump should be provided soon:
We received a mail yesterday about missing records that were created last week. Example: https://lobid.org/gnd/1319507522
Creation date (see MARC) is: 2024-02-15
We received a mail yesterday about missing records that were created last week.
The example (https://lobid.org/gnd/1319507522) now works and I sent out a mail response.
E.V. who sent the mail mentioned in https://github.com/hbz/lobid-gnd/issues/372#issuecomment-1960852456 followed up on it by providing more entries that are still missing. I went through them to see on which day they were created and found entries from the following days:
He closes the email by noting that the list is not exhaustive and that more entries are missing. As the missing updates significantly degrade the service, we should not wait for a new full dump but reindex titles, probably best starting at 2023-11-10, as this is the date we rescheduled the updates (see #350).
Reindexed updates since 2023-11-10, here are some of the examples from the mail:
https://lobid.org/gnd/1317861825 https://lobid.org/gnd/1317650069 https://lobid.org/gnd/1317239962 https://lobid.org/gnd/1317238400 https://lobid.org/gnd/1317163133 https://lobid.org/gnd/1317151534 https://lobid.org/gnd/1316984184
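A catch-up reindexing over a long span like the one since 2023-11-10 could be split into smaller windows, so that a single failed request does not affect the whole run. The following is only a sketch under that assumption, not the procedure actually used here; `daily_windows` is a hypothetical helper, and the end date in the example is made up.

```python
from datetime import datetime, timedelta, timezone

OAI_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # OAI-PMH UTCdatetime, seconds granularity


def daily_windows(start, end):
    """Yield consecutive (from, until) UTC strings of at most one day each.

    Adjacent windows share their boundary timestamp, so a record changed
    exactly on a boundary may be fetched twice.
    """
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    cursor = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    step = timedelta(days=1)
    while cursor < end:
        upper = min(cursor + step, end)
        yield cursor.strftime(OAI_FORMAT), upper.strftime(OAI_FORMAT)
        cursor = upper


# Example span: starts at the rescheduling date, end date chosen arbitrarily.
windows = list(daily_windows(
    datetime(2023, 11, 10, tzinfo=timezone.utc),
    datetime(2023, 11, 12, 12, 0, tzinfo=timezone.utc),
))
for start, until in windows:
    print(start, until)
```

Each pair could then be handed to an update run for that span. Fetching a boundary record twice is presumably harmless if records are indexed by their ID, but that is an assumption about the indexing, not something confirmed in this thread.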
+1 It's ok for me to close this issue now, but we should monitor closely whether updates reliably come in.
Closing. Updates have been fine during the last weeks/months.
Via email feedback, original message on 12/22/23 12:38 by M.H.
New entry was missing in lobid-gnd:
https://services.dnb.de/oai/repository?verb=GetRecord&metadataPrefix=RDFxml&identifier=oai:dnb.de/authorities/1312101741
Latest update is now on 2023-12-27T11:12:51.000, which is 2023-12-27T10:12:51Z in OAI-PMH, as clarified by DNB via email on 12/22/23, 17:13 by J.R.
Fetching updates manually worked, the missing resource is now in lobid-gnd:
https://lobid.org/gnd/1312101741
However, the automatic update for that time span on the server is way too small:
Compared to the manual run for the same time span (sol@quaoar3:~/git/lobid-gnd$ sbt "runMain apps.ConvertUpdates 2023-12-27T09:40:26Z 2023-12-27T10:40:25Z"):
Might have been temporary network issues, but at least we need better monitoring.
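As one possible starting point for such monitoring, the request for a given time span could be rebuilt independently and its result compared with what the automatic update fetched. The sketch below only constructs the URL and makes no network call. The base URL and `metadataPrefix=RDFxml` are taken from the GetRecord links in this thread; the `ListRecords` verb with `from`/`until` is standard OAI-PMH; the `set=authorities` value and the function name `list_records_url` are assumptions.

```python
from urllib.parse import urlencode

# Base URL as used in the GetRecord links quoted in this issue.
BASE_URL = "https://services.dnb.de/oai/repository"


def list_records_url(start, until, metadata_prefix="RDFxml", oai_set="authorities"):
    """Build an OAI-PMH ListRecords URL for the UTC time span [start, until].

    `oai_set="authorities"` is an assumed set name, not confirmed here.
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": oai_set,
        "from": start,
        "until": until,
    }
    return BASE_URL + "?" + urlencode(params)


# Same time span as the manual run above.
url = list_records_url("2023-12-27T09:40:26Z", "2023-12-27T10:40:25Z")
print(url)
```

Counting the records returned for such a URL and logging the number next to the size of the automatic update would make a mismatch like the one above visible without manual checks.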