Closed by jschnasse 8 years ago
API 1.x is not up to date due to the quaoar cluster issues (see https://github.com/hbz/nwbib/issues/302).
Data 2.0 is on a different machine and seems to be up to date.
I've indexed the updates that were missing due to the cluster issues:
http://lobid.org/resource?id=HT018925962&format=full http://lobid.org/resource?id=HT018925945&format=full
Same for sources:
http://lobid.org/resource?id=HT018925962&format=source http://lobid.org/resource?id=HT018925945&format=source
Other missing resources should be present too. Assigning to @jschnasse for review.
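For reviewing more than a couple of IDs, the spot check above could be scripted. This is only a minimal sketch, not what was actually run: the function names `resource_url` and `check_ids` are made up for illustration, it assumes `curl` is installed, and it assumes that a missing record shows up as an HTTP error status, which this thread does not confirm.

```bash
# Build the API 1.x URL for a record ID and a format ("full" or "source"),
# following the URL pattern of the links above.
resource_url() {
  local id="$1" format="$2"
  printf 'http://lobid.org/resource?id=%s&format=%s\n' "$id" "$format"
}

# Hypothetical helper: request every given ID in both formats and report
# the outcome. Assumption: a missing record yields an HTTP status >= 400,
# which makes `curl --fail` exit non-zero.
check_ids() {
  local id format url
  for id in "$@"; do
    for format in full source; do
      url="$(resource_url "$id" "$format")"
      if curl --silent --fail --output /dev/null "$url"; then
        echo "ok: $url"
      else
        echo "not ok: $url"
      fi
    done
  done
}
```

Usage would be something like `check_ids HT018925962 HT018925945`. Note that a successful response only shows that the record exists, not that its content is current.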
Some notes on what I did:
To restore these, I've indexed the updates since 2016-03-26 (when the first Nagios warnings came in) from: http://index.hbz-nrw.de/alephxml/export/update/
1) In lodmill, locally checked out into /home/fsteeg/git/lodmill:
cd /home/fsteeg/git/lodmill/lodmill-rd/doc/scripts/hbz01/
and download files here.
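The download itself could be scripted roughly as follows. This is a sketch under assumptions, not the procedure that was used: `update_file_name` and `download_updates` are hypothetical names, the file name pattern (start day followed by the next day) is inferred from the single archive name mentioned in this thread, and the date arithmetic needs GNU `date`.

```bash
# Assumption: every update archive is named like the one file mentioned in
# this thread, i.e. <day>-<next day>. Requires GNU date for `-d`.
update_file_name() {
  local day="$1" next
  next="$(date -u -d "$day +1 day" +%Y%m%d)"
  printf 'DE-605-aleph-update-marcxchange-%s-%s.tar.gz\n' \
    "$(date -u -d "$day" +%Y%m%d)" "$next"
}

# Hypothetical helper: download the archives for all start days from
# $1 to $2 (inclusive, YYYY-MM-DD) into the current directory.
download_updates() {
  local day="$1" last="$2"
  local base='http://index.hbz-nrw.de/alephxml/export/update/'
  while [ "$(date -u -d "$day" +%Y%m%d)" -le "$(date -u -d "$last" +%Y%m%d)" ]; do
    # Report archives that could not be fetched instead of stopping.
    curl --fail --silent --show-error --remote-name \
      "$base$(update_file_name "$day")" \
      || echo "missing: $(update_file_name "$day")" >&2
    day="$(date -u -d "$day +1 day" +%Y-%m-%d)"
  done
}
```

For the period in question, a call might look like `download_updates 2016-03-26 2016-04-04`. The end date here is only an example, and any name reported as missing should be checked against the directory listing.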
To process a single update:
bash -x startHbz01ToLobidResources.sh master /home/fsteeg/git/lodmill/lodmill-rd/doc/scripts/hbz01/DE-605-aleph-update-marcxchange-20160329-20160330.tar.gz lobid-resources NOALIAS quaoar2.hbz-nrw.de quaoar exact
To process multiple files and redirect output to log file:
bash -x startHbz01ToLobidResources.sh master dummy_ignore lobid-resources NOALIAS quaoar2.hbz-nrw.de quaoar exact doc/scripts/hbz01/updates.txt > 20160405-140410-master.log.startHbz01ToLobidResources.sh 2>&1
The updates.txt file contains full paths to the files, as in the single-file example above.
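Such a list could be generated instead of typed by hand. The following is a minimal sketch: `write_update_list` is a hypothetical helper, and it assumes the list holds one path per line and that sorting by file name (and thus by date) is the wanted order, neither of which is spelled out in this thread.

```bash
# Write the absolute paths of all update archives found directly in a
# directory to a list file, one per line, sorted by file name.
write_update_list() {
  local dir list
  # Resolve the directory to an absolute path so the list has full paths.
  dir="$(cd "$1" && pwd)" || return 1
  list="$2"
  find "$dir" -maxdepth 1 -type f \
    -name 'DE-605-aleph-update-marcxchange-*.tar.gz' | sort > "$list"
}
```

With the checkout described above, this could be called as `write_update_list /home/fsteeg/git/lodmill/lodmill-rd/doc/scripts/hbz01 updates.txt`.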
2) In mabxml-elasticsearch, locally checked out into /home/fsteeg/git/mabxml-elasticsearch:
cd /home/fsteeg/git/mabxml-elasticsearch/src/main/resources/input
and download files here.
In Transform.java, set DIR = "/home/fsteeg/git/mabxml-elasticsearch/src/main/resources/input" (temporary: set esIndexer.setIndexname("hbz01-staging"), see https://github.com/hbz/mabxml-elasticsearch/issues/21) and run Transform.java.
I came across entry HT001401787 where the JSON and the source don't describe the same title:
See http://lobid.org/resource?id=HT001401787&format=full vs. http://lobid.org/resource?id=HT001401787&format=source.
I don't know whether this has anything to do with this issue. If not, we need to open a new one.
Completely different titles? They (now) are both "Westfälische Bibliographie zur Geschichte, Landeskunde und Volkskunde". Could that have been a temporary issue? Or am I missing some detail?
Strange. I swear these were different titles yesterday. Obviously, this was a temporary issue then.
Not ok: http://lobid.org/resource/HT018925962/about http://lobid.org/resource/HT018925945/about
ok: http://lobid.org/resources/HT018925962 http://lobid.org/resources/HT018925945