hbz / mabxml-elasticsearch

Raw hbz union catalog data exposed via a web API
http://lobid.org/hbz01

Updates failing #58

Closed: dr0i closed this issue 1 year ago

dr0i commented 1 year ago

From logs/application.log:

Transformation failed org.metafacture.framework.MetafactureException: java.io.EOFException

It's likely a corrupted update tar.gz. I'm going to skip this one and take the next one.
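A quick way to confirm a suspicion like this is to test the archive before feeding it to the transformation. The following is a minimal sketch, not part of the repo; the function name `archive_is_complete` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given tar.gz is complete and readable.
archive_is_complete() {
  # gzip -t checks the compressed stream and fails on a truncated file;
  # tar -tzf then lists the members to check the tar structure as well.
  gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}
```

Usage could look like `archive_is_complete DE-605-aleph-update-marcxchange-20220805-20220806.tar.gz || echo "skip"`.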

dr0i commented 1 year ago

This is the file:

ls -ahl [...] 424371753 Aug 6 03:15 DE-605-aleph-update-marcxchange-20220805-20220806.tar.gz

dr0i commented 1 year ago

Kudos @blackwinter! Because the updates were really big this week, there was a race condition in the workflow: unfinished archives were copied, hence the EOFException. Going to copy the complete files and restart the update.
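One common way to avoid this kind of race is to copy under a temporary name and rename only once the copy has finished, so a consumer that looks for the `gz` suffix never sees a half-written file. This is only a sketch under that assumption. The actual workflow is not shown in this thread, and `publish_atomically` and the `.part` suffix are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC into DEST_DIR so that the final name
# only appears once the copy is complete.
publish_atomically() {
  src=$1
  dest_dir=$2
  name=$(basename "$src")
  part="$dest_dir/.$name.part"   # hidden temporary name, does not end in "gz"
  # The rename done by mv is atomic as long as both names are on the same filesystem.
  cp "$src" "$part" && mv "$part" "$dest_dir/$name"
}
```

The same idea applies on the producing side: write the archive under a temporary name and rename it when it is finished.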

dr0i commented 1 year ago

Works. Closing.

dr0i commented 1 year ago

@TobiasNx asked:

Is this still a problem? http://lobid.org/hbz01/HT021429811 The MAB file is not accessible?

Investigating ...

dr0i commented 1 year ago

Not sure what caused the problem. I did:

 scp /data/DE-605/mabxml/update/DE-605-aleph-update-marcxchange-2022080[56789]* $servername:/data/DE-605/mabxml/updates/

i.e. all files in question, but somehow not all data was indexed :(

Going to index the files step by step.

dr0i commented 1 year ago

I have to admit, I don't know how this works: while the logs are still written to logs/processMabxml.sh.20220818-weywot.log with content

Transformed: dir=/data/DE-605/mabxml/updates, suffix=gz,

the file in /data/DE-605/mabxml/updates was removed 20 minutes ago. Is it somehow cached? It obviously wasn't moved elsewhere, i.e. somewhere in the tree under the root of the repo.
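One possible explanation, not confirmed in this thread: on Unix-like systems, removing a file only removes its directory entry. A process that already has the file open can keep reading it until it closes the descriptor. The sketch below demonstrates that behaviour in isolation. `read_after_unlink` is a made-up name, and whether the transformation process really holds the archive open this way is an assumption.

```shell
#!/bin/sh
# Hypothetical demo: read a file's content after it has been removed.
read_after_unlink() {
  exec 3< "$1"    # open the file for reading on descriptor 3
  rm -- "$1"      # remove the directory entry; the open descriptor keeps the data alive
  cat <&3         # the content is still readable through the descriptor
  exec 3<&-       # closing the descriptor finally releases the data
}
```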

dr0i commented 1 year ago

However this works ;) ... the pattern DE-605-aleph-update-marcxchange-2022080[56789]* misses DE-605-aleph-update-marcxchange-20220810-20220811.tar.gz, and that is the file HT021429811 is part of. So I guess I just forgot to copy that file. Will do this now and check again tomorrow.
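The reason is that the bracket expression `[56789]` matches exactly one character after `2022080`, so a name starting with `20220810` cannot match. A small sketch to check file names against the pattern used in the scp command (the function name `matches_copy_glob` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: succeeds if NAME matches the glob used in the scp call.
matches_copy_glob() {
  case "$1" in
    DE-605-aleph-update-marcxchange-2022080[56789]*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `matches_copy_glob DE-605-aleph-update-marcxchange-20220810-20220811.tar.gz` fails, while the same call with `20220805-20220806` succeeds. Adding a second pattern ending in `-20220810*` to the copy command would have covered the missing file.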

dr0i commented 1 year ago

Ack that http://lobid.org/hbz01/HT021429811 is ok now. @TobiasNx please check.

TobiasNx commented 1 year ago

> Ack that http://lobid.org/hbz01/HT021429811 is ok now. @TobiasNx please check.

+1 seems to work now.

dr0i commented 1 year ago

Closing.