hbz / lobid-organisations

Transformation, web frontend, and API for lobid-organisations
http://lobid.org/organisations
Eclipse Public License 2.0

462 sigel process new #472

Closed TobiasNx closed 1 year ago

TobiasNx commented 1 year ago

Resolves #462

I tried to change the code so that the PICA binary file is read instead of the sigil.xml.

This PR does three things:

This process seems to have some errors, since I do not know when to close the streams, if at all. Could you have a look and adjust the workflows? I have no Java skills.
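As general background on the stream-closing question, here is a minimal sketch of the usual plain-Java pattern: open the input in a try-with-resources block so it is closed exactly once, after all input has been read, even if reading fails. This is only `java.io`/`java.nio`, not the Metafacture API used in the workflows; the class name, the method name, and the separator byte are hypothetical and chosen just for illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of lobid-organisations.
public class StreamClosingSketch {

    // Counts how often the byte value `separator` occurs in the file.
    // The stream is closed automatically when the try block exits,
    // whether reading finishes normally or read() throws.
    static long countSeparators(Path file, int separator) throws IOException {
        long count = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                if (b == separator) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Whether the transformation workflows need an explicit close at all, and where, depends on how the pipeline is wired, so this only shows the language-level idiom.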

I updated the test file to a PICA binary file with 6 entries, using the same ISILs that were used in the old metadata plus some more. I updated the transformation tests ~but the other play tests still need some fixing~. The Play tests are updated as well. (Edit: 10.08.23)


Old workflow (XML dump and lots of OAI updates) takes about 3 1/2 min to transform and index:

2023-08-08 15:43:05 +0200 [INFO] from application in main - Starting transformation, will write to '/tmp/lobid-organisations/enriched.prod.out.json'
2023-08-08 15:46:39 +0200 [INFO] from play in main - Application started (Prod)
2023-08-08 15:46:39 +0200 [INFO] from play in main - Listening for HTTP on /0:0:0:0:0:0:0:0:9000

New process (PICA binary dump + small number of OAI updates) takes about 1 1/4 min to transform and index:

2023-08-10 13:22:18 +0200 [INFO] from application in main - Starting transformation, will write to '/tmp/lobid-organisations/enriched.prod.out.json'
2023-08-10 13:23:33 +0200 [INFO] from play in main - Application started (Prod)
2023-08-10 13:23:34 +0200 [INFO] from play in main - Listening for HTTP on /0:0:0:0:0:0:0:0:9000
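
To double-check the two durations, the timestamps from the log excerpts can be subtracted directly. The following is a small sketch using `java.time`; the class and method names are hypothetical, and it assumes the interval of interest runs from "Starting transformation" to "Application started".

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for checking the timings quoted in this PR.
public class LogTiming {

    // Timestamp layout of the Play log lines, e.g. "2023-08-08 15:43:05 +0200".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss Z");

    // Returns the number of whole seconds between two log timestamps.
    static long elapsedSeconds(String start, String end) {
        OffsetDateTime from = OffsetDateTime.parse(start, LOG_TIME);
        OffsetDateTime to = OffsetDateTime.parse(end, LOG_TIME);
        return Duration.between(from, to).getSeconds();
    }

    public static void main(String[] args) {
        long oldRun = elapsedSeconds("2023-08-08 15:43:05 +0200", "2023-08-08 15:46:39 +0200");
        long newRun = elapsedSeconds("2023-08-10 13:22:18 +0200", "2023-08-10 13:23:33 +0200");
        // prints "old: 214 s, new: 75 s"
        System.out.println("old: " + oldRun + " s, new: " + newRun + " s");
    }
}
```

That is 3 min 34 s for the old workflow against 1 min 15 s for the new one, so the new process is a little under three times faster on these two runs.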