hbz / lobid-resources

Transformation, web frontend, and API for the hbz catalog as LOD
http://lobid.org/resources
Eclipse Public License 2.0

Add sourceOrganization #1095

Closed: dr0i closed this issue 3 years ago

dr0i commented 3 years ago

Taken from https://github.com/hbz/lobid-resources/issues/1089#issuecomment-741625434 : sourceOrganization is still missing.

The question is: where do we get the information about the institution that created or modified the resource, and how do we map it to an ISIL? In principle, this should be stored in MNG ("Management Information"):

$$a = "created by"; $$b = "Create date"; $$c = "updated by"; $$d = "Update date"; $$g = "Originating system ID"

but these do not look promising at the moment (see e.g. https://github.com/hbz/lobid-resources/pull/1094/commits/c57ef745c9de6853eae54c503557e5d7e3ef7b55#diff-93a0ac06390d531a7e79c7a83d77cde3b0016bf81a2376e32fe8c76c7ac3780eR142 where MNG $$a = "import" and MNG $$c is missing for all the data).
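To make the check above concrete, here is a minimal sketch of reading the MNG subfields from a record. It assumes a MARCXML-like structure without namespaces; the sample record is hypothetical and only mirrors what is reported above ($$a = "import", no $$c). `read_mng` and `MNG_SUBFIELDS` are illustrative names, not part of lobid-resources.

```python
import xml.etree.ElementTree as ET

# Hypothetical record fragment (assumption: no XML namespace, MARCXML-like layout).
SAMPLE = """
<record>
  <datafield tag="MNG" ind1=" " ind2=" ">
    <subfield code="a">import</subfield>
  </datafield>
</record>
"""

# Subfield meanings as listed in the issue description.
MNG_SUBFIELDS = {
    "a": "created by",
    "b": "create date",
    "c": "updated by",
    "d": "update date",
    "g": "originating system ID",
}


def read_mng(record_xml):
    """Return the MNG values keyed by meaning; absent subfields map to None."""
    root = ET.fromstring(record_xml)
    result = {meaning: None for meaning in MNG_SUBFIELDS.values()}
    for field in root.iter("datafield"):
        if field.get("tag") != "MNG":
            continue
        for sub in field.iter("subfield"):
            meaning = MNG_SUBFIELDS.get(sub.get("code"))
            if meaning is not None:
                result[meaning] = sub.text
    return result


if __name__ == "__main__":
    for meaning, value in read_mng(SAMPLE).items():
        print(f"{meaning}: {value}")
```

With data like the sample, "created by" carries only the generic value "import" and "updated by" stays empty, so neither can be mapped to an ISIL.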

As the data is still in beta, this information will likely be there by the end of the beta stage, but then it would be good to know which kind of identifier will be used and how to map it to an ISIL.

acka47 commented 3 years ago

I'll add some background and references to the MAB-MARC mapping and to the MARC21 docs so that we can have a better-informed discussion.

Background (MAB transformation)

We added sourceOrganization, provider & modifiedBy in #455. The mapped MAB2 fields are documented as follows:

070       IDENTIFIZIERUNGSMERKMALE DER BEARBEITENDEN INSTITUTION

          Indikator:
          blank = Kennzeichen der katalogisierenden Institution
          a     = Kennzeichen der liefernden Institution
          b     = Kennzeichen der korrigierenden Institution

The mapping is:

Transfer to MARC21

The corresponding fields in MARC21 are (according to Konkordanz MAB2 – MARC 21 from DNB, see also MARC21 docs):

In this example, it looks quite good, with "Sigel" in the corresponding fields:

https://github.com/hbz/lobid-resources/blob/46ed089516a1bcb36cab695f274a42d18e4651c0/src/test/resources/alma/HT005207972.xml#L32-L37

In this example, there are strange values in 040$a & 040$d, and we will have to find out how to handle these (we can work with the value "DNB" in 040$c, though, I guess):

https://github.com/hbz/lobid-resources/blob/46ed089516a1bcb36cab695f274a42d18e4651c0/src/test/resources/alma/HT012734833.xml#L48-L53

The third test file does not have 040 at all, though.
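As a rough sketch of how the three cases could be handled: in MARC 21, 040$a is the original cataloging agency, 040$c the transcribing agency and 040$d the modifying agency. The assignment of these subfields to the lobid properties below is an assumption, inferred from the meanings of the MAB 070 indicators above; it is not taken from the actual transformation. It also simplifies by taking a single value per subfield, although 040$d is repeatable.

```python
# Assumed mapping from MARC 21 040 subfields to lobid properties (not confirmed).
FIELD_040_TO_PROPERTY = {
    "a": "sourceOrganization",  # original cataloging agency
    "c": "provider",            # transcribing agency
    "d": "modifiedBy",          # modifying agency
}


def map_040(subfields):
    """Map one 040 field, given as {subfield code: value}, to lobid property names.

    Pass None or an empty dict for records that have no 040 field;
    the result is then an empty dict.
    """
    if not subfields:
        return {}
    return {
        prop: subfields[code]
        for code, prop in FIELD_040_TO_PROPERTY.items()
        if code in subfields
    }


if __name__ == "__main__":
    print(map_040({"c": "DNB"}))  # only 040$c is usable
    print(map_040(None))          # record without 040
```

The values are passed through unchanged here; turning something like "DNB" into an ISIL would need a separate lookup.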

dr0i commented 3 years ago

Re strange values in 040$a: these are also strange in Aleph MabXml HT012734833.

acka47 commented 3 years ago

> Re strange values in 040$a: these are also strange in Aleph MabXml HT012734833.

Until now, we have been handling those as if they were normal Sigel, see the JSON of HT012734833 (snippet):

{
   "describedBy":{
      "id":"http://lobid.org/resources/HT012734833",
      "type":[
         "BibliographicDescription"
      ],
      "modifiedBy":{
         "id":"http://lobid.org/organisations/DE-9001#!",
         "label":"lobid Organisation"
      },
      "provider":{
         "id":"http://lobid.org/organisations/DE-101#!",
         "label":"lobid Organisation"
      },
      "sourceOrganization":{
         "id":"http://lobid.org/organisations/DE-9000#!",
         "label":"lobid Organisation"
      }
   },
   "@context":"http://lobid.org/resources/context.jsonld",
   "id":"http://lobid.org/resources/HT012734833#!"
}
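
The link between a Sigel/ISIL and the `id` values in the snippet above can be sketched as follows. The URI pattern (prefix plus `#!` fragment) and the fixed label are read off the JSON; the function names are hypothetical and this is not the code used in lobid-resources.

```python
ORGANISATION_PREFIX = "http://lobid.org/organisations/"
FRAGMENT = "#!"


def organisation_node(isil):
    """Build an organisation node like the ones under "describedBy"."""
    return {
        "id": f"{ORGANISATION_PREFIX}{isil}{FRAGMENT}",
        "label": "lobid Organisation",
    }


def isil_from_id(uri):
    """Recover the identifier part from a lobid organisation URI."""
    if not uri.startswith(ORGANISATION_PREFIX):
        raise ValueError(f"not a lobid organisation URI: {uri}")
    # Drop the prefix, then everything from the fragment marker onwards.
    return uri[len(ORGANISATION_PREFIX):].split("#", 1)[0]


if __name__ == "__main__":
    print(organisation_node("DE-101")["id"])
    print(isil_from_id("http://lobid.org/organisations/DE-9000#!"))
```

Note that this builds a URI for any input value, which is how the strange values end up looking like regular organisation links.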

dr0i commented 3 years ago

Done with #1100. Closing.