hbz / lobid-resources

Transformation, web frontend, and API for the hbz catalog as LOD
Eclipse Public License 2.0

Properly model MBD, POR, H52, ITM, etc. #1373

Closed TobiasNx closed 1 year ago

TobiasNx commented 2 years ago

At the moment these are all modeled as hasItem, but not all of them are items, so we should remodel them. First, we need better documentation of what the elements represent. #1363

TobiasNx commented 1 year ago

@hagbeck and @UBmakla: at the moment we dump all holding- and portfolio-related, ALMA-hbz-specific elements into hasItem, differentiated only by a messy type value:

But we intend to clean up and remodel the holding/portfolio data. Could you help us specify which elements you, as our number one customer, need, for which purpose, and what they entail? (I am also adding @blackwinter, @dr0i and @acka47 because this might be interesting for them.)

hagbeck commented 1 year ago

Hi, first of all: the link to the wiki page results in "page not found". Dumping all these "items" was a workaround for metafacture-morph limitations. To me it would make sense to rethink the goals of the data and try to improve it with metafacture-fix. We suggest discussing this in a video call, perhaps on November 11th?

blackwinter commented 1 year ago

the link to the wiki page results in "page not found".

It's in the internal GOAL wiki space; not sure who's supposed to have access.

TobiasNx commented 1 year ago

First step as discussed with @blackwinter @hagbeck @UBmakla @dr0i :

TobiasNx commented 1 year ago

First step is done. Everything except ITM should no longer be listed as hasItem. This is merged and should be active after the next full integration on Monday.

Next step would be modelling POR:

An electronic collection is a collection of electronic resources that contains portfolios or databases.

Electronic portfolios are individual titles within an electronic collection, or standalone individual titles (e-books). A portfolio denotes the specific record, the services, and the linking information for a particular electronic title within an electronic collection. It can contain administrative/access information.

A service is something an electronic collection offers for its portfolios, such as full text. The service contains the link information for accessing the full text.

These terms are to be understood as levels: the "electronic collection" level, the "service" level, and the "electronic portfolio" level.

Elements are specified as: Electronic Portfolio Information (POR)

| Subfield | Name | Note |
|---|---|---|
| a | Portfolio PID | |
| b | Activation Status | |
| c | URL Type subfield | |
| d | Access URL subfield | Downstream: replace the NZ member code 49HBZ_NETWORK with the respective IZ member code |
| e | Static URL | [DigiBib-Ticket#2022040150295633](https://digiticket.hbz-nrw.de/otrs/index.pl?Action=AgentTicketZoom&TicketID=295249) |
| f | Electronic Material Type subfield | |
| g | Library subfield | |
| h | Proxy Selected subfield | |
| i | Proxy Enabled subfield | |
| j | Interface Name subfield | |
| k | Authentication Note subfield | |
| l | Public Note subfield | |
| m | Portfolio/Service Internal Description subfield | |
| n | Coverage Statement subfield | |
| o | CZ Collection Identifier subfield | |
| p | Collection ID subfield | |
| q | Collection Name subfield | |
| B | Collection Internal Description subfield | |
| r | License Code subfield | |
| s | License Name subfield | |
| t | PO Line subfield | |
| u | Additional PO Line subfield | |
| v | Created by subfield | |
| w | Create date subfield | |
| x | Updated by subfield | |
| y | Update date subfield | |
| z | Activation date subfield | |
| D | Direct Link subfield | Downstream: replace the NZ member code 49HBZ_NETWORK with the respective IZ member code |
| M | Member code subfield | |
| A | Available for Institution subfield | |
| S | Service ID subfield | |
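
The "Downstream" note on subfields d and D describes a simple substitution. A minimal sketch in Python, assuming the member code appears literally in the URL string; the function name, the example URL, and the IZ code `49HBZ_EXAMPLE` are made up for illustration and not taken from the ALMA data:

```python
# The NZ member code named in the subfield specification.
NZ_MEMBER_CODE = "49HBZ_NETWORK"


def localize_url(url: str, iz_member_code: str) -> str:
    """Replace the network zone (NZ) member code in a URL with the
    member code of one institution zone (IZ).

    Assumption: the code occurs as a plain substring of the URL.
    URLs without the NZ code are returned unchanged.
    """
    return url.replace(NZ_MEMBER_CODE, iz_member_code)


# Hypothetical input; real d/D values may look different.
example = "https://example.org/view/49HBZ_NETWORK/12345"
localized = localize_url(example, "49HBZ_EXAMPLE")
```
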

Our old POR integration worked like this:

# do list(path:"POR  ", "var": "$i")
# # entity for every POR  .a without POR  .A
#   unless any_match("$i.a",".*6441$") # filter out hbz
#     copy_field("$i.a", "hasItem[].$append.id")
#     prepend("hasItem[].$last.id","https://lobid.org/item/")
#     set_array("hasItem[].$last.type[]", "Item", "POR")
#     add_field("hasItem[].$last.label", "POR")
#     copy_field("$i.D", "hasItem[].$last.electronicLocator")
#     copy_field("$i.d", "hasItem[].$last.sublocation")
#     copy_field("$i.a", "hasItem[].$last.heldBy.id")
#     replace_all("hasItem[].$last.heldBy.id",".*(\\d{4})$","$1")
#     lookup("hasItem[].$last.heldBy.id", "alma-institution-code-to-isil")
#     prepend("hasItem[].$last.heldBy.id", "http://lobid.org/organisations/")
#     append("hasItem[].$last.heldBy.id","#!")
#   end
#   # entity for every POR  .A
#   if exists ("$i.A")
#     do list(path:"$i.A", "var": "$j")
#       copy_field("$i.a", "hasItem[].$append.id")
#       prepend("hasItem[].$last.id","https://lobid.org/item/")
#       set_array("hasItem[].$last.type[]", "Item", "POR")
#       add_field("hasItem[].$last.label", "POR")
#       copy_field("$i.D", "hasItem[].$last.electronicLocator")
#       copy_field("$i.d", "hasItem[].$last.sublocation")
#       copy_field("$j", "hasItem[].$last.heldBy.id")
#       lookup("hasItem[].$last.heldBy.id", "alma-iz-code-to-isil")
#       prepend("hasItem[].$last.heldBy.id", "http://lobid.org/organisations/")
#       append("hasItem[].$last.heldBy.id","#!")
#     end
#   end
# end

We used the subfields a (Portfolio PID) and A (Available for Institution) for the institutional reference and id creation, d (Access URL) for sublocation, and D (Direct Link) for electronicLocator.
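
For readers not familiar with Fix, the old mapping can be restated roughly in Python. This is only a sketch of what the commented-out code above appears to do, not the actual implementation: the two dictionaries stand in for the lookup maps `alma-institution-code-to-isil` and `alma-iz-code-to-isil`, their contents are invented, and keeping an unmapped code unchanged is an assumption.

```python
import re

ITEM_PREFIX = "https://lobid.org/item/"
ORG_PREFIX = "http://lobid.org/organisations/"


def _held_by(code, code_to_isil):
    # Assumption: a code missing from the map is kept as-is.
    return {"id": ORG_PREFIX + code_to_isil.get(code, code) + "#!"}


def _item(por, held_by):
    item = {"id": ITEM_PREFIX + por["a"], "type": ["Item", "POR"], "label": "POR"}
    if "D" in por:  # Direct Link
        item["electronicLocator"] = por["D"]
    if "d" in por:  # Access URL
        item["sublocation"] = por["d"]
    item["heldBy"] = held_by
    return item


def por_to_items(por_fields, institution_to_isil, iz_to_isil):
    """Build hasItem entries from a list of POR fields (dicts keyed by subfield code)."""
    items = []
    for por in por_fields:
        pid = por["a"]
        # One entry per Portfolio PID, skipping hbz PIDs (ending in 6441).
        if not pid.endswith("6441"):
            match = re.search(r"(\d{4})$", pid)
            code = match.group(1) if match else pid
            items.append(_item(por, _held_by(code, institution_to_isil)))
        # One additional entry per "Available for Institution" value.
        available = por.get("A", [])
        if isinstance(available, str):
            available = [available]
        for iz_code in available:
            items.append(_item(por, _held_by(iz_code, iz_to_isil)))
    return items


# Invented sample data, for illustration only.
sample = [
    {"a": "53000000001234", "D": "https://example.org/direct", "d": "https://example.org/access"},
    {"a": "53000000006441", "A": ["49HBZ_AAA", "49HBZ_BBB"]},
]
result = por_to_items(sample, {"1234": "DE-X1"}, {"49HBZ_AAA": "DE-X2"})
```

Note that every entry reuses the Portfolio PID as its id, so a portfolio available to several institutions yields several entries with the same id.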

TobiasNx commented 1 year ago

In 2021 @acka47 and @dr0i suggested putting the portfolio information into hasItem but differentiating it by introducing a specific type value, DigitalDocument: https://github.com/hbz/lobid-resources/issues/1177#issuecomment-809173419

In Bibframe, Item is defined as "Single example of an Instance", and Instance is defined as "Resource reflecting an individual, material embodiment of a Work". Instances can be: Print, Archival, Tactile, Electronic.

Item can also have the property electronicLocator: https://id.loc.gov/ontologies/bibframe-category.html. Therefore I would argue that subsuming POR information under hasItem is valid. But we would need something to differentiate between electronic and physical items, as @acka47 suggested.

I would suggest that we introduce an additional type class for physical and electronic items: @acka47 suggested DigitalDocument for electronic items, and I suggest PhysicalObject or PhysicalRessource.

We also would need to model the id differently.

acka47 commented 1 year ago

@TobiasNx please report the latest developments with this ticket.

TobiasNx commented 1 year ago

We decided and implemented:

ITM = Bestandsresourcen (holdings resources)
HOL without ITM = Physikalischer Titel (physical title) from ALEPH
MBD without POR or HOL = Nur Titel (title only) from ALEPH
POR = Elektronischer Titel (electronic title), new Electronic Portfolio
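
Read literally, these rules can be sketched as a small Python function. This is an illustration of one possible reading, not the implemented transformation: the function name is hypothetical, the input is assumed to be the set of field tags present in one holding context, the labels are written without spaces, and the thread does not say how overlapping cases should be resolved, so all matching labels are returned.

```python
def classify(tags):
    """Return the type labels that apply to a set of ALMA field tags.

    Assumption: "X without Y" means tag X is present and tag Y is absent.
    """
    tags = set(tags)
    labels = []
    if "ITM" in tags:
        labels.append("Bestandsresourcen")
    if "HOL" in tags and "ITM" not in tags:
        labels.append("PhysikalischerTitel")
    if "MBD" in tags and not ({"POR", "HOL"} & tags):
        labels.append("NurTitel")
    if "POR" in tags:
        labels.append("ElektronischerTitel")
    return labels


with_items = classify({"MBD", "HOL", "ITM"})
holding_only = classify({"MBD", "HOL"})
title_only = classify({"MBD"})
portfolio = classify({"MBD", "POR"})
```
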

We need a better name for "NurTitel" and "PhysikalischerTitel" from ALEPH as types. See #1699