gbv / coli-ana

API to analyze DDC numbers
https://coli-conc.gbv.de/coli-ana/app/
MIT License

PICA and MARC require reduction of analysis via class hierarchy #17

Closed: nichtich closed this issue 3 years ago

nichtich commented 3 years ago

The PICA and MARC formats only contain full numbers (the DDC classes highlighted in bold in the interface). Reducing the analysis to this short form requires knowledge of the class hierarchy.

A) Either drop the PICA and MARC export formats from the API and move the conversion into the client.

B) Or add a lookup of the class hierarchy to the backend. This would slow down queries and add complexity to the server, but it would also make it possible to spot errors on import.

C) Or both: support reduction via hierarchy in the coli-ana server, but move the conversion to PICA and MARC into the client.

If we choose to support reduction via class hierarchy in the server, this requires a connection to a JSKOS server instance. To speed up queries, an additional database table with the fields parent_uri => child_uri would help.
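A minimal sketch of how such a table could drive the reduction. It assumes that the short form is the analysis with every class removed that is an ancestor of another class in the same analysis. The `ex:` identifiers, the sample rows, and the function names `buildParentIndex` and `reduceAnalysis` are made up for illustration and are not part of coli-ana.

```javascript
// Hypothetical rows of the proposed lookup table (parent_uri => child_uri).
const rows = [
  { parent_uri: "ex:7", child_uri: "ex:70" },
  { parent_uri: "ex:70", child_uri: "ex:700" },
  { parent_uri: "ex:T1--09", child_uri: "ex:T1--0904" },
]

// Index the rows by child so that walking up the hierarchy is a Map lookup.
function buildParentIndex(rows) {
  const parentOf = new Map()
  for (const { parent_uri, child_uri } of rows) {
    parentOf.set(child_uri, parent_uri)
  }
  return parentOf
}

// Keep only classes that are not an ancestor of another class in the analysis.
function reduceAnalysis(uris, parentOf) {
  const ancestors = new Set()
  for (const uri of uris) {
    let current = parentOf.get(uri)
    // Stop at the root, or at a class whose own ancestors were already collected.
    while (current !== undefined && !ancestors.has(current)) {
      ancestors.add(current)
      current = parentOf.get(current)
    }
  }
  return uris.filter(uri => !ancestors.has(uri))
}

const parentOf = buildParentIndex(rows)
console.log(reduceAnalysis(["ex:7", "ex:70", "ex:700", "ex:T1--09", "ex:T1--0904"], parentOf))
// → [ 'ex:700', 'ex:T1--0904' ]
```

With the index held in memory, the reduction needs no request to the JSKOS server per query. The table itself would still have to be filled from it once.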

stefandesu commented 3 years ago

To be honest, I don't think I fully understand the issue, but I have two things that might be relevant to it:

  1. In Decompose.vue, I recently added a piece of code to determine class hierarchy without loading details for each DDC class. We need to check whether this always works, but I don't see why it shouldn't. See https://github.com/gbv/coli-ana/blob/8dde9779e3c923c720d98e0735ab1a9ce17ef447/src/components/Decompose.vue#L203-L216

  2. Maybe it's possible to utilize jskos-server's /ancestors endpoint to optimize loading the class details. For example, you could start from the back of the analysis, call the /ancestors endpoint with that notation, and see which of the previous classes are included in the result. Repeat this until we have all the data.
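The back-to-front idea from point 2 could look roughly like this. It is only a sketch: `fetchAncestors` is a hypothetical async function that takes a notation and returns the notations of its ancestors (for example a thin wrapper around the /ancestors endpoint), and the sample hierarchy in the stub is illustrative data, not a real server response.

```javascript
// Walk the analysis from the back. Each lookup also covers every earlier
// class that turns out to be an ancestor, so those need no request of their own.
async function groupByAncestors(notations, fetchAncestors) {
  const covered = new Set()
  const groups = []
  for (let i = notations.length - 1; i >= 0; i--) {
    const notation = notations[i]
    if (covered.has(notation)) continue
    const ancestors = new Set(await fetchAncestors(notation))
    // Earlier classes of the analysis that are included in the result.
    const included = notations.slice(0, i).filter(n => ancestors.has(n))
    for (const n of included) covered.add(n)
    covered.add(notation)
    groups.unshift({ notation, ancestors: included })
  }
  return groups
}

// Stub standing in for the server, with made-up hierarchy data.
const sampleAncestors = {
  "700": ["7", "70"],
  "T1--0904": ["T1--09"],
}
let requests = 0
async function stubFetchAncestors(notation) {
  requests++
  return sampleAncestors[notation] || []
}

groupByAncestors(["7", "70", "700", "T1--09", "T1--0904"], stubFetchAncestors)
  .then(groups => console.log(groups, `${requests} requests`))
```

In this example two lookups cover all five classes, one per chain in the analysis.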

nichtich commented 3 years ago

I see: the additional notation provided as part of the analysis (notation[1]) implicitly contains hierarchy information. This should be good enough to start with. We could add the broader field during conversion, so hierarchy information will be part of the database.
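A rough sketch of adding broader during conversion. It rests on an assumption about the format that should be checked against real data: each notation[1] is taken to be a positional string padded with "-", and one member counts as broader than another if its string, with trailing dashes removed, is a proper prefix of the other's. This is not necessarily the check used in Decompose.vue. The function name `addBroader`, the `ex:` URIs, and the sample members are hypothetical.

```javascript
// Remove trailing padding, e.g. "70---" becomes "70".
const trimPadding = notation => notation.replace(/-+$/, "")

// Add a JSKOS-style broader field that points to the nearest preceding
// member whose trimmed notation[1] is a proper prefix of the member's own.
function addBroader(members) {
  return members.map((member, i) => {
    const own = trimPadding(member.notation[1])
    let parent = null
    let parentLength = 0
    for (const candidate of members.slice(0, i)) {
      const other = trimPadding(candidate.notation[1])
      const isProperPrefix = other.length > 0 && other.length < own.length && own.startsWith(other)
      if (isProperPrefix && other.length > parentLength) {
        parent = candidate
        parentLength = other.length
      }
    }
    return { ...member, broader: parent ? [{ uri: parent.uri }] : [] }
  })
}

// Made-up members in the assumed format.
const members = [
  { uri: "ex:7", notation: ["7", "7----"] },
  { uri: "ex:70", notation: ["70", "70---"] },
  { uri: "ex:700", notation: ["700", "700--"] },
  { uri: "ex:T1--09", notation: ["T1--09", "---.9"] },
]
console.log(addBroader(members).map(m => [m.uri, m.broader.map(b => b.uri)]))
```

Members with an empty broader array have no ancestor within the analysis. Under the assumptions above, the short form consists of the members that no other member points to via broader.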

stefandesu commented 3 years ago

> We could add the broader field during conversion, so hierarchy information will be part of the database.

Good idea!