gbv / coli-ana

API to analyze DDC numbers
https://coli-conc.gbv.de/coli-ana/app/
MIT License

API: get atomic elements only #74

Closed nichtich closed 2 years ago

nichtich commented 2 years ago

There should be a parameter to limit memberList in the JSKOS result format to atomic elements (those shown in bold in the table and included in PICA+). So far, this is only computed from JSKOS in the client and in the PICA format.
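A minimal sketch of what such a filter could do, assuming each member can be tested by some predicate. The function name `atomicMembers`, the predicate `isAtomic`, and the `atomic` flag in the sample data are hypothetical and not part of coli-ana or pica.js. The sample record is made up for illustration and is not real output of the analyze endpoint.

```javascript
// Hypothetical helper: return a copy of a JSKOS concept whose memberList
// contains only the members accepted by `isAtomic`.
// `isAtomic` stands in for whatever logic the service uses to detect atomic elements.
function atomicMembers(concept, isAtomic) {
  const members = Array.isArray(concept.memberList) ? concept.memberList : [];
  // A JSKOS list may contain null to mark omitted entries; skip those.
  const memberList = members.filter(
    (member, index) => member !== null && isAtomic(member, index, members)
  );
  // Copy the concept instead of mutating the input.
  return { ...concept, memberList };
}

// Made-up record for illustration only (the `atomic` flag is an assumption).
const example = {
  notation: ["123"],
  memberList: [
    { notation: ["1"], atomic: true },
    { notation: ["12"], atomic: false },
    null,
    { notation: ["123"], atomic: true },
  ],
};

const atomicOnly = atomicMembers(example, (member) => member.atomic === true);
console.log(atomicOnly.memberList.map((member) => member.notation[0]));
// → [ '1', '123' ]
```

Passing the predicate as an argument keeps the sketch independent of how atomic elements are actually determined on the server.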

Current workaround to get atomic elements:

curl -s 'https://coli-conc.gbv.de/coli-ana/app/analyze?notation=666.444&format=picajson' \
  | jq -r '.[][6:-2][]' | grep '[0-9]'

stefandesu commented 2 years ago

I started implementing this using isElemental from pica.js. However, I noticed that the atomic elements determined by that function (= the atomic elements used in the PICA format) are not the same as those shown in bold in the interface. For 700.90440747471, for example, isElemental returns 700.904 as the first element, while the interface shows 7 in bold. This is related to #57, which I thought was solved, but apparently it is not.

nichtich commented 2 years ago

For 700.90440747471, the 7 is shown in bold as atomic and is returned as the first atomic element in PICA. The reason is that isElemental does only part of the job of determining atomic elements. The code can be simplified when the specific PICA subfield code (const id = subfields[table]) is not needed.

stefandesu commented 2 years ago

@nichtich Please check my implementation and adjust if necessary.

To be very honest, a lot of this is very confusing. It's probably a combination of a complex matter and a confusing implementation. I'm specifically talking about isElemental, baseNumberIndex, and baseNumberFromIndex. It would be much, much easier if isElemental worked as I had expected in the first place, i.e. it returns true for all elements we want to mark bold in the interface or return in the PICA records, and false for all other elements. Then baseNumberIndex and baseNumberFromIndex would only be used internally, and it would be much clearer (even though those methods are by themselves already very confusing). That's actually unrelated to this issue, just something I noticed during the implementation.