gbv / subjects-api

JSKOS Concept Occurrences Provider implementation
https://coli-conc.gbv.de/subjects/
MIT License

Sometimes, invalid co-occurrences are returned #8

Closed stefandesu closed 2 years ago

stefandesu commented 6 years ago

Example: http://coli-conc.gbv.de/occurrences/api/?member=http://rvk.uni-regensburg.de/nt/V&scheme=*&threshold=2

This request asks for all co-occurrences of RVK notation V.
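For anyone reproducing this, here is a minimal sketch of how such a request URL could be assembled. The base URL and the `member`, `scheme` and `threshold` parameters are taken from the example above; the helper name `occurrences_url` is made up for illustration and is not part of the API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL copied from the example request in this issue.
API_BASE = "http://coli-conc.gbv.de/occurrences/api/"

def occurrences_url(member: str, scheme: str, threshold: int) -> str:
    """Build a request URL (hypothetical helper); percent-encodes all values."""
    query = urlencode({"member": member, "scheme": scheme, "threshold": threshold})
    return f"{API_BASE}?{query}"

url = occurrences_url("http://rvk.uni-regensburg.de/nt/V", "*", 2)

# Decode the query string again to confirm the parameters round-trip.
params = parse_qs(urlsplit(url).query)
```

The encoded form differs textually from the hand-written URL above (the member URI and `*` are percent-encoded), but it decodes to the same three parameters.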

One of the results is the following concept:

{
  "inScheme": [
    {
      "uri": "http://bartoc.org/en/node/533"
    }
  ],
  "notation": [
    "V Cc"
  ],
  "uri": "http://rvk.uni-regensburg.de/nt/V_Cc"
}

RVK notation V Cc obviously does not exist, and the catalogue URL returned along with it yields no results either: https://gso.gbv.de/DB=2.1/CMD?ACT=SRCHA&IKT=1016&SRT=YOP&TRM=rvk%20%22V%22%20rvk%20%22V%20Cc%22
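To make the catalogue query easier to read, here is a small sketch that rebuilds it from its decoded search term. The URL prefix is copied verbatim from the link above and treated as opaque; the `rvk "<notation>"` layout is inferred from this single URL only, and `catalogue_url` is a hypothetical helper. Which search key other schemes such as DDC would use is not shown in this issue.

```python
from urllib.parse import quote

# Prefix copied from the catalogue URL in this issue; its parameters are
# not interpreted here.
CATALOGUE_PREFIX = "https://gso.gbv.de/DB=2.1/CMD?ACT=SRCHA&IKT=1016&SRT=YOP&TRM="

def catalogue_url(pairs):
    """pairs: list of (search key, notation) tuples, e.g. ("rvk", "V Cc")."""
    # Decoded search term, e.g.: rvk "V" rvk "V Cc"
    term = " ".join(f'{key} "{notation}"' for key, notation in pairs)
    # safe="" so that spaces and quotes are percent-encoded as in the original.
    return CATALOGUE_PREFIX + quote(term, safe="")

url = catalogue_url([("rvk", "V"), ("rvk", "V Cc")])
```

So the catalogue is being asked for records that carry both RVK notation V and RVK notation V Cc.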

So there's clearly something going wrong there.

Edit: Another example: http://coli-conc.gbv.de/occurrences/api/?member=http:%2F%2Furi.gbv.de%2Fterminology%2Fbk%2F58.03&scheme=*&threshold=5

The result includes the invalid DDC notation 660/.2815, and the catalogue URL for it again returns no matches.

stefandesu commented 6 years ago

Another (slightly different) example:

This request returns a co-occurrence from RVK concept B with DDC concept B (which does not exist). The catalogue URL shows the same number of matches (185), but none of them include a DDC notation B, only the notation 830 which is also listed separately in the co-occurrences.

I'm guessing that this is a combination of weirdly formatted (or plain wrong) data in the catalogue and some issue in the occurrences-api.
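As a way to spot such cases in a result list, here is a rough plausibility check for DDC notations. It assumes that a plain DDC class number is three digits, optionally followed by a decimal point and further digits; that is a simplification for illustration and not how the occurrences API validates anything. It only flags odd-looking values and says nothing about where they come from.

```python
import re

# Assumption: a plain DDC class number looks like "830" or "660.2815".
# Other valid forms may exist in real data and would be flagged too.
DDC_SHAPE = re.compile(r"\d{3}(\.\d+)?")

def suspicious_ddc(notations):
    """Return the notations that do not match the assumed DDC shape."""
    return [n for n in notations if not DDC_SHAPE.fullmatch(n)]

# Notations mentioned in this issue.
flagged = suspicious_ddc(["830", "B", "660/.2815"])
# flagged == ["B", "660/.2815"]
```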