gbv / cocoda

A web-based tool for creating mappings between knowledge organization systems.
https://coli-conc.gbv.de/cocoda/
MIT License

ixTheo uses all classifications / Important: Classes/Classification duplicates #295

Closed DennisTobola closed 5 years ago

DennisTobola commented 5 years ago

When I search for "Gott" in the ixTheo classification, the first result is: 11Q7561 (Amts)einführung, Weihe von Personen (in der Kirche), taken from ICONCLASS.

When I search for "Philosophie", the first result is: CD 1000-CD 1040 Allgemeine bzw. Weltgeschichte der Philosophie, taken from the RVK.

When I searched for "Philosophie" and looked at the results for just the word "Philosophie", I got 8 hits: 1 was from the BK and the other 7 were from the RVK, even though ixTheo has a class "V Philosophie".

DennisTobola commented 5 years ago

After some testing, I noticed that ICONCLASS has the same issue, although so far I have only found results from ICONCLASS and the RVK.

Search: "Soft"

  1. result: 41C73 andere nicht-alkoholische Getränke (Iconclass)
  2. result: ZY 2200-ZY 2209 Baseball, Softball (RVK)
  3. result: AP 88996 Elektronische (digitale) Verfahren, u.a. Softwarehandbücher (RVK)
  4. result: 41C732 Fruchtsaft (Iconclass)

When I searched for "soft" in ixTheo, the same results came up! "Flugzeug" shows the same problem. I conclude that some classifications not only share their classes, but are even exactly the same.

stefandesu commented 5 years ago

This is a bug in DANTE. We stumbled upon this already, but I thought it was fixed. Here's an example query:

https://api.dante.gbv.de/suggest?search=Philosophie&voc=http:%2F%2Fwww.ixtheo.de%2Fclassification%2F&limit=20&count=20&use=notation,label&language=de

The voc parameter doesn't seem to actually restrict the vocabulary. This happens for both /suggest and /search, so switching to the latter wouldn't fix our problem (although we could easily filter out all results that don't belong to the selected concept scheme).
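
The client-side workaround mentioned here could look roughly like the sketch below. It assumes each result is a JSKOS-style concept (a dict whose `inScheme` field lists the schemes it belongs to); the function name `filter_by_scheme`, the sample data, and the `example.org` URI are hypothetical, and the actual DANTE response shape may differ.

```python
def filter_by_scheme(results, scheme_uri):
    """Keep only results whose inScheme list contains scheme_uri."""
    kept = []
    for concept in results:
        # Treat a missing or null inScheme as "belongs to no scheme".
        schemes = concept.get("inScheme") or []
        if any(s.get("uri") == scheme_uri for s in schemes):
            kept.append(concept)
    return kept


# Vocabulary URI taken from the query above.
IXTHEO = "http://www.ixtheo.de/classification/"

# Hypothetical sample results, shaped like JSKOS concepts.
sample = [
    {"notation": ["V"], "inScheme": [{"uri": IXTHEO}]},
    {"notation": ["CD 1000-CD 1040"], "inScheme": [{"uri": "http://example.org/rvk"}]},
    {"notation": ["X"]},  # no inScheme at all
]

filtered = filter_by_scheme(sample, IXTHEO)
print([c["notation"][0] for c in filtered])  # prints ['V']
```

This only hides the symptom on the client; the server would still return results from other vocabularies.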

StiftungAusNachlass commented 5 years ago

It is fixed in the dev version: http://dev-api.dante.gbv.de/suggest?search=Philosophie&voc=http%3A%2F%2Fwww.ixtheo.de%2Fclassification%2F&limit=20&count=20&use=notation,label&language=de

You could also pass the BARTOC URI or any other vocabulary identifier: http://dev-api.dante.gbv.de/suggest?search=Philosophie&voc=http://bartoc.org/en/node/18797&limit=4&cache=0
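
The URLs in this thread percent-encode the voc value inconsistently (partially, fully, or not at all). A minimal sketch of building such a query with consistent encoding, using only Python's standard library; the helper name `build_suggest_url` is hypothetical, and the parameter names are simply copied from the URLs above:

```python
from urllib.parse import urlencode


def build_suggest_url(base, search, voc, **extra):
    """Build a query URL with every parameter value percent-encoded."""
    params = {"search": search, "voc": voc, **extra}
    # urlencode escapes ':' and '/' in values, so the voc URI stays unambiguous.
    return f"{base}?{urlencode(params)}"


url = build_suggest_url(
    "https://api.dante.gbv.de/suggest",
    search="Philosophie",
    voc="http://www.ixtheo.de/classification/",
    limit=20,
    language="de",
)
print(url)
```

Further parameters seen in this thread, such as `cache=0`, can be passed the same way as extra keyword arguments.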

Can you check your test cases in the dev version? If it is fine, I will update the live version.

stefandesu commented 5 years ago

Thanks @StiftungAusNachlass, works on my local machine, so it should be fine. 👍

stefandesu commented 5 years ago

@StiftungAusNachlass When will this be pushed to the production instance of DANTE?

StiftungAusNachlass commented 5 years ago

The fix is pushed now :-)

stefandesu commented 5 years ago

Thanks!

stefandesu commented 5 years ago

@StiftungAusNachlass The same issue persists when using /search. Could you apply the same fix there as well? Thanks!

StiftungAusNachlass commented 5 years ago

@stefandesu Can you give a non-working example for /search?

I tried http://api.dante.gbv.de/search?query=P*&voc=http:%2F%2Fwww.ixtheo.de%2Fclassification%2F&limit=20&use=notation,label&language=de&properties=-&cache=0 and that seems to work fine.

stefandesu commented 5 years ago

It seems that there were old results still in the cache.