hbz / lobid-resources

Transformation, web frontend, and API for the hbz catalog as LOD
http://lobid.org/resources
Eclipse Public License 2.0

Missing FRL inCollection entries #781

Closed ChristophEwertowski closed 5 years ago

ChristophEwertowski commented 6 years ago

FRL resources in lobid: 5270 (query)
FRL resources in publisso according to empty search: 7219

Resources with inCollection statement and a publisso label in hasVersion.label (MAB 655): 4045 (query)
Resources with inCollection statement but no publisso label in hasVersion.label: 1225 (query)
Resources without the right inCollection statement but with a publisso label in hasVersion.label: 42 (query)
Resources with the right inCollection statement and a publisso label in fulltextOnline.label: 5 (query)
Resource without inCollection publisso statement and with a publisso label in hasVersion: http://lobid.org/resources/HT019181830

Resources with inCollection statement and some hasVersion.id: 4136 (query)
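As a quick sanity check on the counts above, the two inCollection subsets (with and without a publisso label) should add up to the lobid total. The following is only a sketch: the variable names are illustrative, and the numbers are copied from this comment rather than fetched from lobid or publisso.

```python
# Counts quoted in this comment (taken from the linked queries, not re-fetched).
lobid_frl_total = 5270         # FRL resources in lobid
with_publisso_label = 4045     # inCollection + publisso label in hasVersion.label
without_publisso_label = 1225  # inCollection, no publisso label in hasVersion.label
publisso_total = 7219          # publisso resources according to the empty search

# The two inCollection subsets together account for every FRL resource in lobid.
assert with_publisso_label + without_publisso_label == lobid_frl_total

# Raw gap between the two systems, before any filtering on the publisso side.
raw_gap = publisso_total - lobid_frl_total
print(raw_gap)
```

The raw gap of 1949 only says that publisso holds more records than lobid marks as FRL; it does not say which records differ.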

@jschnasse : Can you hand me a list of hbzIds so that I can check what's in the missing resources (or what's missing from them)? Apparently there are too many resources without a publisso URL to switch quickly to a URL-only approach. I also think it would be good to update the publisso index. For example, there aren't any publisso resources with type Sonstige in lobid anymore.

jschnasse commented 6 years ago

Hi @ChristophEwertowski, I will send you the requested list via email. Just one thing up front: we have 2078 (query) resources (Articles) in publisso that aren't supposed to be listed in our catalogue (yet). To filter for the catalogued objects (5141), use this query

jschnasse commented 6 years ago

I just sent you the list. Are you able to filter out the 129 missing (in publisso) resources?
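The figures in this exchange appear to fit together as simple differences. The sketch below only redoes that arithmetic with the numbers quoted in the thread; reading 129 as the surplus of lobid FRL resources over catalogued publisso objects is an inference from the numbers, not something stated explicitly, and the variable names are illustrative.

```python
publisso_total = 7219   # all publisso resources (empty search)
not_catalogued = 2078   # articles not supposed to be in the catalogue (yet)
lobid_frl_total = 5270  # FRL resources in lobid

# Publisso objects that should also exist in the catalogue.
catalogued_in_publisso = publisso_total - not_catalogued

# Net surplus of lobid FRL resources over catalogued publisso objects.
surplus_in_lobid = lobid_frl_total - catalogued_in_publisso

print(catalogued_in_publisso, surplus_in_lobid)
```

Note that this is a net figure: resources missing on either side can partly cancel each other out, so the actual lists of missing IDs need a per-ID comparison.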

ChristophEwertowski commented 6 years ago

@jschnasse : I went through the diffs to identify the resources which are missing. 72 resources are missing in publisso, 8 resources are missing in lobid. If you contact the ZBMed you can ask them to add "38 M: ellinet" (or at least "ellinet") to MAB 078r1.a for the resources which are missing in lobid and for the special cases. Since it's an entry from the ZBMed (Sigel 38 M), in all other cases I won't edit the MAB field in Aleph. If not, eight missing resources aren't that bad.

lobid resources marked as publisso resources not in publisso

If I exclude Series, Periodicals and MultiVolumeWorks, the following resources are left:

| Group 1 | Group 2 | Group 3 | Group 4 |
| --- | --- | --- | --- |
| HT015363940 | HT017115512 | HT018125064 | HT019048098 |
| HT015363941 | HT017173295 | HT018180192 | HT019088067 |
| HT015363942 | HT017181038 | HT018241388 | HT019088094 |
| HT015364450 | HT017213042 | HT018308491 | HT019121390 |
| HT015372779 | HT017230512 | HT018312282 | HT019232174 |
| HT015372792 | HT017387060 | HT018312777 | HT019264026 |
| HT015372801 | HT017396590 | HT018312824 | HT019371904 |
| HT015372808 | HT017562097 | HT018312888 | HT019432386 |
| HT015372815 | HT017566966 | HT018312926 | HT019606376 |
| HT015372823 | | HT018318178 | HT019618947 |
| HT015411401 | | HT018318436 | HT019620285 |
| HT015414392 | | HT018318478 | HT019624121 |
| HT015433936 | | HT018507603 | HT019629200 |
| HT015434028 | | HT018507679 | |
| HT015434148 | | HT018552054 | |
| HT015434302 | | HT018552119 | |
| HT015436263 | | HT018552146 | |
| HT015438136 | | HT018552164 | |
| HT015438251 | | HT018552199 | |
| HT015439104 | | HT018555838 | |
| HT015456697 | | HT018703252 | |
| | | HT018712107 | |
| | | HT018723760 | |
| | | HT018855045 | |
| | | HT018856771 | |
| | | HT018932267 | |
| | | HT018995209 | |
| | | HT018995226 | |

Publisso resources missing in lobid

HT016324241, HT016786670, ~HT018242432~, HT018708539, HT019046526, HT019299296, ~HT019585233~

They all have a publisso URL in MAB 655e1.u but no entry "ellinet" in MAB 078r1.a.

Special cases

HT018118640: Not in publisso, not in lobid, although it has a publisso URL in MAB 655e1.u.
HT019181830: Doesn't seem to be a publisso subject; it looks like an information science resource.
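The comparison behind the lists in this comment boils down to two set differences over hbzIds. The following is a minimal sketch, not the procedure actually used here: `compare_ids` is a hypothetical helper, and `HT000000001` is a made-up placeholder standing in for a resource present in both systems. The other two IDs are taken from the lists above.

```python
def compare_ids(lobid_ids, publisso_ids):
    """Return (only in lobid, only in publisso), each as a sorted list."""
    lobid, publisso = set(lobid_ids), set(publisso_ids)
    return sorted(lobid - publisso), sorted(publisso - lobid)


# HT000000001 is a hypothetical placeholder for a resource found in both systems.
lobid_ids = ["HT015363940", "HT000000001"]
publisso_ids = ["HT016324241", "HT000000001"]

missing_in_publisso, missing_in_lobid = compare_ids(lobid_ids, publisso_ids)
print(missing_in_publisso)
print(missing_in_lobid)
```

Using sets keeps the comparison independent of list order and of duplicate entries in either export.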

jschnasse commented 6 years ago

The following HTs are still open for further investigation. For all others correct DOI links were established.

  1. ~HT019048098~
  2. ~HT019088067~
  3. HT019088094
  4. HT019121390
  5. ~HT019264026~
  6. ~HT019371904~
  7. HT019432386
  8. ~HT019606376~
  9. HT019620285
  10. HT019624121
  11. HT019629200
  12. ~HT018125064~
  13. HT018507603
  14. HT018507679
  15. HT018703252
  16. HT018712107
  17. HT018723760
  18. HT018855045
  19. HT018856771
  20. ~HT018932267~
  21. ~HT018995209~
  22. ~HT018995226~
  23. HT017115512
  24. ~HT017181038~
  25. ~HT017213042~
  26. HT017230512
  27. ~HT017387060~
  28. ~HT017396590~
  29. ~HT017562097~
  30. ~HT017566966~
  31. HT015364450
  32. HT015456697
  33. ~HT019181830~
  34. ~HT018118640~

acka47 commented 5 years ago

@jschnasse, is this still an issue? I went through the list from your previous comment and struck through all resources that either do not exist anymore or are, according to lobid-resources, not included (with inCollection) in publisso.

jschnasse commented 5 years ago

You can close this. This stuff is part of the annual inventory, which we haven't done yet this year. There will be a new list then.

acka47 commented 5 years ago

Closing.