JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

RDF export uses bad URI for namespace 'bibo' #8920

Open jalsti opened 2 years ago

jalsti commented 2 years ago

JabRef version

5.6 (latest release)

Operating system

GNU / Linux

Details on version and operating system

No response

Checked with the latest development build

Steps to reproduce the behaviour

  1. Export the bib list as RDF.
  2. Check the RDF for validity against its namespaces.
  3. The namespace for 'bibo' gives a non-existent URI: "http://purl.org/ontology/biblio/".
  4. I thought "http://purl.org/ontology/bibo/" might be the correct one, but it is not: e.g. Contribution or position are not defined there.
  5. So the fix might take more than just putting in a new URI.
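
The first three steps can be approximated offline. The following is a minimal sketch, assuming the export is available as a string and using only Python's standard library; the helper name `declared_namespaces` is hypothetical, and the sample is a shortened copy of the export head. It only lists which URI each prefix is bound to; it does not check whether the URI resolves.

```python
import io
import xml.etree.ElementTree as ET


def declared_namespaces(rdf_xml):
    """Return a prefix -> URI mapping for every xmlns declaration in the document."""
    namespaces = {}
    source = io.BytesIO(rdf_xml.encode("utf-8"))
    # "start-ns" events yield (prefix, uri) tuples as declarations are encountered.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        namespaces[prefix] = uri
    return namespaces


# Shortened copy of the head of the exported file (assumption: same declarations).
EXPORT_HEAD = """<?xml version="1.0"?>
<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:bibo="http://purl.org/ontology/biblio/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
</rdf:RDF>
"""

namespaces = declared_namespaces(EXPORT_HEAD)
print(namespaces["bibo"])  # the URI the export binds to the 'bibo' prefix
```

Whether a listed URI actually resolves still has to be checked separately, for example by opening it in a browser.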

Appendix

head of export file

…generated by src/main/resources/resource/layout/bibordf.begin.layout

<?xml version="1.0"?>
<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:bibo="http://purl.org/ontology/biblio/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
>

Siedlerchr commented 2 years ago

Thanks for the hint. The good news is that it's an easy fix, because the bibordf is based on the Custom Export format: https://github.com/JabRef/jabref/tree/main/src/main/resources/resource/layout

enrique9318 commented 2 years ago

Hi! can I work on this issue?

ThiloteE commented 2 years ago

sure :-)

As general advice: check out https://github.com/JabRef/jabref/blob/main/CONTRIBUTING.md for a start. Also, https://devdocs.jabref.org/getting-into-the-code/guidelines-for-setting-up-a-local-workspace is worth having a look at. Feel free to ask if you have any questions here on GitHub or at JabRef's Gitter chat.

Try to open a (draft) pull request early on, so that people can see you are working on the issue and so that they can see the direction the pull request is heading towards. This way, you will likely receive valuable feedback.

brauliorivas commented 1 year ago

Hi, I would like to work on this issue.

ThiloteE commented 1 year ago

Hello, I'm trying to solve issue #8920. I don't understand how to reproduce the behaviour. I mean, I export a bib list as RDF, but I don't know how namespaces are checked, nor how bibo gives a non-existent URI. I would appreciate some guidance.

@brauliorivas http://purl.org/ontology/biblio/ is a non-existent URI. We need a new one.

I found some alternatives by conducting a search for bibo (https://purl.archive.org/domain_search?q=bibo)

Results:

Found other alternatives to biblio (https://purl.archive.org/domain_search?q=biblio):

If you go to https://github.com/JabRef/jabref/blob/bb011c9313367a28990ae213b3920fe6cd10d1dc/src/main/resources/resource/layout/bibordf.begin.layout you can remove the old one and add a new proper URI

I personally am not sure how to judge which one is the right one though, so maybe @jalsti could teach us a little bit more how namespaces are checked. Could you give us an example for a bib list you tried to export?

brauliorivas commented 1 year ago

Thank you so much. I checked the links. However, there's something I still don't understand:

e.g. Contribution or position are not defined there

What is @jalsti referring to?

Edit: Ohhh, I see where Contribution and position are.

jalsti commented 1 year ago

RDF can be checked via W3C RDF validator: https://www.w3.org/RDF/Validator/

If I export the following .bib file:

@WWW{PrudhommeWWW,
  author    = {Eric Prud'hommeaux},
  title     = {W3C RDF Validation Service},
  date      = {2006-02-28},
  url       = {https://www.w3.org/RDF/Validator/},
  owner     = {jalsti},
  timestamp = {2023-03-07 17:33},
}

@Comment{jabref-meta: databaseType:biblatex;}

as "BibO RDF", it results in something like:

<?xml version="1.0"?>
<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:bibo="http://purl.org/ontology/biblio/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
>
<bibo:Document rdf:about="">
    <dc:title>W3C RDF Validation Service</dc:title>
    <dc:date>2006</dc:date>
   <bibo:contribution>
  <bibo:Contribution>
    <bibo:role rdf:resource="http://purl.org/ontology/bibo/roles/author" />
    <bibo:contributor><foaf:Person foaf:name="Eric Prud&#39;hommeaux"/></bibo:contributor>
    <bibo:position>1</bibo:position>
  </bibo:Contribution>
</bibo:contribution>
</bibo:Document>
</rdf:RDF>

If you paste that into the validation service, every resulting URL in the predicate column should resolve (and point to a descriptive page where the predicate is documented).

As JabRef uses the wrong link (and I cannot guess by myself what the original correct one would be), some predicates (http://purl.org/ontology/biblio/…) do not resolve, which is wrong. The original namespace should be found again, or a new one capable of reflecting the .bib file standard should be used instead.
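
To see which terms are affected without pasting the file into the validator, the full URIs can be listed locally. This is a rough sketch using Python's standard library; the function name `term_uris` is hypothetical, the sample is a trimmed version of the export above, and the code performs no network access, so it only shows which URIs would have to resolve.

```python
import xml.etree.ElementTree as ET


def term_uris(rdf_xml):
    """Collect the full URI of every element (class or property) used in an RDF/XML document."""
    root = ET.fromstring(rdf_xml)
    uris = set()
    for element in root.iter():
        # ElementTree stores qualified names as "{namespace}local".
        if element.tag.startswith("{"):
            namespace, local = element.tag[1:].split("}", 1)
            uris.add(namespace + local)
    return uris


# Trimmed version of the exported RDF shown above.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:bibo="http://purl.org/ontology/biblio/"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
<bibo:Document rdf:about="">
    <dc:title>W3C RDF Validation Service</dc:title>
    <bibo:contribution>
      <bibo:Contribution>
        <bibo:position>1</bibo:position>
      </bibo:Contribution>
    </bibo:contribution>
</bibo:Document>
</rdf:RDF>
"""

BAD_NS = "http://purl.org/ontology/biblio/"
# Every term under the bad namespace is a URI that cannot resolve.
affected = sorted(uri for uri in term_uris(SAMPLE) if uri.startswith(BAD_NS))
for uri in affected:
    print(uri)
```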

Hope that helps :-)

brauliorivas commented 1 year ago

Great, very useful. I've changed the namespace to the one you suggested

http://purl.org/ontology/bibo/

and using https://www.w3.org/RDF/Validator/ every predicate can be resolved. I also tried the other URIs suggested by @ThiloteE and none of them worked except for the one already mentioned. Also, in the RDF file, Contribution and position appear to be defined. For example:

<?xml version="1.0"?>
<rdf:RDF
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
      xmlns:bibo="http://purl.org/ontology/bibo/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:dcterms="http://purl.org/dc/terms/"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
>

<bibo:Document rdf:about="">

    <dc:title>W3C RDF Validation Service</dc:title>
    <dc:date>2006</dc:date>

   <bibo:contribution>
  <bibo:Contribution> --> Here
    <bibo:role rdf:resource="http://purl.org/ontology/bibo/roles/author" />
    <bibo:contributor><foaf:Person foaf:name="Eric Prud&#39;hommeaux"/></bibo:contributor>
    <bibo:position>1</bibo:position> --> Here
  </bibo:Contribution>
</bibo:contribution>

</bibo:Document>
</rdf:RDF>

Correct me if I'm wrong, but I think that both are defined when using the correct namespace. I'm trying to understand the problem, but I think that it should work by changing the URI. So is that the correct output?

jalsti commented 1 year ago

As this should be OK for this small .bib file, I think, probably every possible standard field Bibtex provides (https://www.bibtex.com/format/ ?) for the different .bib entry types should be checked to be in the new bibo ontology you now linked, to be sure each possible export is valid.
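
Such a check could be scripted roughly as follows. This is only a sketch: it assumes the ontology is available as RDF/XML in which each term is declared through an `rdf:about` attribute, and the namespace `http://example.org/onto/`, the toy ontology and the list of exported terms are all made up for illustration (they are not the real bibo contents). The function names are hypothetical as well.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"


def defined_terms(ontology_xml, namespace):
    """Return the local names of all terms declared under the given namespace."""
    root = ET.fromstring(ontology_xml)
    return {
        element.get(RDF_ABOUT)[len(namespace):]
        for element in root.iter()
        if element.get(RDF_ABOUT, "").startswith(namespace)
    }


def missing_terms(used, defined):
    """Terms the export emits that the ontology does not declare."""
    return used - defined


# Toy data for illustration only; not taken from any real ontology.
TOY_NS = "http://example.org/onto/"
TOY_ONTOLOGY = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto/Document"/>
  <owl:DatatypeProperty rdf:about="http://example.org/onto/volume"/>
</rdf:RDF>
"""

used_by_export = {"Document", "volume", "position"}
missing = missing_terms(used_by_export, defined_terms(TOY_ONTOLOGY, TOY_NS))
print(sorted(missing))
```

For a real check, `used_by_export` would have to be collected from exports covering every entry type and field, and the ontology file would have to be the one the namespace URI points to.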

brauliorivas commented 1 year ago

As this should be OK for this small .bib file, I think, probably every possible standard field Bibtex provides (https://www.bibtex.com/format/ ?) for the different .bib entry types should be checked to be in the new bibo ontology you now linked, to be sure each possible export is valid.

Right! So I used https://www.bibtex.com/format/ as suggested and compared it with http://purl.org/ontology/bibo/ to check the common fields. Most of them are defined in the new bibo ontology, but

So, the new bibo ontology doesn't meet the requirements. Do you think I should look for other alternatives? @ThiloteE what do you think?

koppor commented 1 year ago

Please step back and investigate the background of this issue. - The "Bibliographic Ontology" (https://en.wikipedia.org/wiki/Bibliographic_Ontology) is not maintained any more. @jalsti Please describe your use case in more detail. Why do you need this export?

@jalsti described a "bug" in the software, but did not describe the intended use case.

My input: This export should just be dropped.

A new exporter exporting Dublin Core should be created.

JabRef already implements a Dublin Core Export. The architecture is not that good. It has to be completely refactored.

/src/main/java/org/jabref/logic/xmp/DublinCoreExtractor.java#L422

    public void fillDublinCoreSchema() {

Steps:

  1. Add a new exporter following the usual Exporter flow. Starting point: /src/main/java/org/jabref/logic/exporter/ModsExporter.java#L58.
    1. Create the exporter
    2. Create a test class for the exporter
    3. Do test-driven development
  2. Check if the exporter is reachable in the GUI using "File > Export"

Side notes:

jalsti commented 1 year ago

I just stumbled upon this bug: I took a course that included RDF, so I tried the JabRef RDF export and saw that it does not validate. As this format is widely present in the Semantic Web, which is a growing thing, it might be useful for others to keep it in, but with a different namespace/ontology, rewriting the export where needed. For example, Zotero also uses xmlns:bibo="http://purl.org/ontology/bibo/", and an export results in the following RDF (taking a minimal example):

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:res="http://purl.org/vocab/resourcelist/schema#"
 xmlns:z="http://www.zotero.org/namespaces/export#"
 xmlns:dcterms="http://purl.org/dc/terms/"
 xmlns:bibo="http://purl.org/ontology/bibo/"
 xmlns:foaf="http://xmlns.com/foaf/0.1/">
    <z:UserItem rdf:about="http://zotero.org/users/local/X0VW24Fd/items/3RCF7S8Z">
        <res:resource rdf:resource="https://ist.inrae.fr/wp-content/uploads/sites/21/2022/01/OpenClass_Decouvrir_JabRef_2022.pdf"/>
    </z:UserItem>
    <bibo:Manuscript rdf:about="https://ist.inrae.fr/wp-content/uploads/sites/21/2022/01/OpenClass_Decouvrir_JabRef_2022.pdf">
        <dcterms:title>OpenClass_Decouvrir_JabRef_2022.pdf</dcterms:title>
        <dcterms:language>fr</dcterms:language>
        <bibo:uri>https://ist.inrae.fr/wp-content/uploads/sites/21/2022/01/OpenClass_Decouvrir_JabRef_2022.pdf</bibo:uri>
        <dcterms:creator rdf:nodeID="n9"/>
        <bibo:authorList>
           <rdf:Seq><rdf:li rdf:nodeID="n9"/></rdf:Seq>
        </bibo:authorList>
    </bibo:Manuscript>
   <foaf:Person rdf:nodeID="n9"><foaf:surname>INRAE</foaf:surname></foaf:Person>
</rdf:RDF>

koppor commented 1 year ago

Thank you for providing an example!

dcterms is an indicator of Dublin Core. According to Wikipedia, this is also RDF. I lean towards investigating Dublin Core and checking whether it matches your use case.

In case you need the unmaintained Bibo RDF, we of course welcome a pull request updating the code.

ackernkamp commented 7 months ago

Hi there! I wish to work on this issue, could I please be assigned to it? Thank you very much!

ThiloteE commented 7 months ago

Sure, go ahead :-) 👍

ackernkamp commented 6 months ago

After looking through both these discussions and the codebase a while longer, I do not believe this is quite as easy of a fix as is implied by earlier comments or the 'good first issue' label that was placed.

In the following text, it is implied a previously existing export should be updated and refactored to fit the new exporter scheme, and then tests can be added to fully implement this pre-existing code. However, after some more digging into both this file as well as the ModsExporter.java file provided as an example/starting point, I believe there is much more refactoring and rewriting of the original Dublin Core Extractor code to be done to properly implement this into an Exporter.

A new exporter exporting Dublin Core should be created. JabRef already implements a Dublin Core Export. The architecture is not that good. It has to be completely refactored.

/src/main/java/org/jabref/logic/xmp/DublinCoreExtractor.java#L422

    public void fillDublinCoreSchema() {

Steps:

  1. Add a new exporter following the usual Exporter flow. Starting point: /src/main/java/org/jabref/logic/exporter/ModsExporter.java#L58.
  2. Create the exporter
  3. Create a test class for the exporter
  4. Do test-driven development
  5. Check if the exporter is reachable in the GUI using "File > Export"

Rather than leave this issue without having contributed in any meaningful way, I would like to suggest the following questions be answered to make resolving this issue easier to understand and carry out for the next person who would like to have a look at it:

  1. What parts of DublinCoreExtractor.java need to be refactored, as opposed to an entirely new class (i.e. DublinCoreExporter.java) being created? What elements from this previous Dublin Core implementation are still usable in this exporter?
  2. Should this exporter still use the DublinCoreSchema class provided by Apache, or should it follow a style similar to ModsExporter.java, where a StreamWriter assembles the export as entries and fields are iterated through?
  3. The "usual Exporter flow" mentioned in step 1 was not something I could personally find in documentation. This may have been a problem on my side of course, but if there really IS some place to read about this standard, it would be good to have it explicitly linked so that it can be used during development/fixing of the exporter.

And again, as mentioned earlier in this message, I believe it might be good to reconsider the 'good first issue' label, since the issue appears to include significant refactoring and new implementation work.

koppor commented 6 months ago

@ackernkamp Thank you for reaching out. I unassigned you, removed it from good first issues and made it a medium-sized university project. - If I have time I will try to dig deeper. I assume you don't want to continue here?