JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License
3.66k stars 2.59k forks

Create an importer for Citavi #8322

Closed Siedlerchr closed 2 years ago

Siedlerchr commented 2 years ago

Is your suggestion for improvement related to a problem? Please describe. As a user, I want to migrate from Citavi to JabRef and import my data.

Describe the solution you'd like: An importer for Citavi.

Additional context: Zotero has an importer (https://www.zotero.org/support/kb/import-from-citavi), and it seems that Citavi uses an XML format that could be used for the import.

lishangyu9 commented 2 years ago

Hello, we are students at Adelaide University taking the course 'Software Process Improvement'. This course focuses on contributing to an open-source project. We are interested in contributing to this issue. Can we work on this one? If so, we would appreciate any advice you can give us. Thank you.

Siedlerchr commented 2 years ago

Thanks for your interest! @lishangyu9

The first steps would be to install Citavi, add some references, and export the project. Follow the Zotero link to see how to get the XML file.

  1. Analyze the XML file and try to understand the field names.

Context: With Jakarta XML Binding (JAXB) it is possible to generate Java classes from XML schemas (XSD files) and then read XML documents into instances of those classes (called "unmarshalling"). If you only have an XML file, there are tools that try to generate an XSD schema from the XML. In JabRef we have a Gradle task, generateSource, that invokes the generation of the Java classes.

I already prepared the basics for you: https://github.com/JabRef/jabref/tree/citavixml So you can check out that branch. You need to execute ./gradlew generateSource to generate the Citavi Java classes under src-gen.

  2. Create an importer for the Citavi XML files (see e.g. the Endnote or MODS importers), or even better for the whole archive, which, according to the Zotero people, is just a Zip file, so you should be able to read the Zip archives (see e.g. https://github.com/JabRef/jabref/blob/75df9a3e3dacde8c986a44777a75596c0f9e84da/src/test/java/org/jabref/logic/exporter/OpenOfficeDocumentCreatorTest.java#L98-L100).
  3. In the importer, create a mapping between the fields from Citavi and BibTeX. The biblatex manual provides an excellent guide to the fields.
  4. Add tests. This importer provides a great opportunity for unit and integration tests.
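
Reading the XML out of the archive, as suggested in the importer step above, might look roughly like the following. This is a minimal sketch using only `java.util.zip` from the JDK, not JabRef's actual importer code. The class name `CitaviArchiveSketch`, the entry name `project.xml`, and the sample XML content are assumptions made for illustration; the real archive layout should be checked against an actual Citavi export.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Optional;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class CitaviArchiveSketch {

    /** Returns the bytes of the first archive entry whose name ends with ".xml", if any. */
    public static Optional<byte[]> readFirstXmlEntry(InputStream archive) {
        try (ZipInputStream zip = new ZipInputStream(archive)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                boolean isXml = entry.getName().toLowerCase(Locale.ROOT).endsWith(".xml");
                if (!entry.isDirectory() && isXml) {
                    // readAllBytes() stops at the end of the current entry
                    return Optional.of(zip.readAllBytes());
                }
            }
            return Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a tiny in-memory Zip file so the sketch runs without a real Citavi export. */
    public static byte[] buildSampleArchive() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            out.putNextEntry(new ZipEntry("readme.txt"));
            out.write("not XML".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
            // Hypothetical entry name and content, for illustration only
            out.putNextEntry(new ZipEntry("project.xml"));
            out.write("<Reference><Title>Sample</Title></Reference>".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        Optional<byte[]> xml = readFirstXmlEntry(new ByteArrayInputStream(buildSampleArchive()));
        System.out.println(xml.map(bytes -> new String(bytes, StandardCharsets.UTF_8))
                              .orElse("no XML entry found"));
    }
}
```

In a real importer the returned bytes would be handed to the XML unmarshalling step instead of being printed.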

lefeimei commented 2 years ago

Hi @Siedlerchr, I'm a member of @lishangyu9's team. We are still working on this issue. So far, we have unzipped the Zip file of a Citavi project, extracted the XML file, and generated an XSD file from it.

Now we have a question: we have the XML file of the demo project shipped with Citavi and the XML file of a project I created myself. The contents of the two XML files differ, so the XSD files generated from them differ as well. The XSD file that you provided is also different from the one generated from the demo project's XML file. Different XSD files will in turn generate different methods in the generated classes.

So, which XSD file should we use to generate the classes? Should we use the XSD file that contains the most complete set of fields?

Thank you!

Siedlerchr commented 2 years ago

Hi,

thanks for the feedback! I would suggest you proceed with the version that has the most complete set of fields. Can you see what the differences are? Or send me the examples.

lefeimei commented 2 years ago

Hi @Siedlerchr, thanks for your reply. For example, the XSD file of the demo project has BibliographyGroupingSets, categories, group fields, etc., but the XSD file of the project I created myself does not have these fields. I think the likely reason is that I did not set any of these when I added references to my project. What do you think about this? Thank you so much!
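
A difference like the one described above can also be checked mechanically by comparing the element names that occur in the two XML files. The following is a minimal sketch using the JDK's DOM parser. The class name `ElementNameDiffSketch`, the `Root` element, and the two inline sample documents are hypothetical stand-ins; real Citavi exports are much larger and may be structured differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class ElementNameDiffSketch {

    /** Collects the distinct element names that occur anywhere in the given XML document. */
    public static Set<String> elementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // "*" matches every element in the document, including the root
            NodeList all = doc.getElementsByTagName("*");
            Set<String> names = new TreeSet<>();
            for (int i = 0; i < all.getLength(); i++) {
                names.add(all.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns the element names that occur in the first document but not in the second. */
    public static Set<String> onlyInFirst(String firstXml, String secondXml) {
        Set<String> diff = new TreeSet<>(elementNames(firstXml));
        diff.removeAll(elementNames(secondXml));
        return diff;
    }

    public static void main(String[] args) {
        // Hypothetical, heavily shortened stand-ins for the demo project and a self-made project
        String demo = "<Root><References/><BibliographyGroupingSets/><Categories/></Root>";
        String own = "<Root><References/></Root>";
        System.out.println(onlyInFirst(demo, own)); // prints [BibliographyGroupingSets, Categories]
    }
}
```

Elements that show up only in the richer file are the ones a schema inferred from the smaller file would be missing.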

Siedlerchr commented 2 years ago

@lefeimei Ah okay, then I would go with the one that has the most data. The more fields it has, the better.

lefeimei commented 2 years ago

Thank you! We will go with the one that has the most data.

lefeimei commented 2 years ago

@Siedlerchr Hi, we are working on the logic for mapping from Citavi to JabRef. We found that we can't get the authors' names for the references from the XML file we exported from Citavi. In the XML file there is a field called "ReferenceAuthors" (it is not part of the "References" field). "ReferenceAuthors" contains no author names, only some fields called "OnetoN".

[Screenshot 2022-05-18 16 48 20]

Some other fields, such as "ReferenceKeywords", "ReferenceEditors", etc., have the same issue.

Siedlerchr commented 2 years ago

@lefeimei Thanks for the update! I took a look at the code of the Zotero plugin for Citavi, and it seems you need to take the second ID in the line and look up the value in the Persons element:

https://github.com/zotero/translators/blob/1d1ce6111ac17a4064c1d8836f6beec54404c32b/Citavi%205%20XML.js#L424-L438
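
The lookup described above could be sketched as follows. This assumes that each OnetoN value is a semicolon-separated line whose first ID identifies the reference and whose remaining IDs identify persons; that layout, the class name `OnetoNSketch`, and the placeholder IDs and names are assumptions for illustration and should be verified against a real export and the linked Zotero code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OnetoNSketch {

    /**
     * Resolves the person names referenced by one OnetoN line.
     * Assumed layout: "referenceId;personId[;personId...]".
     */
    public static List<String> resolveAuthors(String onetoN, Map<String, String> personsById) {
        String[] ids = onetoN.trim().split(";");
        List<String> names = new ArrayList<>();
        // Index 0 is assumed to be the reference ID; person IDs start at index 1
        for (int i = 1; i < ids.length; i++) {
            String name = personsById.get(ids[i].trim());
            if (name != null) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for the content of the Persons element
        Map<String, String> persons = new LinkedHashMap<>();
        persons.put("p-1", "Knuth, Donald E.");
        persons.put("p-2", "Lamport, Leslie");

        List<String> authors = resolveAuthors("ref-1;p-1;p-2", persons);
        // BibTeX separates multiple authors with " and "
        System.out.println(String.join(" and ", authors));
    }
}
```

The same lookup would apply to "ReferenceEditors", with a different ID-to-value map for "ReferenceKeywords".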

lefeimei commented 2 years ago

@Siedlerchr Oh OK! I totally understand! Thank you so much!

lefeimei commented 2 years ago

Hi @Siedlerchr, we are almost ready to open a pull request, but we found that the citavi branch no longer exists. We forked the JabRef repo and worked on the citavi branch, so could you please create a new branch for citavi that we can target with our pull request? Or do you think there is a better way to do this? Thank you so much!

Siedlerchr commented 2 years ago

@lefeimei Great to hear! You should be able to create a pull request and then choose the branch from your fork. That should work.

lefeimei commented 2 years ago

Thanks

lishangyu9 commented 2 years ago

Hi @Siedlerchr, we have created the pull request. If there is any problem, please let us know. Thank you for your help.

Siedlerchr commented 2 years ago

Thanks! We will start reviewing the PR asap.