pkp / pkp-lib

The library used by PKP's applications OJS, OMP and OPS, open source software for scholarly publishing.
https://pkp.sfu.ca
GNU General Public License v3.0

Add basic support for JATS XML files to publications #7505

Closed NateWr closed 9 months ago

NateWr commented 2 years ago

Describe the problem you would like to solve

Many journals create JATS XML files that they want to distribute with their published articles. These files may be created through automatic conversion tools, manual editors, or third-party typesetting services. To distribute these files, most journals use OJS's galley system. They create a separate XML galley and upload the appropriate file(s).

This creates several challenges when it comes to distributing the XML file or generating full-text galleys from it. There is no special XML galley type, so when distributing XML via OAI or to other downstream services, plugins must try to identify the correct XML file to use. If a journal has uploaded two XML files for a submission, this might cause bugs.

Describe the solution you'd like

Extend the publication object to support attaching a single JATS XML file to each version of an article. By having a single place for a JATS XML file, the application will make it easier to generate full-text galleys, sync metadata from OJS, and distribute JATS XML to third-party services.
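To make the difference concrete, here is a minimal sketch in Python. It is not PKP code: the `Galley` and `Publication` classes, the `jats_file` field, and both helper functions are hypothetical names invented for illustration. It contrasts guessing the JATS file from the galleys with reading it from one dedicated slot on the publication.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Galley:
    """Hypothetical stand-in for a galley: a label plus an uploaded file."""
    label: str
    filename: str


@dataclass
class Publication:
    """Hypothetical stand-in for one version of an article."""
    galleys: list[Galley] = field(default_factory=list)
    # Assumed dedicated slot: at most one JATS XML file per version.
    jats_file: Optional[str] = None


def guess_jats_from_galleys(pub: Publication) -> list[str]:
    """Heuristic a plugin might fall back on: every galley file ending in .xml.

    If a journal uploaded two XML galleys, this returns two candidates and
    the caller has no reliable way to choose between them.
    """
    return [g.filename for g in pub.galleys if g.filename.lower().endswith(".xml")]


def jats_for_distribution(pub: Publication) -> Optional[str]:
    """With a dedicated slot there is nothing to guess."""
    return pub.jats_file
```

With two XML galleys on the same publication, `guess_jats_from_galleys` returns both filenames, while `jats_for_distribution` returns the single attached file or `None`.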

Who is asking for this feature?

This is a requirement for future work to support the production of JATS XML in OJS. It will provide a single place in the UI to integrate the functionality of existing JATS plugins. Future work, such as integration of the Libero editor and generation of full-text HTML, will also benefit from this.

Additional information

The following mockups show how OJS could integrate a special place in the UI for working with JATS. A new tab is added to the publication forms where the user can upload a JATS XML file.

XML - No File

When a file is uploaded, it could be downloaded, deleted or replaced.

XML - Has File

In the future, additional tools to convert files to JATS or generate JATS from OJS metadata could be integrated with this UI.

XML - Create XML


PRs
OJS: pkp/ojs#4109
OMP: pkp/omp#1490
OPS: pkp/ops/pull/604
PKP-LIB: #9536
UI-LIBRARY: pkp/ui-library#300
JatsTemplate: pkp/jatsTemplate#42


Tests Fixes
OJS: pkp/ojs#4125 (TESTS ONLY)
UI-LIBRARY: pkp/ui-library#306 (MERGED)

PKP-LIB: #9581 (MERGED)
OJS: pkp/ojs#4127 (TESTS ONLY)

PKP-LIB: #9583 (MERGED)
OJS: pkp/ojs#4128 (MERGED)
OMP: pkp/omp#1494 (CLOSED)
OPS: pkp/ops#610 (CLOSED)


asifdev124 commented 2 years ago

When can we expect to see the XML file generation feature? @NateWr

asmecher commented 2 years ago

@asifdev124, the generation of JATS XML is out of scope for this issue. I'd suggest looking for more information in our support forum, e.g.: https://forum.pkp.sfu.ca/t/who-is-who-in-jats-2019/57063

johanneswilm commented 2 years ago

Is there a particular sub type of JATS that you are looking to support? I am asking so I can see if we can support that type in the Fidus Writer exporter.

NateWr commented 2 years ago

Hi @johanneswilm! Our goal in this particular issue is to work with any kind of JATS. In other words, a journal should be able to upload any valid JATS XML file here. OJS can then distribute that JATS file wherever appropriate (e.g., OAI endpoints, Crossref deposits, etc.). So the constraints on the type of JATS will depend on how the journal wants to use it.
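As a rough illustration of how permissive "any kind of JATS" can be, here is a minimal Python sketch of an upload check. It is an assumption for illustration, not what OJS does: `looks_like_jats` is a hypothetical helper that only confirms the file is well-formed XML whose root element is `article`, the JATS root element. It does not validate against any JATS DTD or schema, so it deliberately accepts every JATS flavour.

```python
import xml.etree.ElementTree as ET


def looks_like_jats(xml_text: str) -> bool:
    """Return True if xml_text is well-formed XML with an <article> root.

    This is a shallow check only: no DTD or schema validation is performed,
    so it says nothing about which JATS flavour or version is in use.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML at all.
        return False
    # Drop a namespace prefix of the form "{uri}name" if one is present.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name == "article"
```

A stricter deployment could layer DTD or schema validation on top of this, at the cost of rejecting files from tools that emit a different JATS flavour.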

At a later date, we plan to do things like integrate support for XML editors, publish from XML to full-text, etc. In such cases, it will matter a lot what kind of JATS XML we are working with. We haven't yet settled on a standard. And our goal with the approach defined in this issue is to ensure that our journals are not forced to use one editor or another.

However, it's still our hope that Libero Editor will mature into a web-based editor that we can integrate closely with. If that turns out to be the case, we'll build other capacities (like generating full-text from XML) around whatever JATS syntax that editor outputs.

If you want to pursue this approach, I think the best thing to do would be to try to ensure that Fidus Writer and Libero Editor could convert back and forth between them. Or that Fidus Writer could output a format that Libero Editor can open.

That said, we are a long way from having a production-ready version of Libero Editor. I'd be delighted if journals could pick up Fidus Writer and generate XML that they can upload to OJS. Once the feature described in this issue is in place, it might be a good idea to explore what a plugin could do to improve such an integration, such as rendering to full-text HTML from an XML file generated by Fidus Writer. I'd be happy to discuss further with you what that might look like / require on our end. Our community is hungry for tooling that they can pick up and work with right away.

marcbria commented 2 years ago

Maybe I'm wrong, but I think it's one of those situations where the technical perspective makes us forget about functionality.

I think that for editors it should not be important what markup language is used internally (JATS, TEI, LaTeX...) as most of them are not going to touch the sources directly.

I think that was the main mistake of Marcalyc (from Redalyc) with a complex editor for markup and (if I remember well) also of SciELO with their macros for Word.

I don't see adding a new tab in Publication as an improvement for publishers. In fact, I think it will increase their anxiety about a problem for which we don't yet have a solution.

It's like telling them "hey, we're adding a new step in the workflow to work with JATS" but we're not giving them the tools to do it.

I like Alec's approach in #6825 best, making this optional (and disabled by default), although I can't tell whether it is better to keep it in Publication or do as Alec proposes.

In my view, the blocker for a JATS workflow in OJS is the lack of a reliable JATS editor. In this sense, although Texture is a dead end, I think it serves as a prototype of what a JATS editor should be like, because:

The editor is an essential piece in this process because:

NateWr commented 2 years ago

making this optional (and disabled by default)

We won't show the JATS XML in a journal that hasn't enabled it.

In my view, the blocker for a JATS workflow in OJS is the lack of a reliable JATS editor.

That's correct for a fully integrated JATS workflow. However, one of the goals of this issue is to support journals that already use external JATS workflows. This provides a way for them to bring the XML back into OJS, and gives us an anchor around which we can build tools for using that XML in OJS (distributing it in OAI, showing full-text HTML, generating PDFs, etc).

Journals are already doing this, but are forced to try to integrate with the Production Files and Galleys in ways that those data models never intended. An editor, like Texture, can still be integrated with this UI, giving us an opportunity to better support both use cases.

I think that for editors it should not be important what markup language is used internally (JATS, TEI, latex...) as most of them are not going to touch the sources directly.

This is true for what we are calling a document-centric workflow where the document exists in a source language and all activity (review, copyediting, etc) is performed on the document. That's not what we're proposing here and OJS is still many years away from something like that.

What's proposed here is a more limited feature for bringing JATS into the distribution process. Many journals are actively seeking to distribute JATS and it is treated here not as a source file but as a distribution file. For that reason, it's important to name it what it is, rather than hide that from the editor.

nils-stefan-weiher commented 1 year ago

Dear @NateWr ,

how does this relate to publication formats and galleys?

It looks good, but it seems completely parallel to the content of file genres, galleys and publication formats (for OMP).

I feel the question is: Why add another place to upload a file?

EDIT: More questions: How is the workflow for the stages incorporated? Because the publication is at the end of the workflow, does this mean the uploaded JATS XML is the production ready XML?

NateWr commented 1 year ago

As far as I'm aware, JATS isn't related to OMP since it stands for Journal Article Tag Suite. I know there is a BITS format, but I don't think that's a part of our plans for now.

nils-stefan-weiher commented 1 year ago

Oh sorry I meant BITS as this is an extension to JATS.

Colleagues from Heidelberg University Publishing are working on transforming XML publications from a JATS dialect, which was used with a customised Lens viewer (@withanage Dulip's work), to BITS XML.

I was thinking that JATS is just a set of specific XML tags. We also use, for example, TEI XML and a custom viewer in the frontend for a journal working with ancient texts: https://journals.ub.uni-heidelberg.de/index.php/pylon/issue/view/6131

EDIT: Upon reading the comments today: the point I was making was about naming the tab "JATS XML". If this should contain the published XML, why not just call it "XML Source" or something like it?

withanage commented 1 year ago

@nils-stefan-weiher

The plugins from Heidelberg and the Lens viewer for BITS only render galley XML files.

I think the focus of this ticket is not the rendering of JATS, but adding basic JATS support to publication objects, which can then be used by either OJS or third-party services.

defstat commented 10 months ago

@asmecher I have finished going through the initial review comments and have made some changes and a few more fixes.

asmecher commented 10 months ago

@defstat, I've reviewed your PRs again, thanks!

asmecher commented 10 months ago

Thanks, @defstat, I've added a final couple of comments to the pkp-lib PR, but otherwise it's ready to go. Just waiting on Jarda's feedback on ui-library.

defstat commented 10 months ago

@asmecher @jardakotesovec I have resolved all review comments, but I think that because of that change and this rearrangement another round of review is in order.

I have also added PRs for OMP and OPS - they contain only the highlight.js dependency addition for npm

defstat commented 9 months ago

@asmecher @jardakotesovec I addressed all review comments, rebased on the latest main branch and force-pushed everything.

asmecher commented 9 months ago

All merged -- thanks, @defstat!