substance / dar

Reproducible Document Archive

Why not use NoteBook formats, like Jupyter? #4

Open jobdiogenes opened 6 years ago

jobdiogenes commented 6 years ago

From what I understand, Dar is a format that adds data and code to scientific publications. Jupyter Notebook does the same thing.

Why not use Jupyter?

michael commented 6 years ago

The Jupyter Notebook format does not model some content that is important for an article to be considered a full-fledged manuscript (e.g. reference list, abstract with translations, journal-specific metadata). That's why we want to take it one step further and build on top of JATS-XML (a format already established at scientific journals). Additionally, we need to allow publications that consist of multiple documents (manuscript, notebooks, sheets) and assets (such as images and videos).
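To make the structural difference concrete, here is a minimal sketch in Python. It is an illustration only, not taken from the Dar specification: the notebook dictionary shows the four top-level keys of a version 4 notebook file, and the hypothetical helper `build_jats_skeleton` builds an empty outline using standard JATS element names (`front`, `article-meta`, `abstract`, `trans-abstract`, `back`, `ref-list`). The language code `pt` and the cell contents are made-up placeholders.

```python
import json
import xml.etree.ElementTree as ET

# A minimal Jupyter notebook (nbformat 4). The file has only these four
# top-level keys; there is no dedicated slot for an abstract or a
# reference list, so such content would have to live in free-form cells
# or in ad hoc metadata.
notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": None,
            "outputs": [],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

# Namespace URI that ElementTree serializes with the reserved "xml" prefix.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_jats_skeleton():
    """Build an empty JATS-style outline (hypothetical helper, illustration only)."""
    article = ET.Element("article")
    front = ET.SubElement(article, "front")
    meta = ET.SubElement(front, "article-meta")
    ET.SubElement(meta, "abstract")
    # A translated abstract gets its own element with a language attribute.
    translated = ET.SubElement(meta, "trans-abstract")
    translated.set(f"{{{XML_NS}}}lang", "pt")  # placeholder language code
    ET.SubElement(article, "body")
    back = ET.SubElement(article, "back")
    ET.SubElement(back, "ref-list")  # dedicated home for references
    return article


if __name__ == "__main__":
    print(sorted(notebook))
    print(json.dumps(notebook)[:60] + "...")
    print(ET.tostring(build_jats_skeleton(), encoding="unicode"))
```

The outline is deliberately empty and is not a valid JATS document on its own; it only shows that the manuscript-level parts mentioned above have named places in JATS, while a notebook offers cells and generic metadata.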

Our ultimate goal: create a reproducible publication (.dar) and submit it to a journal, which can then review and publish the work as fast as possible using a Dar-compatible toolset.

We explained that in more detail in a recent webinar: https://youtu.be/oyBX9l9KzU8?t=446

The Jupyter team is actively involved in the discussion, and we are considering different options to make things interoperable.

Here's a Google Doc where we are discussing this:

https://docs.google.com/document/d/1zIYXpbeUpFvfV5W0DR4S9Pd0PSm-nM5tYZIBqN2U2Zk/edit?usp=sharing

michael commented 6 years ago

Also see this recent demo of such a reproducible publication in action:

http://builds.stenci.la/stencila/reproducible-publication-example-2018-04-16-dcf17f9/example.html?archive=repro-pub

(Note that this is still a work in progress, and the Python, R, etc. contexts are not exposed in this demo to keep things lightweight.)

jobdiogenes commented 6 years ago

Thanks very much, @michael.

You helped me a lot. I'm an IT guy at www.nupelia.uem.br, whose name could be translated as Research Institute for Limnology, Ichthyology (Fish) and Aquaculture. It was great to learn from the video that one of the people who started Stencila works in fisheries research.

I think Stencila is really on its way to filling the reproducibility gap in research publishing. I just forked it on GitHub, and I will give it a try.

Right now I'm working on deploying FidusWriter connected to OJS (already working) and on generating output in the SciELO schema (derived from JATS). People from FidusWriter then pointed me to Dar and to https://www.le-tex.de/en/transpect.html.

I think FidusWriter does not have the same goal as Stencila (reproducible, live documents), but it could be a step forward from our current, common manual workflow (.doc/.odt) with OJS.

As I understand it, the Dar format could be used as an intermediate format to generate JATS? Do you think that, using Dar, I could generate XML for the SciELO JATS schema?

Thanks again.

michael commented 6 years ago

Hi @jobdiogenes, the Dar format actually IS valid JATS. We just enforce a stricter tagging schema. Once we have all use cases covered, SciELO, Erudit, eLife and others will be using Dar as the primary specification. It's actually based on SciELO's schema; we just try to make it generic where necessary.

Short term, you'll be able to create and quality-check static manuscripts (Texture editor). We are expecting a first stable release by September 2018. Long term (once matured), Stencila and other tools (probably even Jupyter) will allow creating reproducible publications that can be accepted by journals directly.