pandoc / scholarly-metadata

Descriptions, schemata, and tools to use scholarly article metadata with pandoc.
MIT License

What is the scope of the metadata schemas that we're defining? #1

Open dhimmel opened 4 years ago

dhimmel commented 4 years ago

@tarleb thanks for setting this up. I wanted to get an idea of what exactly we're trying to accomplish.

default-pandoc-schema.json currently codifies some of the metadata described in the pandoc manual and supported by the official built-in templates. I think it makes sense to continue to expand this to include all of the fields described in the pandoc manual or implied by its codebase.
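To make "codifying" a field concrete, here is a minimal sketch of checking a metadata block against a schema-like table. The table below is hypothetical and is not the contents of default-pandoc-schema.json; it only lists a few fields the pandoc manual documents (title, author, date, abstract, keywords), and treating unlisted fields as problems is an assumption made for illustration.

```python
# Hypothetical schema fragment: field name -> accepted Python types.
# This is NOT the actual contents of default-pandoc-schema.json.
SCHEMA = {
    "title": (str,),
    "author": (str, list),
    "date": (str,),
    "abstract": (str,),
    "keywords": (list,),
}


def validate(metadata, schema=SCHEMA):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, value in metadata.items():
        if field not in schema:
            # Assumption: unlisted fields are reported rather than ignored.
            problems.append(f"unknown field: {field}")
        elif not isinstance(value, schema[field]):
            expected = " or ".join(t.__name__ for t in schema[field])
            actual = type(value).__name__
            problems.append(f"{field}: expected {expected}, got {actual}")
    return problems


# "keywords" is given as a plain string here, which the sketch flags.
example = {"title": "A Study", "author": ["Jane Doe"], "keywords": "pandoc"}
print(validate(example))
```

A real schema file would more likely be consumed by a JSON Schema validator; the hand-rolled check above only illustrates the idea without extra dependencies.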

However, I believe we also have larger ambitions in terms of creating a metadata schema that is more flexible and featureful than the current implied pandoc metadata schema. I'm guessing that we'll want to define fields before they are officially implemented in pandoc? Perhaps we want to define some fields that are never implemented in pandoc but are still standardized, so that templates / filters can converge on a common vocabulary. So will we have multiple schemas?

@tarleb what're your thoughts and what do you think is the best way to proceed?

tarleb commented 4 years ago

I think we have mostly the same expectations and you pretty much covered it. As the primary goals, I see

  1. arriving at a schema which can be used by authors of scholarly articles and
  2. a schema which contains all information necessary to be consumed directly by manubot, JOSS/whedon, etc.

The first schema is relevant for author convenience, with the second mattering to template writers only. Still, having both fixed would allow for the development of a common tool-chain to transform from the first to the second.
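A minimal sketch of what such a transformation step could look like, assuming an author-friendly input where authors may be plain strings or small mappings, and a template-friendly output where affiliations are deduplicated and referenced by index. Every field name here (`author`, `name`, `affiliation`, `institute`, `index`) is a hypothetical choice for illustration, not something this repo has agreed on.

```python
def normalize_authors(metadata):
    """Expand author-friendly metadata into an explicit, template-friendly form.

    All field names are hypothetical and only illustrate the transformation.
    """
    institutes = []  # unique affiliation names, in order of first appearance
    authors = []

    raw = metadata.get("author", [])
    if isinstance(raw, str):
        raw = [raw]  # allow a single author given as a bare string

    for entry in raw:
        if isinstance(entry, str):
            entry = {"name": entry}
        names = entry.get("affiliation", [])
        if isinstance(names, str):
            names = [names]

        indices = []
        for name in names:
            if name not in institutes:
                institutes.append(name)
            indices.append(institutes.index(name) + 1)  # 1-based index
        authors.append({"name": entry["name"], "affiliation": indices})

    result = dict(metadata)  # shallow copy; other fields pass through as-is
    result["author"] = authors
    result["institute"] = [
        {"index": i, "name": n} for i, n in enumerate(institutes, start=1)
    ]
    return result


short = {
    "title": "A Study",
    "author": [
        "Jane Doe",
        {"name": "John Roe", "affiliation": "Example University"},
    ],
}
full = normalize_authors(short)
print(full["author"])
print(full["institute"])
```

The point of the split is that authors write the short form, while templates only ever see the explicit form and need no conditional logic for the shorthand variants.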

I would expect most results from this repo to be kept separate from main pandoc, although I could see support being added to the official Docker images.

Other ideas, in decreasing order of perceived importance:

A good first step might be to collect some test data which has all the information currently relevant to popular tools and identify common fields. Alternatively, we could start by stepping through the info encodable in JATS article-meta elements and decide which fields we'd like to support. I prefer the example-driven approach, personally.
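As a rough illustration of the example-driven approach, the sketch below counts how many sample metadata blocks use each top-level field and lists the fields shared by all of them. The tool names and sample contents are made-up placeholders, not real metadata from manubot, JOSS, or any other project.

```python
from collections import Counter


def field_usage(samples):
    """Count in how many samples each top-level field name appears."""
    counts = Counter()
    for fields in samples.values():
        counts.update(set(fields))  # iterating a dict yields its keys
    return counts


def common_fields(samples):
    """Return the sorted field names present in every sample."""
    counts = field_usage(samples)
    return sorted(f for f, n in counts.items() if n == len(samples))


# Placeholder samples; real ones would be collected from existing tools.
samples = {
    "tool_a": {"title": "T", "author": ["A"], "keywords": ["k"]},
    "tool_b": {"title": "T", "author": ["A"], "tags": ["k"]},
    "tool_c": {"title": "T", "authors": [{"name": "A"}]},
}

print(common_fields(samples))
print(field_usage(samples).most_common())
```

Fields that appear in most but not all samples, or under near-identical names such as `author` and `authors`, would be the natural candidates for discussion.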

dhimmel commented 4 years ago

> A good first step might be to collect some test data which has all the information currently relevant to popular tools and identify common fields.

Agreed. I mention some existing metadata examples in https://github.com/manubot/manubot/issues/187#issuecomment-567118953. Do we want to compile these in a GitHub Issue or create .md / .yaml files in this repo with all existing instances we find?

I am not sure what you mean by test data, because we don't intend to support all these implied schemas... but just to create a schema that accommodates the important information.

> Alternatively, we could start by stepping through the info encodable in JATS article-meta elements and decide which fields we'd like to support

I feel like JATS meta elements should be one of the examples.

jcolomb commented 3 years ago

ping @crsh and his papaja project. ping also @marton-balazs-kovacs, @alexholcombe for the tenzing app.