Klortho / eutils-org

Project to produce RDF output for some NCBI E-utilities

Split the XSLT #25

Closed Klortho closed 10 years ago

Klortho commented 10 years ago

Leyla wrote:

We will have 3 XSLTs:

  • A master XSLT that will take care of the schema itself, so all the URI generation based on parameters will be centralized here
  • A metadata XSLT that will take care of transforming the metadata (Work, Expression, Journal, Authors, References, etc.)
  • A structure XSLT that will take care of transforming the paper structure. In order to start this one, we will share the structure model used in Biotea (as it is now and as it would be with the Open Annotation). We definitely need a new issue (at least one) for the structure & full-context process.
  • Yes, we say 3, but we are not sure about content extraction for annotation purposes. For an annotation, we need the section (or subsection) it belongs to and the text, ideally without the XML tags. The main text is in the paragraphs of the main content and appendixes, but we also have the tables. So, could an XSLT help to get the text to annotate and the URI of the section it belongs to? Any thoughts?
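To make the annotation question concrete, here is a minimal sketch of what such an extraction stylesheet could look like. It is not part of the project: the `base-uri` parameter and the section URI pattern (base URI plus the `sec/@id`) are hypothetical, and it only covers paragraphs that are direct children of a `sec` and `td` cells in a `table-wrap` directly under a `sec`. It assumes JATS element names with no namespace.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain-text output: one line per text block, "section URI<TAB>text". -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical parameter; the real URI scheme would come from the master XSLT. -->
  <xsl:param name="base-uri" select="'http://example.org/article/'"/>

  <xsl:template match="/">
    <!-- Paragraphs and table cells inside sections of the body and back matter. -->
    <xsl:for-each select="(//body | //back)//sec/p | (//body | //back)//sec/table-wrap//td">
      <!-- ancestor::sec[1] is the nearest enclosing section or subsection.
           If that sec has no id attribute, only the base URI is written. -->
      <xsl:value-of select="concat($base-uri, ancestor::sec[1]/@id)"/>
      <xsl:text>&#9;</xsl:text>
      <!-- The string value of the node drops all markup;
           normalize-space collapses runs of whitespace. -->
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Header cells (`th`), paragraphs placed directly in an `app` without a `sec`, and sections lacking an `id` would need extra handling, so this only suggests that XSLT can pair each text block with its enclosing section.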

Silvio wrote:

I think that the master XSLT should take care of the label schema too. So yes, we need a schema for the labels as well, but I would like to wait until the schema for the entities is ready before starting to think about a label schema.

Klortho commented 10 years ago

Here's the scheme I've come up with. Not sure if it's exactly what you guys had in mind or not.

jats2spar.xsl - main "controller" template
  -- imports jats2spar-utils.xsl - "global" parameters, variables, named templates, and functions
  -- imports jats2spar-meta.xsl - "match" templates that recurse over the front matter and transform it into metadata (SPAR) RDF
  -- imports jats2spar-content.xsl - "match" templates for the document body, full text content

ljgarcia commented 10 years ago

Hi Chris,

I think it would be more like:

-- imports jats2spar-utils.xsl - "global" parameters, variables, named templates, and functions
-- imports jats2spar-meta.xsl - "match" templates that recurse over the front matter and transform it into metadata (SPAR) RDF
-- new: imports jats2spar-structure.xsl - "match" templates for the document sections and subsections
-- imports jats2spar-content.xsl - "match" templates for the document body, full text content
--> and then we need to define what in the full content we want to deal with. In principle, paragraphs and table content.
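As a rough illustration of this layout, the controller stylesheet might look something like the sketch below. The four file names come from the list above; everything else is an assumption: the `rdf:RDF` wrapper, the `article/front`, `article/body` and `article/back` selections, and `version="2.0"` (chosen only because the utils module is said to hold functions, and `xsl:function` requires XSLT 2.0). It will only run if the four imported modules exist next to it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- jats2spar.xsl: main "controller" stylesheet (illustrative sketch only) -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <!-- xsl:import elements must precede all other top-level elements.
       Later imports take precedence over earlier ones when templates conflict. -->
  <xsl:import href="jats2spar-utils.xsl"/>
  <xsl:import href="jats2spar-meta.xsl"/>
  <xsl:import href="jats2spar-structure.xsl"/>
  <xsl:import href="jats2spar-content.xsl"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumed entry point: wrap everything in a single rdf:RDF element. -->
  <xsl:template match="/">
    <rdf:RDF>
      <!-- Front matter is handled by the "match" templates in the meta module. -->
      <xsl:apply-templates select="article/front"/>
      <!-- Body and back matter are handled by the structure and content modules. -->
      <xsl:apply-templates select="article/body | article/back"/>
    </rdf:RDF>
  </xsl:template>

</xsl:stylesheet>
```

Because of import precedence, the order of the `xsl:import` lines matters if two modules ever match the same node, which is one more reason to keep the structure and content templates clearly separated.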

There is another comment from Silvio at https://github.com/Klortho/eutils-org/issues/26; the structure and content will be clarified there.

Cheers, Leyla

On Sat, Feb 22, 2014 at 2:35 PM, Chris Maloney notifications@github.com wrote:

Here's the scheme I've come up with. Not sure if it's exactly what you guys had in mind or not.

jats2spar.xsl - main "controller" template
  -- imports jats2spar-utils.xsl - "global" parameters, variables, named templates, and functions
  -- imports jats2spar-meta.xsl - "match" templates that recurse over the front matter and transform it into metadata (SPAR) RDF
  -- imports jats2spar-content.xsl - "match" templates for the document body, full text content


Klortho commented 10 years ago

Hi, Leyla. Yes, that's fine. This ticket is just about how to split the XSLTs into the most logical pieces, and it seems you agree with my plan. I got a start on this last weekend and should be able to finish it up soon.