dhlab-epfl / dhwriter


TEI Lite import? #3

Open cmsmcq opened 11 years ago

cmsmcq commented 11 years ago

If I understand things correctly, dhwriter provides an interface for editing TEI Lite documents (or HTML documents whose elements are in a simple 1:1 relation to the elements of a TEI Lite document).

It would be convenient to have a way to import a TEI Lite document -- at least, for those of us who draft documents in TEI Lite. And if (as seems plausible) what is desired is a particular subset of TEI Lite, it would be convenient to have a schema (in DTD, Relax NG, or XSD form) for the supported subset.

cyrilbornet commented 11 years ago

Of course :-) Do you perhaps happen to know of an XSL stylesheet that would do the job and run on PHP's embedded XSLT processor? I found a few for the export, but in the end none worked as expected, so I had to write one from scratch...

Anyway, it is on my list of tasks. If I have the time to give it a try, I'd also want to provide a minimal implementation of LaTeX import/export.

cmsmcq commented 11 years ago

If you can give me (or point to) a DTD (or a reliable and complete prose description) of the markup structure the editor uses internally, and optionally a DTD showing the subset of TEI Lite you'd like to be able to accept, then I would be very happy to write an XSLT 1.0 stylesheet. (We will have to check to make sure Sablotron can handle what I write; last time I tried it, its coverage of XSLT 1.0 was incomplete. But I hope we can work around any gaps.) If you don't have a defined subset of TEI Lite, I'll just do the best I can.

The XSLT stylesheet you are using for export would be a helpful guide; am I right to think that /dhwriter-master/data/tei.xsl is the export stylesheet you refer to? I'll use the set of HTML elements and attributes it handles as an initial guide to the set of HTML elements and attributes the import stylesheet is allowed to produce. (Am I right to think that the HTML emitted by the HTML button is identical, or isomorphic, to what you want as output of an import stylesheet?)
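
To make the idea of a simple 1:1 element mapping concrete, here is a minimal sketch in Python (not the XSLT 1.0 stylesheet itself, which would have to be written separately). Everything specific in it is an assumption for illustration: the mapping table `TEI_TO_HTML`, the `tei-` class prefix for unmapped elements, and the function names are hypothetical, since the subset of TEI Lite the editor accepts has not been defined in this thread.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical 1:1 mapping from TEI Lite element names to HTML element names.
# The subset dhwriter actually supports is not stated in this thread.
TEI_TO_HTML = {
    "body": "body",
    "div": "div",
    "head": "h1",
    "p": "p",
    "hi": "em",
    "list": "ul",
    "item": "li",
}


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def convert(elem):
    """Recursively copy a TEI element into an HTML element.

    Unmapped elements become <span class="tei-NAME"> so no text is lost.
    TEI attributes are dropped in this sketch.
    """
    name = local_name(elem.tag)
    if name in TEI_TO_HTML:
        out = ET.Element(TEI_TO_HTML[name])
    else:
        out = ET.Element("span", {"class": "tei-" + name})
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(convert(child))
    return out


def tei_to_html(tei_xml):
    """Convert the <body> of a TEI document (string) to an HTML fragment."""
    root = ET.fromstring(tei_xml)
    body = root.find(".//{%s}body" % TEI_NS)
    if body is None:
        # Fall back to documents that do not declare the TEI namespace.
        body = root.find(".//body")
    if body is None:
        raise ValueError("no <body> element found")
    html_body = convert(body)
    html_body.tail = None  # do not serialize whitespace that followed <body>
    return ET.tostring(html_body, encoding="unicode")
```

A call such as `tei_to_html(open("draft.xml").read())` (with `draft.xml` a placeholder file name) would return the converted body as a string. Wrapping unknown elements in a `span` rather than discarding them is only one possible design choice; a real import stylesheet might instead reject anything outside the agreed subset.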

I'll let you know when I have anything; it may be too late for the 1 November deadline of this year, but I'll try. And there's always next year, I hope.