PreTeXt: Allow for annotating XML file with a parent xml:id

StevenClontz commented 3 months ago

Right now if I have a file whose root element doesn't correspond to a PreTeXt HTML page, I cannot preview the file while editing.

I propose supporting something like the following to mark up such files so CodeChat knows what to serve up.

<?xml version="1.0"?>
<!-- codechat: parent-section-id -->
<subsection xml:id="not-its-own-page">
  ...
</subsection>

bjones1 commented 3 months ago

That's a good idea and would certainly make the CodeChat user experience better. It should be pretty easy to do, although I suspect keeping this as some sort of XML element would be better. The code that connects XML IDs to filenames is in the CLI, in pretext/codechat.py starting at line 94. That code looks for XML IDs then for matching generated files. Adding another loop that looked for a specific element then added that mapping to the file should work. What are your thoughts?

StevenClontz commented 2 months ago

How about this then?

<?xml version="1.0"?>
<subsection xml:id="not-its-own-page" codechat-id="parent-section-id">
  ...
</subsection>

Then I think the XPath query you want is f"//*[@{xml_id_attrib}|@codechat-id]"
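
As a rough illustration of the proposal (not the CLI's actual code), the sketch below scans a source file for elements carrying an xml:id and records which page ID should be previewed for each. The `codechat-id` attribute is the one proposed in this comment and is not part of PreTeXt; the function name `page_ids` and the sample markup are made up for the example. It uses the standard library's ElementTree rather than lxml, so it iterates over elements instead of running the XPath query.

```python
import xml.etree.ElementTree as ET

# xml:id lives in the reserved XML namespace; ElementTree exposes it in
# "Clark notation" as {namespace}localname.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def page_ids(source: str) -> dict:
    """Map each element's xml:id to the ID of the page to preview.

    Assumption: the proposed (hypothetical) codechat-id attribute, when
    present, names the page to show instead of the element's own xml:id.
    """
    root = ET.fromstring(source)
    mapping = {}
    for elem in root.iter():
        own_id = elem.get(XML_ID)
        if own_id is None:
            continue  # Elements without an xml:id cannot be looked up.
        mapping[own_id] = elem.get("codechat-id", own_id)
    return mapping


SAMPLE = """<subsection xml:id="not-its-own-page" codechat-id="parent-section-id">
  <p xml:id="some-paragraph">...</p>
</subsection>"""

print(page_ids(SAMPLE))
```

An element without `codechat-id` simply maps to its own xml:id, so existing documents would behave as they do today under this sketch.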

bjones1 commented 2 months ago

That sounds great! @rbeezer, any objections? We'd like to add an optional codechat-id attribute to any structural division, although most likely at the <subsection> and <subsubsection> level. PreTeXt should ignore this when producing output (which I assume will take no changes to the code, though it will need changes to the schema).

rbeezer commented 2 months ago

I was musing just this morning about "application specific" markup going into the schema, in a totally unrelated scenario.

For any division, you can find out anything about its parent division, which will just be its (generic) parent.

parent::*/@xml:id

It doesn't seem right to me that authors will need to specify this information (and keep it correct), but perhaps I do not understand the scenario relative to which files are being edited and viewed.

StevenClontz commented 2 months ago

The context is that CodeChat isn't parsing the whole document (I think). So if an author is editing a <subsection> root element that is xincluded into the book, CodeChat needs to know the xml:id of the section (or whatever chunk) so it can display the right HTML page.

StevenClontz commented 2 months ago

(So parent won't be available.)
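
A small sketch of why the parent lookup fails in this scenario, assuming the edited file is parsed on its own rather than as part of the assembled book. The helper name `parent_ids` and both sample strings are invented for illustration; ElementTree stands in for whatever parser is actually used.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def parent_ids(source: str) -> dict:
    """For every element with an xml:id, report its parent's xml:id (or None)."""
    root = ET.fromstring(source)
    # ElementTree keeps no parent pointers, so build a child -> parent table.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = {}
    for elem in root.iter():
        own_id = elem.get(XML_ID)
        if own_id is not None:
            parent = parent_of.get(elem)
            result[own_id] = None if parent is None else parent.get(XML_ID)
    return result


# Parsed as part of the whole document, the subsection has a parent.
WHOLE_BOOK = (
    '<section xml:id="parent-section-id">'
    '<subsection xml:id="not-its-own-page"/>'
    "</section>"
)
# Parsed as a standalone, xincluded file, the same element is the root.
FRAGMENT = '<subsection xml:id="not-its-own-page"/>'

print(parent_ids(WHOLE_BOOK)["not-its-own-page"])
print(parent_ids(FRAGMENT)["not-its-own-page"])
```

The second lookup has nothing to return because the fragment's root element has no parent within that file.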

StevenClontz commented 2 months ago

Namespaces are obnoxious, but is this the place for one? codechat:id? Does the schema balk if an attribute is part of another namespace?

rbeezer commented 2 months ago

xml:id is namespaced, and the schema is fine with it (but perhaps it is special). We need an html namespace, and David Austin's prefigure library will need one. So they are essential and not obnoxious.

In any event, filenames are now manufactured from @label and the preprocessor does a backwards-compatibility maneuver to let @xml:id behave historically. So the mappings between the two are not so straightforward. To say nothing of automatic filenames (which I presume you are expecting authors/editors to avoid).

CodeChat is editing a file and needs to view a file. Why not maintain a mapping of edited files to appropriate viewable files as a separate CodeChat-specific configuration, and leave (generic) PreTeXt markup (and XSL!) out of it? It'll be almost as much effort, and will be more reliable if it does not have to mimic PTX creation of filenames.

bjones1 commented 2 months ago

> The context is that CodeChat isn't parsing the whole document (I think). So if an author is editing a <subsection> root element that is xincluded into the book, CodeChat needs to know the xml:id of the section (or whatever chunk) so it can display the right HTML page.

In this instance, the code that generates a mapping between source files and XML IDs runs in the context of the entire document; this code is part of the CLI. Currently, the CLI code looks for xml:id instances and looks for a correspondingly-named .html file. Then, CodeChat reads this file to determine what HTML file was produced by the currently edited file.

> CodeChat is editing a file and needs to view a file. Why not maintain a mapping of edited files to appropriate viewable files as a separate CodeChat-specific configuration, and leave (generic) PreTeXt markup (and XSL!) out of it? It'll be almost as much effort, and will be more reliable if it does not have to mimic PTX creation of filenames.

I agree -- all we need is a mapping from source PTX file names to generated HTML filenames. However, I doubt that authors will have the desire or expertise to maintain this mapping manually. If we can autogenerate the mapping from the PreTeXt XML, then this improves usability significantly. The current process does some of the autogeneration, but not all of it. I'd like to find ways to improve this autogeneration process.

rbeezer commented 2 months ago

Run the pretext/pretext script to produce the "assembly-dynamic" output. One massive XML file, where versions, etc have all been considered. Lots of new, manufactured attributes. This is very fast, it could be done at startup via routines in the module, maybe on its own thread.

In pretext-common.xsl, find mode="containing-filename". This will produce the output filename that holds some node (e.g. a file for the chapter containing a subsection).

For more, chase back to template mode="visible-id" in the same XSL file. It'll use the @unique-id attribute placed by the assembly (pre-processor) step, hiding @label, @xml:id, and automatic filenames. You could make some XSL importing the containing-filename template (most reliable in the face of any future changes) or you could use the XML from the assembly step, do your own ascent up the tree to grab a @unique-id, and make a filename.

Note that previous discussion was all about parents. Entirely conceivable that the output file is for a grandparent of the being-edited file, or moreso.

StevenClontz commented 2 months ago

@bjones1 I don't think you're in Tacoma with us next week, but perhaps you'd have time to sync up virtually sometime and knock this out?

bjones1 commented 2 months ago

Sure, I'd like to! I'd definitely appreciate help from the PreTeXt side on this...

StevenClontz commented 2 months ago

Aside: I just had a flash of brilliance(?): I may implement pretext build --codechat this week, which does the "right" build for codechat. That way we can adjust the targets we are building on our end for codechat (and could supply you information about the build perhaps).

bjones1 commented 2 months ago

Sounds exciting!

StevenClontz commented 2 months ago

@bjones1 Here's a pitch: pretext build --codechat XMLID takes the same xml:id you're passing now to pretext build -x XMLID. However, it first does a lookup of the closest ancestor (perhaps itself) that has an xml:id and gets its own page (which I need to investigate). It then asks core pretext to build that xml:id's subtree.
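
The ancestor-or-self lookup described above could be sketched roughly as follows. This is only an illustration: which divisions get their own page is exactly the part still to be investigated, so `PAGE_TAGS` is a placeholder assumption (the real rule depends on the build's chunking), and `closest_page_id` and the sample document are invented names.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Assumption for illustration only: these divisions get their own HTML page.
PAGE_TAGS = {"book", "chapter", "section"}


def closest_page_id(root, target_id):
    """Return the xml:id of the nearest ancestor-or-self that gets a page."""
    # ElementTree keeps no parent pointers, so build a child -> parent table.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = next((e for e in root.iter() if e.get(XML_ID) == target_id), None)
    while node is not None:
        if node.tag in PAGE_TAGS and node.get(XML_ID) is not None:
            return node.get(XML_ID)
        node = parent_of.get(node)  # Climb one level; None above the root.
    return None


BOOK = ET.fromstring(
    '<book><chapter xml:id="ch"><section xml:id="sec">'
    '<subsection xml:id="sub"><subsubsection xml:id="subsub"/></subsection>'
    "</section></chapter></book>"
)

print(closest_page_id(BOOK, "subsub"))
```

Because the loop checks the starting node first, an xml:id that already has its own page resolves to itself, and the walk can pass through any number of levels (parent, grandparent, and so on).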

My question: what output do you need that command to emit so codechat knows the right page to preview?

StevenClontz commented 2 months ago

FYI: I'm stuck sick in my hotel room trying to stick to pretext dev while I'm away from distractions (besides my stuffy nose), so I have a lot of flexibility to sync up at your convenience this week.

bjones1 commented 2 months ago

> My question: what output do you need that command to emit so codechat knows the right page to preview?

A mapping from every PreTeXt source file in a book to the .html file that it (perhaps partly) produced. (I assume for simplicity that a single source file only produces one .html file, not multiple .html files). That way, CodeChat can show the appropriate output file for the file an author is editing.
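
One possible shape for such a mapping, sketched under the single-output-file assumption stated above. The JSON layout, the file names, and the `preview_page` helper are all hypothetical; nothing here reflects the format the CLI or CodeChat actually uses.

```python
import json
from typing import Optional

# Hypothetical mapping emitted by the build: source file -> generated page.
# Several source files may share one page (e.g. xincluded subsections).
MAPPING_JSON = """{
  "source/main.ptx": "index.html",
  "source/sec-intro.ptx": "sec-intro.html",
  "source/subsec-details.ptx": "sec-intro.html"
}"""


def preview_page(mapping: dict, edited_file: str) -> Optional[str]:
    """Return the HTML page to show for the file being edited, if known."""
    return mapping.get(edited_file)


mapping = json.loads(MAPPING_JSON)
print(preview_page(mapping, "source/subsec-details.ptx"))
```

A flat dictionary keeps the lookup trivial on the CodeChat side; a file that is missing from the mapping returns `None`, which the caller would need to handle (for example, by showing no preview).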

bjones1 commented 2 months ago

> FYI: I'm stuck sick in my hotel room trying to stick to pretext dev while I'm away from distractions (besides my stuffy nose), so I have a lot of flexibility to sync up at your convenience this week.

I'm happy to meet! I'll watch Discord.

StevenClontz commented 2 months ago

@rbeezer I'm having trouble importing pretext-common.xsl:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:import href="{ptx_common_path}"/>
 <xsl:output method="xml"/>
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>
 <xsl:template match="*[@unique-id]">
  <xsl:copy>
    <xsl:attribute name="output-filename">
     <xsl:apply-templates select="." mode="containing-filename"/>
    </xsl:attribute>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

lxml.etree.XSLTApplyError: Evaluating global variable  var/param being computed failed

Any clues? Removing <xsl:import href="{ptx_common_path}"/> (yes, Python is inserting the correct path) removes the error (but of course does not allow me to use the containing-filename template).

bjones1 / CodeChat

PreTeXt: Allow for annotating XML file with a parent xml:id #21