PreTeXtBook / pretext

PreTeXt: an authoring and publishing system for scholarly documents

https://pretextbook.org

id for split p with permid #978

Closed davidfarmer closed 2 years ago

davidfarmer commented 5 years ago

When a PTX p with permid contains display math, the HTML contains two "p". The second one does not get an HTML id. It would be good for it to have one, and for it to be derived from the permid.

Since permids are only letters, one possibility is to append a number or character.

Maybe appending 0, 1, 2, ... when a PTX p leads to several HTML p (such as when there are multiple display equations in the source p).

rbeezer commented 5 years ago

I think it will be a simple matter to count how many pieces have come before, so a number will be possible.

Sometimes an intermediate HTML "p" is empty and not output. I presume the numbers need only be unique, and not consecutive?

Any punctuation in the appendage? aBc-32 or aBc32?

Thinking as I write. We don't need these for the tree of id's? Since the author only gets to influence the permid on the PTX "p"?

Do you want an AATA once I do this?

davidfarmer commented 5 years ago

> numbers need only be unique, and not consecutive?

Unique is all we need, as long as it is deterministic and only depends on the structure of that one p.

> Any punctuation in the appendage? aBc-32 or aBc32?

No punctuation, just aBc6

> Thinking as I write. We don't need these for the tree of id's? Since the author only gets to influence the permid on the PTX "p"?

Not a part of the tree. This is needed due to the design of HTML and is not part of the official PTX source.

I can keep working without this. I just happened to notice it as I was working on the tracking code for knowls.
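
A minimal sketch of the numbering scheme discussed so far, written in Python rather than the XSLT that PreTeXt actually uses. The function name `derived_ids` and the representation of a paragraph as a list of text pieces are hypothetical. It assumes the first piece keeps the bare permid (as already happens) and that each later piece appends its position with no punctuation; empty pieces are skipped but still counted, so the suffixes are unique and deterministic without being consecutive.

```python
def derived_ids(permid, pieces):
    """Return ids for the non-empty text pieces of one split paragraph.

    `pieces` holds the text chunks found between display items, in
    document order. Hypothetical helper, not part of PreTeXt.
    """
    ids = []
    for position, text in enumerate(pieces):
        if not text.strip():
            # An empty piece is not output, but it still uses up a number.
            continue
        # Assumption: the first piece keeps the authored permid unchanged.
        ids.append(permid if position == 0 else f"{permid}{position}")
    return ids


print(derived_ids("aBc", ["Intro text", "", "closing text"]))
# prints ['aBc', 'aBc2']
```

Because the suffix depends only on the position of a piece inside its own paragraph, editing some other paragraph never changes these ids.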

rbeezer commented 5 years ago

On 12/4/18 2:21 PM, David W. Farmer wrote:

> > Thinking as I write. We don't need these for the tree of id's? Since the author only gets to influence the permid on the PTX "p"?
>
> Not a part of the tree. This is needed due to the design of HTML and is not part of the official PTX source.

Right. Tree comes from the source, and this is output.

> I can keep working without this. I just happened to notice it as I was working on the tracking code for knowls.

Perhaps tomorrow if it is as straightforward as I expect.

Rob

rbeezer commented 5 years ago

A PTX p is decomposed into a sequence of HTML p when it contains various display chunks, such as lists and display math. PTX considers a paragraph somewhat atomic, and the only interior parts you can cross-reference are the individual equations of multi-line display mathematics.

If a p starts with something like an ol there is now no leading HTML p since it would be empty. So the PTX id of the overall PTX p is placed on the first HTML object, the ol in this example.

Now consider migrating extended permid from PTX to all of the constituent parts of the HTML decomposition. The display items interior to the paragraph (such as the ol) already have permid in the source and so have these as HTML id in the output.

In the case of an initial empty HTML p in the decomposition where does the permid of the overall PTX p go? I don't think we can simply abandon it, since there will be cross-references to the overall p, such as from the index.

davidfarmer commented 5 years ago

I have never considered it reasonable to start a p with an ol. Same for starting a p with a me or md. (I refer to regular running text. Nested lists may have special requirements.)

But since we seem to have to decide what to do, to me the natural thing is for the PTX permid to migrate to the HTML p that occurs immediately after the me or ol.

The me or ol just uses the permid which was given to it in the source.

If the p contains nothing more than an me or ol, so there is no HTML p after it, then that permid is lost. Too bad.

All that matters is that the code be logical and consistent. No need for heroics to support use cases that should not occur.

Am I missing something?
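
The placement rule proposed in this comment can be sketched as follows. This is an illustration in Python, not the PreTeXt XSLT; the function name `place_paragraph_id` and the `(tag, own_permid)` tuples are hypothetical, and the list is assumed to contain only the pieces that are actually output (an empty leading p is already dropped).

```python
def place_paragraph_id(paragraph_permid, parts):
    """Put the paragraph's permid on the first HTML p of its decomposition.

    `parts` is a list of (tag, own_permid) tuples in output order.
    Display items (ol, me, ...) keep the permid they have in the source.
    Returns the (tag, html_id) list and whether the permid found a home.
    """
    placed = False
    result = []
    for tag, own_permid in parts:
        if tag == "p" and not placed:
            result.append((tag, paragraph_permid))
            placed = True
        else:
            result.append((tag, own_permid))
    return result, placed


# A paragraph that starts with a list and continues with text:
print(place_paragraph_id("aBc", [("ol", "xYz"), ("p", None)]))
# prints ([('ol', 'xYz'), ('p', 'aBc')], True)

# A paragraph holding nothing but a list: the permid has nowhere to go.
print(place_paragraph_id("aBc", [("ol", "xYz")]))
# prints ([('ol', 'xYz')], False)
```

The second call shows the case described above as "that permid is lost": no element in the output carries the id of the paragraph.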

rbeezer commented 5 years ago

Dear David,

It was a real fiddle to get the id to migrate from the empty leading p to the subsequent display. I can't see immediately how to get the permid to migrate to the second p after a leading empty p and then begin counting properly for the derived permids being developed here.

I frequently have a p whose only content is a simple list. The alternative is a list element which gets a number and a caption. I agree that a proper paragraph should not begin with a display. I think the decision to place these displays inside a p was the right one, but it will allow constructions that are sub-optimal. I don't want to tell the reader they can't annotate a list because the author did a poor job.

Is there an empty element we can place instead of the leading empty paragraph, as a target for cross-references? Maybe a span or an anchor of some sort? Perhaps with a class so that it takes up no visible space? That would make the code cleaner rather than more convoluted, allowing for the removal of the fiddle above.

Search for mode="insert-paragraph-id" in mathbook-html.xsl to see the current state of complications for this.

Rob

rbeezer commented 5 years ago

About halfway through trying to put id onto non-trivial constituent p. First, and only one, belonging to the logical p in source. All the subsequent ones being manufactured/derived.

This is not just an addition, but a change. Right now the id from the paragraph goes on the first constituent of the logical paragraph, maybe a list or display math. These are containers so have not ever been targets and so do not have assigned xml:id. Now that they get permid in output, we can't overwrite those permid with the paragraph id.

> If the p contains nothing more than an me or ol, so there is no HTML p after it, then that permid is lost. Too bad.

Then we have lost the ability to cross-reference p - in other words they can't be targets. Every index entry placed within a "top level" paragraph will have a non-functional "in-context" link.

Every time I work on this, I come back to the same reliable solution.

Put a div around the contents of the PTX p. Place the id of the paragraph on the div. Style with spacing identical to that of a "normal" paragraph. Works well as a target, including pink-flash I'd guess, and maybe will facilitate highlighting across the pieces when ready to implement this?
Give lists, display math, display verbatim their new permid.
Non-trivial interstitial material becomes HTML p. Manufacture permid easily based on count of the pieces. Nothing fancy to differentiate first from remainder. Style these p so the contents of the div looks like one paragraph, not several.

I'll see about making a beta of this so we can explore consequences.
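
The three steps above can be sketched as a small Python function that builds a description of the output rather than real HTML. Everything here is illustrative: the function name `wrap_logical_paragraph`, the `(tag, own_id, text)` tuples, and the dash-and-count form of the manufactured ids are assumptions, not the stylesheet's actual behavior.

```python
def wrap_logical_paragraph(paragraph_id, parts):
    """Model one PTX p as a div holding its constituent pieces.

    `parts` is a list of (tag, own_id, text) tuples in document order.
    Display items carry their own permid; text pieces have own_id None.
    """
    children = []
    for count, (tag, own_id, text) in enumerate(parts, start=1):
        if tag == "p":
            if not text.strip():
                continue  # trivial interstitial material is not output
            # Manufactured id, based only on the count of the pieces.
            children.append({"tag": "p", "id": f"{paragraph_id}-{count}"})
        else:
            # Lists, display math, etc. keep the permid from the source.
            children.append({"tag": tag, "id": own_id})
    # The authored paragraph id always lands on the wrapping div.
    return {"tag": "div", "id": paragraph_id, "children": children}


result = wrap_logical_paragraph(
    "aBc",
    [("p", None, ""), ("ol", "xYz", "list items"), ("p", None, "closing text")],
)
print(result["id"])
# prints aBc
print([child["id"] for child in result["children"]])
# prints ['xYz', 'aBc-3']
```

Even when the paragraph begins with a list, or contains nothing else, the div still exists, so a cross-reference to the paragraph always has a target.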

davidfarmer commented 5 years ago

Can you point to a couple of places in the sample article where I can take a look and work through the details? I still have misgivings about major structural changes to the HTML, when the only change I am aware of needing is to add an HTML id to a p that does not have one. And that id can be derived by looking at the previous p (and the one before that) until finding a p that does have an id.

If that approach fails for a PTX p that starts with display math instead of text, then good: nobody should do that.

If that approach fails for a PTX p that starts with a list, then good: that p should only contain the list, not the list and then text.

rbeezer commented 5 years ago

Subsection 12.6 of the sample article is the one place where I know this is done intentionally. Notice that the id of the p ("p-520") is on the ol right now.

https://pretextbook.org/examples/sample-article/html/lists.html#subsection-40

I've got several more on a branch right now - I think I can make a beta quickly, so will send that all at once.

davidfarmer commented 5 years ago

Proposed alternate solution:

Javascript scans the page looking for a p that does not have an id. Finding one, it looks at previous sibling p until finding one with an id.

From the id it finds, it derives an id for the p that is missing one.

Now everything has an id. This process happens before anything that needs the id on a p.

I can implement if we decide on this solution.
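
A rough model of this scan, in Python instead of browser Javascript so it can run without a DOM. The sibling elements are modeled as plain dicts, and the function name `fill_missing_ids` and the digit suffix are hypothetical. Scanning forward while remembering the most recent authored id is equivalent to looking back from each p until one with an id is found.

```python
def fill_missing_ids(siblings):
    """Give every id-less p an id derived from the nearest earlier p with one.

    `siblings` is a list of dicts with a 'tag' key and an optional 'id'
    key, in document order. The list is modified in place and returned.
    """
    base = None   # id of the most recent p that came with an id
    count = 0     # how many id-less p have followed that p so far
    for node in siblings:
        if node["tag"] != "p":
            continue  # display math, lists, etc. are left alone
        if node.get("id"):
            base, count = node["id"], 0
        elif base is not None:
            count += 1
            node["id"] = f"{base}{count}"
        # A p with no earlier sibling p carrying an id stays without one.
    return siblings


page = [
    {"tag": "p", "id": "aBc"},
    {"tag": "div"},            # stands in for a display equation
    {"tag": "p"},
    {"tag": "div"},
    {"tag": "p"},
    {"tag": "p", "id": "dEf"},
]
print([node.get("id") for node in fill_missing_ids(page)])
# prints ['aBc', None, 'aBc1', None, 'aBc2', 'dEf']
```

The sketch leaves a p without an id when nothing before it has one, which is the case of a paragraph that starts with a display.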

rbeezer commented 5 years ago

> the only change I am aware of needing is to add an HTML id to a p that does not have one

If ol (and similar) will have permid, then the id of the paragraph needs to go somewhere else, it has lost its home. If there are no p in the decomposition, there is no target anymore.

Authors will make a p and fill it only with a list. I agree that is poor style. But my suggestion is not to enable this, just not break it.

rbeezer commented 5 years ago

Two betas. Both with new sample paragraphs in Section 12.6. Old and new. No knowls, no images.

https://pretextbook.org/beta/2019-02-17-old-p/

https://pretextbook.org/beta/2019-02-17-new-p/

Changes:

div.logical-paragraph has the correct id for authored paragraph
ol, me, cd, etc, always interior to authored p, now have correct id
manufactured, new, p interior to authored paragraph have derived id

Quick and Dirty:

derived id are hack - can make consecutive with some care
dashes in derived id, so easier to find, but may be necessary to avoid accidental duplication

Right after id p-520 is where the messy examples are. ancillaries.html at id p-981 is a good "normal" example.
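
One plausible reading of the "accidental duplication" concern, sketched in Python. Permids contain only letters, so appending a bare digit cannot collide with another permid, but automatically numbered ids such as p-520 already end in digits. The helper `derive` and the ids in `existing` are made up for illustration.

```python
def derive(base_id, count, separator=""):
    """Build a derived id from a base id and a piece count (hypothetical)."""
    return f"{base_id}{separator}{count}"


# Hypothetical ids already present on a page.
existing = {"p-52", "p-520"}

# Without a separator, piece 0 of p-52 lands on the id of another paragraph.
print(derive("p-52", 0) in existing)
# prints True

# With a dash the derived id stays distinct.
print(derive("p-52", 0, "-") in existing)
# prints False
```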

davidfarmer commented 2 years ago

Long paragraphs containing display math or lists are now handled differently, so closing.