wf49670 / ppgen

Post-processing generator for DP
Add new functionality to .sn command #76

Closed davem2 closed 9 years ago

davem2 commented 9 years ago

A couple ideas for extending the capabilities of the .sn command have been mentioned on the forums:

The align parameter should be simple enough (if it's wanted), but I foresee some challenges with allowing .sn commands to be located mid-paragraph.

davem2 commented 9 years ago

Here's my current plan for implementing the mid-paragraph .sn feature Tom requested:

first pass:
.sn found outside of paragraph 
    handled through current implementation
.sn found inside of paragraph (does a flag like this exist already? need a way to check whether the line currently being processed is inside a paragraph or not)
    rewrite ".sn Sidenote Text" command as @Sidenote Text@ where '@' is an unused unicode char 

second pass:
@Sidenote Text@ is found
    for text:    
        insert the line [Sidenote: Sidenote text] at the start of the current paragraph
        output current line with @Sidenote Text@ removed
    HTML:
        replace @Sidenote Text@ with <span class="sni"><span class="hidev">|</span>Sidenote Text<span class="hidev">|</span></span>
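
A minimal sketch of the first pass and the text half of the second pass described above, in Python since that is what ppgen is written in. The function names, the marker character, and the "previous line is ordinary text" test for being inside a paragraph are all assumptions for illustration, not ppgen's actual code.

```python
import re

# Assumed marker: any character that never occurs in the source would do.
MARK = "\u24eb"
SN_RE = re.compile(MARK + "(.*?)" + MARK)


def encode_inline_sidenotes(lines):
    """First pass: rewrite '.sn text' found inside a paragraph as MARK+text+MARK."""
    out = []
    in_para = False  # hypothetical flag: True while ordinary text lines are open
    for line in lines:
        if line.startswith(".sn ") and in_para:
            out.append(MARK + line[4:].strip() + MARK)
        else:
            # Blank lines and dot commands close the paragraph; a .sn that
            # lands here is left alone for the existing implementation.
            out.append(line)
            in_para = bool(line.strip()) and not line.startswith(".")
    return out


def text_paragraph(para_lines, label="Sidenote"):
    """Second pass (text): hoist encoded sidenotes to the start of the paragraph."""
    notes, body = [], []
    for line in para_lines:
        notes.extend(SN_RE.findall(line))
        remainder = SN_RE.sub("", line).strip()
        if remainder:
            body.append(remainder)
    return ["[{}: {}]".format(label, n) for n in notes] + body
```

For example, the lines `First line`, `.sn Note A`, `second line` would come out of the text pass as `[Sidenote: Note A]` followed by the two text lines.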

Any thoughts? I need to familiarize myself with the multiple passes ppgen makes, when does line wrapping take place?

It would be better if all .sn commands could be handled the same way. I have them separate because a sidenote outside a paragraph needs to be in a <div>, while one inside a paragraph needs to be a <span>. Instead of handling them through separate methods, in one of the later passes <span>'s could be converted to <div>'s as needed.

I remember having a similar situation like this #21 where page numbers would sometimes generate a "naked" . Roger was not able to get a solution working and we settled on generating a warning message, but it looks like Roger or yourself got something working since then so we could do something similar here.

wf49670 commented 9 years ago

Remember that in Git comments you need to use & lt; (without the space) for the < character, David :)

Passes: First thing to know: Latin-1, UTF-8, and HTML outputs are done as separate passes over the source, reloading the source for each.

For each text pass we have:
    preprocess -> preprocesscommon
    process
    postprocess

For the HTML pass we have:
    preprocess -> preprocesscommon
    process
    postprocess
    deStyle (which creates the dynamic classes c###)
    makeHTML -> doheader, dofooter, placeCSS, cleanup
    doChecks

The handling of .bn happens in preprocesscommon, which is where I would suggest having .sn processed, too, to turn it into the "encoded" form using a unicode character, the text, then the same unicode character again.

Then, in process: (a) for text: (a1) look for the unicode character as we look for .bn info, around line 2617. If found, generate the "between paragraphs" sidenote as you do today.

(a2) However, if you hit normal text, you'll fall into the call to self.doPara() at line 2629. 

(b) For HTML, as we do around line 5525, you'll detect the .sn info based on the unicode character and generate the "between paragraphs" HTML version of the sidenote. Or, you'll fall into the doPara call.

In doPara (text) you'll need to figure out what to do with an in-paragraph sidenote.

In doPara (HTML) you won't do anything; just let the .sn info wrap into the paragraph. For HTML the work for an in-paragraph set of .sn info will happen in postprocess, as it does for page numbers, around line 3432. You'll look for the unicode character + text + unicode character and if found turn it into the appropriate .
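
As a rough illustration of that postprocess step for HTML, the following Python sketch looks for marker + text + marker in an already-wrapped line and replaces it with the span markup proposed earlier in this thread. The marker value and the function name are placeholders, and this is not the code at the line numbers mentioned above.

```python
import re

# Placeholder marker; the real choice only needs to be absent from the source.
MARK = "\u24eb"
SN_RE = re.compile(MARK + "(.*?)" + MARK)


def sidenotes_to_html(line):
    """Replace every encoded in-paragraph sidenote in one output line."""
    def repl(match):
        return ('<span class="sni"><span class="hidev">|</span>'
                + match.group(1)
                + '<span class="hidev">|</span></span>')
    return SN_RE.sub(repl, line)
```

One limitation of this sketch: it works line by line, so a sidenote whose text was wrapped across two output lines would be missed unless the paragraph is joined first.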

That's my set of thoughts for now.

Thanks, Walt

wf49670 commented 9 years ago

For a unicode character to use for marking the .sn info I suggest \u24EB NEGATIVE CIRCLED NUMBER ELEVEN ⓫ (I'm not sure if that will show up properly in this message or not, so here's a link to a site that will show it to you: http://www.fileformat.info/info/unicode/char/search.htm?q=\u24eb&preview=entity)
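
The character can also be checked locally with Python's standard `unicodedata` module. The helper `marker_is_free` is a hypothetical name, shown only to illustrate verifying that the marker does not already occur in a book's source.

```python
import unicodedata

MARK = "\u24eb"

# Look up the official Unicode name of the suggested marker.
print(unicodedata.name(MARK))  # prints: NEGATIVE CIRCLED NUMBER ELEVEN


def marker_is_free(source_text, marker=MARK):
    """True if the marker never appears in the source, so it is safe to use."""
    return marker not in source_text
```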

wf49670 commented 9 years ago

A comment on the potential "align=" operand: I'm not certain it's worthwhile. Books often have the sidenotes floated to the outside edge or the inside edge (thus alternating floating left or right depending on the page they're on) but we don't have physical pages in a browser or in the text format, and having them alternate as one scrolls the browser window could be annoying. Even worse, perhaps, we can't predict how an ereader device will paginate, and we could have them on the "wrong" side, or even alternated on the same page (one near the top of the displayed "page" on one side, and one near the bottom on the other side).

So, perhaps just leaving them floated left is fine. Or we might allow a special register via .nr that would set them for the entire book to either be on the left or the right, as the PPer prefers. (Having a register is probably where I would start.)

But if we want to have an "align=" operand on .sn, we may need to consider implementing .sn as:
    .sn
    sidenote text
    .sn-

Walt

wf49670 commented 9 years ago

Another special register could also be used to specify the sidenote min/max width (perhaps defaulting to 9em as in the Wiki). Or, on its preprocessing pass through the text, ppgen could examine all the sidenote text, find the longest word across all sidenotes, and use that length/2 as the em specification for the min and max width by default.
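
The "longest word / 2" idea could look roughly like this in Python. The function name and the fallback argument are assumptions; the 9em fallback is the Wiki default mentioned above.

```python
def default_sidenote_width_em(sidenotes, fallback=9.0):
    """Return longest-word-length / 2 as an em width, or the fallback if no words."""
    longest = max(
        (len(word) for note in sidenotes for word in note.split()),
        default=0,
    )
    return longest / 2 if longest else fallback
```

For instance, if the longest word in any sidenote has 13 characters, this yields 6.5, which would be used as both the min and max width in em.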

(Note: I realize that rather than having these special registers we could simply depend on the PPer using .de to adjust the CSS float and width values, but that only covers the HTML/epub/mobi, and not the text version. That's something I keep having to remember, and it has complicated things several times before.)

wf49670 commented 9 years ago

I revised the Sidenote branch to update the version number to 3.46hSn, and I restructured line 5362 (add css for [1500]) for readability.

davem2 commented 9 years ago

(Note: I realize that rather than having these special registers we could simply depend on the PPer using .de to adjust the CSS float and width values, but that only covers the HTML/epub/mobi, and not the text version. That's something I keep having to remember, and it has complicated things several times before.)

How would float left/right or min/max width affect the text version? I assumed that sidenotes in text will always be in the form:

[Sidenote: This is a sidenote]

davem2 commented 9 years ago

Passes: First thing to know: Latin-1, UTF-8, and HTML outputs are done as separate passes over the source, reloading the source for each.

For each text pass we have: ... That's my set of thoughts for now.

Thanks Walt, this info was very helpful.

A comment on the potential "align=" operand: I'm not certain it's worthwhile.

I'll leave this out for now.

wf49670 commented 9 years ago

On 2/10/2015 4:11 PM, davem2 wrote:

(Note: I realize that rather than having these special registers
we could simply depend on the PPer using .de to adjust the CSS
float and width values, but that only covers the HTML/epub/mobi,
and not the text version. That's something I keep having to
remember, and it has complicated things several times before.)

How would float left/right or min/max width affect the text version? I assumed that sidenotes in text will always be in the form:

[Sidenote: This is a sidenote]

Hmmm. Excellent question.

I suppose we don't really need to allow wrapping of the sidenote to a specific width, and that would really complicate things (especially when we allow them inside paragraphs).

For reference, by the way, here's a link to the full HTML of the book that Charlie used when he developed that Wiki page: http://www.gutenberg.org/files/41811/41811-h/41811-h.htm and here's the text version: http://www.gutenberg.org/cache/epub/41811/pg41811.txt

In the text he has sidenotes as [SN: text] and it's just a continuous string of text. When it's within a paragraph it's just right there, inline, with that style of markup and is wrapped to the proper line-length as though it were regular text.
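
A small Python sketch of that inline approach, assuming in-paragraph sidenotes have already been encoded as marker + text + marker. The marker, the function name, and the default width of 72 are illustrative assumptions; wrapping uses the standard `textwrap` module rather than ppgen's own wrapping code.

```python
import re
import textwrap

MARK = "\u24eb"  # assumed marker character
SN_RE = re.compile(MARK + "(.*?)" + MARK)


def wrap_paragraph_with_inline_notes(para, width=72, label="SN"):
    """Turn encoded sidenotes into [SN: text] and wrap them as ordinary text."""
    text = SN_RE.sub(lambda m: "[{}: {}]".format(label, m.group(1)), para)
    # Collapse whitespace so the sidenote flows with the surrounding words.
    return textwrap.fill(" ".join(text.split()), width=width)
```

Because the bracketed note is treated as regular words, it may be split across two output lines, which matches the "continuous string of text" behaviour described above.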

So, we could just keep it simple for the text, and have the PPer adjust the HTML by using ".de sidenote" if he wants them all on the right instead of the left, or all to be a different min/max width. (Or, as I mentioned before, we could calculate a width based on the actual sidenotes in the text during a preprocessing pass.) No registers needed, except for the one you put in for the "Sidenote" label itself for the text.

Walt

davem2 commented 9 years ago

In the text he has sidenotes as [SN: text] and it's just a continuous string of text. When it's within a paragraph it's just right there, inline, with that style of markup and is wrapped to the proper line-length as though it were regular text.

Interesting.. I didn't know that was an acceptable representation in text (it's not valid during formatting rounds). Handling the text versions in this way would simplify implementation considerably. I'll ask Tom if there is some other reason he requested that sidenotes inside of a paragraph be relocated to the beginning of the paragraph.

wf49670 commented 9 years ago

On 2/10/2015 5:35 PM, davem2 wrote:

In the text he has sidenotes as [SN: text] and it's just a
continuous string of text. When it's within a paragraph it's just
right there, inline, with that style of markup and is wrapped to
the proper line-length as though it were regular text.

Interesting.. I didn't know that was an acceptable representation in text (it's not valid during formatting rounds). Handling the text versions in this way would simplify implementation considerably. I'll ask Tom if there is some other reason he requested that sidenotes inside of a paragraph be relocated to the beginning of the paragraph.

Basically whatever the PPer wants is OK for sidenote placement, as far as I know.

The problem with floating them to the top of the paragraph is that many (most?) are location-sensitive, and if you move them you lose their context and much of the meaning.

I kind of like what Charlie did. But if a PPer did not want them embedded exactly that way then perhaps we could allow an option (via a register) to split them differently.

For example, rather than

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor [SN: sidenote text]in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

we might allow the PPer to request

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor [SN: sidenote text] in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Walt