nlbdev / nordic-epub3-dtbook-migrator

Tools for converting between a strict subset of DTBook and EPUB3.
http://nlbdev.github.io/nordic-epub3-dtbook-migrator/
GNU Lesser General Public License v2.1

Updated handling of line-by-line content. #454

Closed kalaspuffar closed 3 years ago

kalaspuffar commented 3 years ago

Hi @martinpub

This is a solution for handling line-by-line content. It allows constructions with the linenum either inside the line or outside it, directly in the linegroup.

Examples of allowed constructions:

<div class="verse">
    <p class="linegroup"><span class="line"><span class="linenum">1</span> This is line one</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">2</span> This is line two</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">3</span> This is line three</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">4</span> This is line four</span></p>
</div>

<div class="verse">
    <p class="linegroup"><span class="linenum">1</span><span class="line"> This is line one</span></p>
    <p class="linegroup"><span class="linenum">2</span><span class="line"> This is line two</span></p>
    <p class="linegroup"><span class="linenum">3</span><span class="line"> This is line three</span></p>
    <p class="linegroup"><span class="linenum">4</span><span class="line"> This is line four</span></p>
</div>

<div class="line-by-line">
    <p class="linegroup"><span class="line"><span class="linenum">1</span> This is line one</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">2</span> This is line two</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">3</span> This is line three</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">4</span> This is line four</span></p>
</div>

<div class="line-by-line">
    <p class="linegroup"><span class="linenum">1</span><span class="line"> This is line one</span></p>
    <p class="linegroup"><span class="linenum">2</span><span class="line"> This is line two</span></p>
    <p class="linegroup"><span class="linenum">3</span><span class="line"> This is line three</span></p>
    <p class="linegroup"><span class="linenum">4</span><span class="line"> This is line four</span></p>
</div>

Trying to solve https://github.com/nlbdev/nordic-epub3-dtbook-migrator/issues/449

martinpub commented 3 years ago

Thanks @kalaspuffar,

I think it looks good, but I think we should always require that <span class="linenum"> be a child of <span class="line">. What do you think, @AndersEkl? To me that makes more sense semantically: the line number is part of the line's contents, although kept separate from the rest. So in your examples, @kalaspuffar, I would say numbers 2 and 4 should not be considered valid.

AndersEkl commented 3 years ago

@martinpub I see your point, but having them in separate <span> elements next to each other also has some advantages. I think you'd have more flexibility with styling options. I don't know if we can say there are any semantic considerations here either way, as <span> has no semantic role in itself and we only add class attributes. Are there any obvious problems with accepting both options?

kalaspuffar commented 3 years ago

Hi @AndersEkl and @martinpub

If we aren't required to support both, then parsing and handling the data will be easier for consumers. So if we don't see a real benefit in having both options, I'm all for removing one of them.

Best regards Daniel

AndersEkl commented 3 years ago

@kalaspuffar I see... So, we could see <span class="line"> either as:

How were line numbers handled in DTBook? I don't remember, as we always stripped them away.

martinpub commented 3 years ago

How were line numbers handled in DTBook? I don't remember, as we always stripped them away.

Hmm ... I think my gut feeling is grounded in DTBook semantics, but I'm not sure. Will check it and get back here.

martinpub commented 3 years ago

2015-1 guidelines:

3.1.4.18 Linegroup: <div class="linegroup">

The <div class="linegroup"> tag is used to preserve the formatting of text grouped into line sets. The <p class="line"> tag is used to wrap the individual lines within the linegroup. The inclusion of a <span> element is required when line numbering is present. The attribute class="linenum" must be applied to the <span> element containing the number content. One example of books requiring this type of markup is language textbooks.

AndersEkl commented 3 years ago

Ah, right... <p> was used for each line. That means you had to put the line number inside the <p> element. I always had an issue with <p> being used for something that often wasn't an actual paragraph. I think that is semantically incorrect. But, that is not the issue here... I don't know what I think here. Like I said, I see some advantages with having line numbers separate from the line itself. But I can't really say that one thing is correct and the other is incorrect.

martinpub commented 3 years ago

From DTBook DTD (2005-3):

<!ELEMENT line (%inline; | linenum)* >

I suggest, then, that we stick to the DTBook-derived semantics, which I think are implied in the 2015-1 Nordic guidelines. That means less confusion for colleagues as well as for external producers.

I think CSS is flexible enough to give good opportunities anyway. Or do you have a specific situation in mind @AndersEkl?

AndersEkl commented 3 years ago

It would still be inline with either option. But I guess it would be convenient to have something that resembles the DTBook markup. I'm fine with either, to be honest. I was just trying to see if there was anything we should consider here.

I was thinking of maybe having a styling so that the actual text lines have a straight left margin and the numbers sort of floating outside of it. Maybe you could still achieve that? Not sure.

kalaspuffar commented 3 years ago

Hi @AndersEkl and @martinpub

IMHO we should keep semantics and styling separate. Most things are achievable in CSS without changing the markup. The only exception might be when you have too few elements.

In the specification, the semantics of what we want to express are the important part; styling is a separate issue.

Best regards Daniel

martinpub commented 3 years ago

I was thinking of maybe having a styling so that the actual text lines have a straight left margin and the numbers sort of floating outside of it. Maybe you could still achieve that? Not sure.

.linenum {
  float: left;
}

or something like that? I think a CSS ninja (that is, not me) could fix it quite elegantly.

AndersEkl commented 3 years ago

@kalaspuffar Yes, I would never let any styling considerations mess up semantics. In this case, however, the semantics are pretty much unspecified. But, let's go with Martin's gut feeling here. :)

egli commented 3 years ago

I'd also prefer if we stick to DTBook-derived semantics, i.e. linenum as part of line. The DAISY 3 Structure Guidelines have a section about Line Numbers where they give the following example:

<p>
  <line><linenum>1</linenum> There was a little girl</line>
  <line><linenum>2</linenum> Who had a little curl</line>
</p>

So, I think we should only allow examples 1 and 3 above.
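To illustrate why keeping linenum inside line stays close to DTBook, here is a minimal Python sketch of the element mapping. It is not the migrator's actual conversion (this thread does not show that); the function name `dtbook_to_html_spans` is hypothetical, and the sketch assumes un-namespaced fragments and only renames `line` and `linenum` while leaving everything else untouched.

```python
import xml.etree.ElementTree as ET


def dtbook_to_html_spans(fragment):
    """Rename DTBook <line>/<linenum> to <span class="..."> in place.

    Hypothetical helper for illustration only: nesting, text and
    whitespace are preserved exactly, so a linenum that was a child
    of line in DTBook stays a child of span.line in the output.
    """
    root = ET.fromstring(fragment)
    for el in root.iter():
        if el.tag in ("line", "linenum"):
            el.set("class", el.tag)  # remember the DTBook name as a class
            el.tag = "span"
    return ET.tostring(root, encoding="unicode")


daisy_example = (
    "<p>"
    "<line><linenum>1</linenum> There was a little girl</line>"
    "<line><linenum>2</linenum> Who had a little curl</line>"
    "</p>"
)
print(dtbook_to_html_spans(daisy_example))
```

Because the structure carries over one-to-one, the result has the same shape as examples 1 and 3: each linenum span sits inside its line span.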

kalaspuffar commented 3 years ago

Small update so now we only support these two cases. All attributes are optional.

<div class="verse" id="test_1" title="This is a poem" xml:space="default" xml:lang="en" lang="en" dir="ltr">
    <p class="linegroup"><span class="line"><span class="linenum">1</span> This is line one</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">2</span> This is line two</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">3</span> This is line three</span></p>
    <p class="linegroup"><span class="line"><span class="linenum">4</span> This is line four</span></p>
</div>

<div class="line-by-line">
  <p class="linegroup"><span class="line"><span class="linenum">1</span> This is line one</span></p>
  <p class="linegroup"><span class="line"><span class="linenum">2</span> This is line two</span></p>
  <p class="linegroup"><span class="line"><span class="linenum">3</span> This is line three</span></p>
  <p class="linegroup"><span class="line"><span class="linenum">4</span> This is line four</span></p>
</div>
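
For consumers who want to check the rule themselves, the following is a rough Python sketch, not the validation code in this pull request. It assumes un-namespaced fragments like the ones above (real EPUB content documents are namespaced XHTML, which this does not handle), and the names `has_class` and `misplaced_linenums` are made up for the example.

```python
import xml.etree.ElementTree as ET


def has_class(el, name):
    """True if the element's class attribute contains the given token."""
    return name in el.get("class", "").split()


def misplaced_linenums(fragment):
    """Return the text of every span.linenum whose parent is not span.line."""
    root = ET.fromstring(fragment)
    problems = []
    for parent in root.iter():
        parent_is_line = parent.tag == "span" and has_class(parent, "line")
        for child in parent:
            is_linenum = child.tag == "span" and has_class(child, "linenum")
            if is_linenum and not parent_is_line:
                problems.append(child.text)
    return problems


supported = (
    '<div class="line-by-line">'
    '<p class="linegroup"><span class="line">'
    '<span class="linenum">1</span> This is line one</span></p>'
    '</div>'
)
dropped = (
    '<div class="line-by-line">'
    '<p class="linegroup"><span class="linenum">2</span>'
    '<span class="line"> This is line two</span></p>'
    '</div>'
)
print(misplaced_linenums(supported))
print(misplaced_linenums(dropped))
```

An empty list means every line number is nested inside its line; any entries are the numbers that sit directly in the linegroup, which is the construction that is no longer supported.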

martinpub commented 3 years ago

Thanks for your input everyone! Will merge this now.