sbsdev / pipeline-mod-sbs

SBS specific modules for the DAISY Pipeline 2

More control over translation through CSS #38

Closed bertfrees closed 7 years ago

bertfrees commented 8 years ago

See requirement [4.3:45] The user shall have fine-grained control over the translation (on element-level) by means of configuration in those cases where markup of the input is inadequate or undesirable.

More information needed from @mixa72. What does this mean in practice? It is already possible to attach CSS styles to inline elements, notably with the "text-transform" property. However only a few values are supported at the moment. Without specific requirements we can't define or implement new values. Also, for each of the values we define we need to think about how custom styling (CSS) and default behavior (currently in XSLT) will interact.

mixa72 commented 8 years ago

Currently, we have 2 possibilities to render text in a lower contraction grade than the surrounding text (downgrade):

1) INLINE - the span element + brl:grade attribute: text inside this element is rendered according to the value of brl:grade ("0", "1"). The corresponding indicators at the beginning and end are inserted if the surrounding text is in grade 2 (Kurzschrift).

Example:

Ein Sprichwort: <span
brl:grade="0">Tout vient
à point à qui sait attendre</span>
(Zu dem, der warten kann,
kommt alles mit der Zeit).

Output (grade 0 inside grade 2):

 6 SPR#W?T: -.TOUT VIENT "[
 POINT "[ QUI SAIT ATTENDRE'.
 =Z [, R W)TC K, KXT A% T R
 ZT=. 

2) BLOCK - the blockquote, div, epigraph and poem element + xml:lang attribute with a value other than "de": text inside these elements is rendered with grade 0 (Basisschrift). The corresponding indicators at the beginning and end are inserted if the surrounding text is in grade 2 (Kurzschrift).

Example:

<p>... schrieb er folgendes Gedicht:</p>
<poem xml:lang="en">
  <line>If you want to get a favour done</line>
  <line>By some obliging friend,</line>
  <line>And want a promise, safe and sure,</line>
  <line>On which you may depend,</line>
  <line>Don't go to him who always has</line>
  <line>Much leisure time to plan,</line>
  <line>If you want your favour done,</line>
  <line>Just ask the busy man.</line>
</poem>

Output (grade 0 inside grade 2):

   ... ,5 7 FGCD% &D#T:

   -.IF YOU WANT TO GET A
     FAVOUR DONE
   BY SOME OBLIGING FRIEND,
   AND WANT A PROMISE, SAFE
     AND SURE,
   ON WHICH YOU MAY DEPEND,
   DON'T GO TO HIM WHO ALWAYS
     HAS
   MUCH LEISURE TIME TO PLAN,
   IF YOU WANT YOUR FAVOUR
     DONE,
   JUST ASK THE BUSY MAN.'.     

There's also the possibility to use a brl:literal element + brl:grade attribute inside a brl:select element, but this construct is very clumsy and should only be considered as a workaround solution.

Note: The above mentioned strategy to use grade 0 in general when a language other than German is involved (block elements) is now outdated and would need some refinement (e.g. we currently use grade 1 not 0 to render extracts in Swiss German and other German dialects; depending on the words/book genre even French and English extracts are rendered in grade 1 sometimes). Therefore, we need a more flexible solution in the future, allowing for the free choice of

This choice should not be restricted by the language. A CSS-based solution, possibly in combination with the pseudo-elements ::before/::after and the text-transform property, would be welcome, so that transcribers could also redefine or omit the indicators individually (a feature which could be used for many books with multilingual content).

bertfrees commented 8 years ago

I think the best way to go about this is to implement a new more generic and flexible solution, that allows you to easily implement the current behavior (brl:grade and xml:lang attributes) yourself through configuration, but also do ad-hoc customizations if needed.

You have to give me some feedback. Could it be simpler? For example, maybe just two top-level options for the opening and closing indicators are enough? Or is this not generic/flexible enough? There are no "technical" limitations. Anything is possible, as long as the behavior can be properly defined, in a logical way, including all corner cases (think of nested elements with text-transforms, adjacent siblings with the same or different text-transforms, etc.).

mixa72 commented 8 years ago

Thank you for these great proposals. You made some interesting points. I'll briefly go through them and let you know my thoughts.

Introducing new CSS styles to override the contraction grade seems an excellent solution to me (see your first point). With the CSS snippet you proposed we mimic the status quo and are free to modify the behavior in the future. Regarding the terminology, I have no idea what's best, but I lean toward a dedicated property "-sbs-grade", since "text-transform" is also used with strings/variables, if I remember correctly, and might therefore create some confusion (unfortunately, I couldn't check that, has the braille CSS spec site been moved to somewhere else? It's no longer available at http://snaekobbi.github.io/braille-css-spec/master/index.html). As for the values, I'd prefer "grade-0", "grade-1" over "basisschrift", "vollschrift" for the sake of consistency (apart from the WebUI, this information generally appears in numeric form). With a dedicated property "-sbs-grade", a simple number "0" or "1" would certainly be sufficient as well.

Regarding the indicators, I'd rather avoid defining them as top-level options and would prefer specifying them in the -sbs-grade property (-sbs-grade: 'grade' 'open-downgrade' 'close-downgrade'). As for the condition, I assumed indeed that a simple "if $contraction-grade == 2 { ... }" would suffice, but I fully understand your doubts about unpredictable behavior. You are also right that we basically want to keep the status quo: "only use indicators if the surrounding text is in grade 2". However, it would give us a lot more flexibility if this condition were handled in the CSS, where everybody could access it. But as you mentioned nested elements, there is one thing that still isn't clear to me with this approach:

if we have in the CSS

@if $contraction-grade == 2 {
  // surrounding text is in grade 2: mark downgrades with indicators
  span[brl|grade='0'] {
    -sbs-grade: 0 "-." "'.";
  }
  span[brl|grade='1'] {
    -sbs-grade: 1 "-." "'.";
  }
  blockquote[xml|lang]:not([xml|lang='de']),
  div[xml|lang]:not([xml|lang='de']),
  epigraph[xml|lang]:not([xml|lang='de']),
  poem[xml|lang]:not([xml|lang='de']) {
    -sbs-grade: 0 "-." "'.";
  }
}
@else {
  // surrounding text is not in grade 2: same grades, empty indicators
  span[brl|grade='0'] {
    -sbs-grade: 0 "" "";
  }
  span[brl|grade='1'] {
    -sbs-grade: 1 "" "";
  }
  blockquote[xml|lang]:not([xml|lang='de']),
  div[xml|lang]:not([xml|lang='de']),
  epigraph[xml|lang]:not([xml|lang='de']),
  poem[xml|lang]:not([xml|lang='de']) {
    -sbs-grade: 0 "" "";
  }
}

and the xml looks as follows

<div xml:lang="en">
  <div><p>bla</p></div>
  <div><p>bla</p></div>
  <div><p>bla</p></div>
</div>

will the indicators be applied inside each of the inner divs because they inherit the xml:lang from the outer div (= erroneous behavior) or only once inside the outer div (= correct behavior)?

If the behavior is as expected, we should give it a try perhaps. I can't imagine other corner cases at the moment.

As for the pseudo-elements I see now the difficulties, so let's drop them.

There is one more thing I forgot to mention: there are not only 2 but 3 indicators to be used:

1) opening for single words: '.
2) opening for multiple words: -.
3) closing for multiple words: '.

So, the dedicated property -sbs-grade: 'grade' 'open-downgrade' 'close-downgrade' should either be extended to -sbs-grade: 'grade' 'open-downgrade-single' 'open-downgrade' 'close-downgrade' or, maybe not very elegant, use the closing indicator for multiple as the opening indicator for single words (because they're identical by coincidence). I hope this additional fact won't conflict in any way with your great proposals. I could also provide you with more material or tests if needed.
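
As a rough illustration of the three indicators listed above, the following Python sketch picks between the single-word and the multi-word indicators. The function name `add_downgrade_indicators` and its parameters are hypothetical, and the input is assumed to be text that has already been translated to braille; this is not how the module itself is implemented.

```python
def add_downgrade_indicators(braille, single="'.", open_="-.", close="'."):
    """Wrap already-translated braille text in downgrade indicators.

    Hypothetical helper: a single word gets only the single-word opening
    indicator, several words get an opening and a closing indicator.
    Runs of whitespace are collapsed to single spaces.
    """
    words = braille.split()
    if not words:
        return braille  # nothing to mark
    if len(words) == 1:
        return single + words[0]
    return open_ + " ".join(words) + close


# Usage with placeholder braille strings
one_word = add_downgrade_indicators("TOUT")
phrase = add_downgrade_indicators("TOUT VIENT")
```

Keeping the three indicators as separate parameters means the sketch still works if the single-word opening and the multi-word closing indicator stop being identical.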

bertfrees commented 8 years ago

> unfortunately, I couldn't check that, has the braille CSS spec site been moved to somewhere else?

I have moved the specification to http://braillespecs.github.io/braille-css

bertfrees commented 8 years ago

Note that in case of a "smart" property -sbs-grade: <grade> <open-downgrade> <close-downgrade> ("smart" because it automatically detects a downgrade) you wouldn't need to do anything else, whereas with a "dumb" property -sbs-grade: <grade> <open> <close> you would have to do more advanced things like in your example CSS, because the open and close indicators would be inserted no matter what.

The benefit of a dumb property is that we don't make things unnecessarily complicated if there's no need for it.
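
The difference between the two variants can be sketched in Python as follows. Both function names are hypothetical, the braille input is a placeholder, and the "smart" condition is an assumption modelled on the status quo (indicators only when the surrounding text is in grade 2); what exactly counts as "surrounding text" is still open.

```python
def dumb_wrap(braille, open_, close):
    # "Dumb": the indicators are always inserted, whatever the context.
    return open_ + braille + close


def smart_wrap(braille, grade, surrounding_grade, open_, close):
    # "Smart": the indicators are inserted only for a downgrade out of
    # grade 2 (assumption based on the current behavior).
    if surrounding_grade == 2 and grade < 2:
        return open_ + braille + close
    return braille
```

With the dumb variant, the decision made inside `smart_wrap` has to be expressed in the style sheet instead, for example with an `@if` on the contraction grade plus exception rules for nested elements.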

With "nested elements", I meant for example:

<div xml:lang="en">
  bla <span brl:grade="1">bla</span>
</div>

It is obvious that if you want to handle this with a dumb property, you need an extra exception rule that prevents indicators from being inserted before and after the span.

With "adjacent siblings" I meant for example:

<div xml:lang="en">
  bla
</div>
<div xml:lang="en">
  bla
</div>

Do both divs get open and close indicators or should the two be treated as a single blob?

In case we go for a "smart" property, it is important to precisely define what we mean by "surrounding text".

You can think of more exotic examples where elements, for which you have defined different grades and/or indicators, interact (appear directly after or inside each other).

bertfrees commented 8 years ago

Regarding your proposal to have more arguments to specify the downgrade-indication: I don't see any problem with that. We can have a fallback mechanism where if you omit a certain argument it takes the same value as another argument, or defaults to a fixed value.

It's quite common to do that in CSS, think for example of the "border-style" property. Quoting the CSS spec:

‘border-style’ is a shorthand for the other four (border-top-style, border-right-style, border-bottom-style and border-left-style). Its four values set the top, right, bottom and left border respectively. A missing left is the same as right, a missing bottom is the same as top, and a missing right is also the same as top.
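
The fallback rule in the quoted passage can be written down as a short Python sketch. The function name `expand_border_style` is made up for the illustration; only the fallback order comes from the CSS rule quoted above.

```python
def expand_border_style(value):
    """Expand a 'border-style' shorthand into its four longhand values."""
    parts = value.split()
    if not 1 <= len(parts) <= 4:
        raise ValueError("expected one to four values")
    top = parts[0]
    right = parts[1] if len(parts) > 1 else top    # missing right = top
    bottom = parts[2] if len(parts) > 2 else top   # missing bottom = top
    left = parts[3] if len(parts) > 3 else right   # missing left = right
    return {"top": top, "right": right, "bottom": bottom, "left": left}


# Usage: two values set top/bottom and right/left
styles = expand_border_style("solid dotted")
```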

You could do something similar with the indicators:

* gets a new idea

I suddenly realize that there is a striking parallel between these arguments and a new Liblouis feature, namely "custom emphasis classes". I'm not saying we have to implement it with Liblouis, but I thought it would be worth noting because it can inspire us in various ways. For example we can have a similar way of configuring, this would improve compatibility and would make it easy to switch the implementation from XSLT to Liblouis later.

In Liblouis you can define a custom emphasis style, for example:

emphclass downgrade
begemphword downgrade '.
begemphphrase downgrade -.
endemphphrase downgrade '.

The emphasis can then be applied to a text segment by attaching a number that corresponds with the class to each character in the text segment.
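
To make the per-character marking concrete, here is a small Python sketch. It does not call Liblouis; it only builds the kind of per-character list described above. The class number `DOWNGRADE = 1` and the helper name `mark_segment` are made up for the illustration.

```python
DOWNGRADE = 1  # hypothetical number standing for the "downgrade" class


def mark_segment(text, start, end, class_number):
    """Return one number per character of text.

    Characters in the half-open range [start, end) get class_number,
    all other characters get 0 (no emphasis).
    """
    if not 0 <= start <= end <= len(text):
        raise ValueError("segment out of range")
    return [class_number if start <= i < end else 0 for i in range(len(text))]


# Usage: mark the French part of the sentence as downgraded
text = "Ein Sprichwort: Tout vient"
start = text.index("Tout")
marks = mark_segment(text, start, len(text), DOWNGRADE)
```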

A couple of ideas:

CC @egli

mixa72 commented 8 years ago

Thanks for your explanations. I guess I know what you mean now. As for nested elements:

<div xml:lang="en">
  bla <span brl:grade="1">bla</span>
</div>

I'd rather consider this case an "upgrading" inside a "downgrading" since in the div you downgrade to grade 0 through xml:lang="en" and inside the span, you upgrade to grade 1. I can't think of a possible use case in practice: the direction is always 2 -> 1, 2 -> 0, 1 -> 0 never the other way round (an absolute corner case is possibly a braille textbook about how to learn grade 2 when you already know grade 1, then you would need an upgrading, but without indicators). Let's also imagine a downgrading inside a downgrading i.e. 2 -> 1 -> 0 for which the code would look as follows (this case could occur, when you downgrade an extract in dialect to grade 1, but one word is hardly readable even in grade 1 so the transcriber decides to downgrade to grade 0):

<div brl:grade="1">
  bla <span brl:grade="0">bla</span>
</div>

In this case the outer downgrading (2 -> 1) requires indicators whereas the inner downgrading (1 -> 0) doesn't.

So, for nested elements we can formulate the rule: never use indicators.

As for adjacent siblings:

<div xml:lang="en">
  bla
</div>
<div xml:lang="en">
  bla
</div>

These two divs should be treated as a single blob.

So, for adjacent siblings we can formulate the rule: never repeat the indicators; use an opening indicator before the first and a closing indicator after the last sibling.
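
The two rules (nested elements and adjacent siblings) can be combined in one small Python sketch. It works on a flat list of (braille, grade) segments in document order, which is a simplifying assumption; the function name `render` is hypothetical, the braille strings are placeholders, and the single-word indicator is ignored here.

```python
def render(segments, base_grade=2, open_="-.", close="'."):
    """Join (braille, grade) segments with downgrade indicators.

    Indicators are added only where the text leaves or re-enters the
    base grade. Adjacent downgraded segments therefore form one blob,
    and a change between two lower grades (the nested case) gets no
    indicator at all.
    """
    out = []
    downgraded = False
    for braille, grade in segments:
        lower = grade < base_grade
        if lower and not downgraded:
            braille = open_ + braille  # first segment of a downgraded run
        elif not lower and downgraded:
            out[-1] += close  # the downgraded run ended before this segment
        downgraded = lower
        out.append(braille)
    if downgraded:
        out[-1] += close  # the run extends to the end of the text
    return " ".join(out)


# Usage: two adjacent divs in grade 0, then a nested 2 -> 1 -> 0 case
siblings = render([("BLA", 0), ("BLA", 0)])
nested = render([("A", 2), ("B", 1), ("C", 0), ("D", 2)])
```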

That's all I can say for the moment, but I will find out if there are more exceptions.

bertfrees commented 8 years ago

The part of the indicators is done. What we can do next is make the actual contraction grade fully configurable with CSS too.

mixa72 commented 8 years ago

1) There are hyphenated words in the downgraded text although I disabled hyphenation. This is possibly because the requirement http://snaekobbi.github.io/requirements/master/index.xhtml#4.3:43 hasn't been implemented yet.

2) It appears that the mapping of the CSS properties "single-word" and "close" is mixed up. This becomes noticeable when two different braille patterns are defined for them (by default, they have the same pattern).

bertfrees commented 8 years ago

1) But 4.3:43 is implemented. We even fixed a bug for it recently: https://github.com/sbsdev/pipeline-mod-sbs/issues/13. According to my tests it worked fine. Can I see your test?

2) You're right. It was a bug. Actually, the "close" property was completely ignored. But because "close" falls back to "single-word" when it isn't defined, it worked anyway. Do you think I should disable the fallback to make things clearer?
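
A minimal Python sketch of the fallback described in this comment, with a hypothetical function name and the property names used as dictionary keys:

```python
def resolve_indicators(single_word, open_, close=None):
    """Resolve the three indicators.

    A missing "close" falls back to "single-word", as described in the
    comment above. The function itself is only an illustration.
    """
    if close is None:
        close = single_word
    return {"single-word": single_word, "open": open_, "close": close}


# Usage: "close" omitted, so it takes the value of "single-word"
defaults = resolve_indicators("'.", "-.")
```

With this fallback, ignoring "close" entirely gives the same result as long as both indicators share one pattern, which is why the bug only showed up once two different patterns were defined.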

mixa72 commented 8 years ago

2) Yes, it's probably better if you disable the fallback then. BTW: Is it possible to also use Unicode braille instead of ASCII braille for the properties "single-word", "open" and "close"? I just noticed that we generally use Unicode braille for the properties "content", "border-left", "list-style-type", etc. and thought it would make sense to handle this in a uniform way.

bertfrees commented 8 years ago

OK. Yes that should also be possible as soon as https://github.com/sbsdev/pipeline-mod-sbs/issues/35 is redeployed (because brl:literal is used internally).

bertfrees commented 8 years ago

It turned out that 1) was simply because the "hyphenation" option doesn't work yet (issue #10).

mixa72 commented 8 years ago

OK, then I'll hold off on closing this until issue #10 is fixed.

bertfrees commented 8 years ago

I have fixed it, it is testable as soon as this issue (38) is set to "deployed".

mixa72 commented 7 years ago

Hyphenation point is still there. Will test later on.

bertfrees commented 7 years ago

@mixa72 I don't know what to do. I'm running this file that I got from you:

https://github.com/sbsdev/pipeline-mod-sbs/blob/6738c45f7614b6952123f50d7c7367d65fbb3a68/src/test/resources/test_downgrade_contracted_braille/test_downgrade_contracted_braille.xml

with hyphenation disabled:

https://github.com/sbsdev/pipeline-mod-sbs/blob/6738c45f7614b6952123f50d7c7367d65fbb3a68/src/test/xprocspec/test_dtbook-to-pef.xprocspec#L757

and I don't get any hyphenation points in the output. Are you running a new test?

mixa72 commented 7 years ago

Sorry, that was my fault. I failed to disable hyphenation in the top level settings assuming it was disabled by default for poems in the scss. Behavior is as expected now!

bertfrees commented 7 years ago

OK good. We can disable hyphenation for poems if you want.

mixa72 commented 7 years ago

No, that's not necessary at the moment. In the current system, hyphenation is enabled also for poems by default and the transcriber individually decides to keep or remove the hyphens depending on language and contraction grade. Should this policy change in the future, we will simply adjust the scss accordingly.

bertfrees commented 7 years ago

Moved upstream: https://github.com/daisy/pipeline-mod-braille/compare/0e7b136d27f42095066a6a8254c5b38c23a98a45...a25ae7d5cc0b5c8029d531565109c788202a895b