daisy / ebraille

Repository for developing use cases and a standard for digital braille

Use braille files produced by other organisations #2

Closed: jrbowden closed this issue 3 months ago

jrbowden commented 2 years ago

As a braille display user or braille organisation, I want to be able to read or emboss braille documents produced by organisations in different countries with the same or similar ease to reading or embossing documents produced in my own country, confident that the braille characters will appear correctly.

Detail: Assuming I know the braille code and written language of the document, I want to be able to make use of braille files produced by any organisation and read or emboss them, knowing that the braille characters will appear correctly.

Currently, different countries may use different correspondences between the characters stored in a braille file and the dots they represent. If the originating organisation and the receiving device use different encodings, the file is effectively unusable.
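
For illustration, here is a rough Python sketch of the problem (the byte value and dot assignments below are invented placeholders, not taken from any real national braille ASCII table): the same stored byte can stand for different dot patterns depending on which table the reader assumes.

```python
# Hypothetical illustration only: the mappings below are invented,
# not drawn from any real national braille ASCII table.
TABLE_COUNTRY_A = {0x28: "dots 1-2-3-5-6"}  # how country A reads the byte 0x28
TABLE_COUNTRY_B = {0x28: "dots 2-3-6"}      # how country B reads the same byte

byte = 0x28
print(f"byte {byte:#04x} -> A: {TABLE_COUNTRY_A[byte]}, B: {TABLE_COUNTRY_B[byte]}")
# A device expecting table B will emboss the wrong dots for a file made with table A.
```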

Proposal:

The characters stored in braille files should be standardised across the world so that all organisations use the same encoding to represent braille in braille files. One possibility would be to use the Unicode braille characters (U+2800 to U+283F for six-dot braille, or U+2800 to U+28FF for eight-dot braille).
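
As a sketch of how the Unicode option works: each braille cell is the base code point U+2800 plus one bit per raised dot (dot 1 = 0x01 through dot 8 = 0x80), so the mapping from dots to characters is fixed and independent of language or national table.

```python
# Build a Unicode braille character from a set of raised dots.
# Dots 1-8 correspond to the bits 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80,
# added to the base code point U+2800 (the blank braille pattern).
DOT_BITS = {1: 0x01, 2: 0x02, 3: 0x04, 4: 0x08, 5: 0x10, 6: 0x20, 7: 0x40, 8: 0x80}

def dots_to_unicode(dots):
    """Return the Unicode braille character for the given dot numbers."""
    return chr(0x2800 + sum(DOT_BITS[d] for d in dots))

# 'a' (dot 1), 'b' (dots 1-2), 'c' (dots 1-4):
print(dots_to_unicode([1]), dots_to_unicode([1, 2]), dots_to_unicode([1, 4]))  # ⠁ ⠃ ⠉
```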

franciscoONCE commented 2 years ago

I consider this to be of paramount importance. It maps well onto the use of Unicode braille characters in PEF files. This might make it possible to produce ready-to-emboss PEF files from this new eBraille format simply by supplying the specific embossing requirements in terms of page structure. It would also guarantee that the braille characters in the file display in exactly the same way regardless of device, country, braille character set, etc.
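
For context, a minimal sketch of what that could look like (element names follow the published PEF vocabulary, where braille lines are Unicode braille text inside `<row>` elements; treat the details as indicative, not normative):

```python
# Indicative sketch: wrap pre-transcribed Unicode braille lines in PEF-style <row>
# elements within a <page>. Real PEF files also carry <volume>, <section>, metadata
# and embossing parameters (rows/columns per page), which are omitted here.
rows = ["\u2801\u2803\u2809", "\u2819\u2811\u280b"]  # "abc" and "def" as braille cells
page = "<page>\n" + "\n".join(f"  <row>{r}</row>" for r in rows) + "\n</page>"
print(page)
```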

mattgarrish commented 1 year ago

This doesn't strike me as a technical problem for any of the publishing formats. It's about getting agreement on how to encode the braille text.

bertfrees commented 1 year ago

The character set used to represent the braille could even be specified in metadata, if we can come up with a standard way to do that.

Note that it will also have to be obvious whether the content is braille (pre-transcribed) or not, if both are allowed. This could for example be specified in dc:language metadata, or in xml:lang attributes (a BCP 47 language tag can express that the content is braille, see https://github.com/daisy/ebraille/issues/25#issuecomment-1273256658).
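
A small sketch of that idea: the ISO 15924 script code for braille is "Brai", so a BCP 47 tag such as en-Brai can mark pre-transcribed English braille. The helper below is a simplified check, not a full BCP 47 parser.

```python
# Simplified check for pre-transcribed braille content via a BCP 47 language tag.
# Assumes the script subtag, if present, directly follows the primary language
# subtag (a full parser would also handle extlang subtags and the like).
def is_braille_tag(tag: str) -> bool:
    subtags = tag.split("-")
    return len(subtags) > 1 and subtags[1].lower() == "brai"

print(is_braille_tag("en-Brai"))  # True: English content transcribed to braille
print(is_braille_tag("en"))       # False: plain English text, not yet transcribed
```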

Menelion commented 1 year ago

What are y’all’s points against Unicode Braille? I see it now as the only sane way of encoding Braille independently of the language, character set or Braille table being used. All major players on Windows seem to support it (needs investigation for other operating systems).

jrbowden commented 1 year ago

Points against Unicode are that, because of the code point values involved, it takes two or three times as much storage space as ASCII braille, depending on how the file is encoded (e.g. UTF-16 or UTF-8).

Let's take the representation of the braille characters ABC: dot 1, dots 1-2, and dots 1-4.

In ASCII: 0x41 0x42 0x43 (total size: 3 bytes)

In Unicode, encoded as UTF-16LE: 0x01 0x28 0x03 0x28 0x09 0x28 (total size: 6 bytes)

In UTF-8: 0xe2 0xa0 0x81 0xe2 0xa0 0x83 0xe2 0xa0 0x89 (total size: 9 bytes)

So this quickly magnifies the size of the storage requirements.
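
These sizes are easy to verify, assuming the file holds only the braille characters themselves with no markup:

```python
# Byte counts for "ABC" as ASCII braille versus the Unicode braille cells U+2801 U+2803 U+2809.
braille = "\u2801\u2803\u2809"  # ⠁⠃⠉
print(len("ABC".encode("ascii")))        # 3 bytes
print(len(braille.encode("utf-16-le")))  # 6 bytes
print(len(braille.encode("utf-8")))      # 9 bytes
```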

ASCII encoding is the most concise, so long as the braille table is clearly understood - but that braille table is the sticking point.

It is worth noting that the fonts supplied as standard in Windows may not display the braille space with the correct width.

So, Unicode is a good option, but it is, at this stage, one option. Hope this helps.

Menelion commented 1 year ago

@jrbowden Thank you, those are valid points. As a Unicode advocate I'd still want to drive the discussion further.

First, isn't space super cheap now? No one measures in megabytes anymore; it's 8 GB minimum, and that's for an SD card.
Second, the point about Windows fonts is the most valid one, but couldn't we standardise a font to be supplied with the corresponding software? Or, if there is a built-in font that displays the braille space correctly, require its use via metadata or similar?

ManfredMuchenberger commented 1 year ago

I am quite new to processes that involve such large discussion groups, so you'll have to forgive me if it's unusual to point this out. But I don't see any new arguments here so far that would make it necessary to revisit our decision from the meeting of 2022-11-29 to use Unicode braille characters. Link to meeting notes: https://github.com/daisy/ebraille/wiki/eBraille-Working-Group-Meeting-notes

jrbowden commented 1 year ago

Yes, space is cheap in mainstream, full-function computers and tablets, but remember that braille displays often do not have those memory capacities. That is just something hardware manufacturers have to take into account. Space is not the only factor: three times as much space also implies more processor power to work through the extra data. Please note, I am not pushing either approach, just pointing out pros and cons. Hope this helps.

wfree-aph commented 3 months ago

Since the first draft of the specification specifies Unicode braille and a uniform packaging format, the requirements of this feature are satisfied. Closing this issue as resolved.