w3c / alreq

Documenting gaps and requirements for support of Arabic and Persian on the Web and in eBooks.

text/plain for RTL #257

Closed ebraminio closed 1 year ago

ebraminio commented 1 year ago

Hopefully this is not too far out of scope for this repository. I have had this idea for a while, and I just saw https://www.curiositry.com/blog.txt, a blog kept in a single text file, which is a somewhat novel way of using the browser. It made me wonder whether there could be some way to indicate the directionality of a plain text file served like that.

Right now, for example, we have

data:text/plain;charset=utf8,سیب.

(paste the whole line into the browser's URL bar)

which shows as

image

And if I drop charset=utf8, it becomes

data:text/plain,سیب

image

I wish there were a standardized way in browsers to specify direction in the MIME type as well, say like this:

data:text/plain;direction=rtl;charset=utf8,سیب.

or

data:text/plain;dir=rtl;charset=utf8,سیب.

so it can show it like this

image

(which currently shows it like the first screenshot and puts the dot in the wrong place)

So maybe dir={auto,rtl,ltr} could be suggested as a MIME parameter, so that RTL documents written in plain text can be shown correctly in the browser.
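To make the proposal concrete, here is a minimal Python sketch of how a consumer might read such a parameter out of a data: URL. The `dir` parameter is the hypothetical one suggested above and is not part of any standard; `parse_data_url` is an illustrative helper name, and the sketch ignores `;base64` payloads.

```python
from urllib.parse import unquote


def parse_data_url(url):
    """Split a data: URL into (mime type, parameters, decoded text).

    Simplified: assumes a percent-encoded (not base64) payload.
    """
    if not url.startswith("data:"):
        raise ValueError("not a data: URL")
    header, _, payload = url[len("data:"):].partition(",")
    mime, *raw_params = header.split(";")
    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key.strip().lower()] = value.strip()
    # unquote() decodes percent-escapes as UTF-8 by default.
    return mime or "text/plain", params, unquote(payload)


# "dir" is the hypothetical parameter from this proposal.
mime, params, text = parse_data_url(
    "data:text/plain;dir=rtl;charset=utf8,%D8%B3%DB%8C%D8%A8."
)
# Fall back to "ltr" when the parameter is absent (an assumed default).
direction = params.get("dir", "ltr")
```

A renderer that honored the parameter would then apply `direction` as the base direction of the whole document, much like dir="rtl" on <html>.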

And maybe writing-mode={horizontal-tb,vertical-rl,vertical-lr,sideways-rl,sideways-lr} later as well, for vertically written languages, but I guess that isn't needed as much.

Or maybe the whole thing isn't worth the hassle, and the https://www.curiositry.com/blog.txt blogging mechanism is niche and should be kept for LTR scripts only, but at least it has been discussed somewhere.

(I know about the right-to-left mark and other bidi control characters, but my concern is the general layout, like applying dir="rtl" to <html> or <body>, which flips the whole layout.)

shervinafshar commented 1 year ago

I was wondering what if there was some way to indicate directionality of plain text file.

Technically, there are ways to indicate directionality in plain text, as explained in UAX #9 section 2.7 (and the few sections before it): https://www.unicode.org/reports/tr9/#Markup_And_Formatting

But having said that, the premise of plain text presentation formats (Markdown et al.) is ease of use for people who find markup too hard to use. So recommending that people add directional control characters might not get much traction.
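As a rough illustration of that approach, the Python sketch below prefixes every non-empty line with U+200F RIGHT-TO-LEFT MARK, a strong right-to-left character. It only helps where the renderer picks each paragraph's direction from its first strong character, which is an assumption about the viewer and not something every browser does for text/plain. `mark_lines_rtl` is an illustrative name.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class "R" (strong right-to-left)


def mark_lines_rtl(text):
    """Prefix each non-empty line with RLM so its first strong character is RTL."""
    return "\n".join(RLM + line if line else line for line in text.split("\n"))


marked = mark_lines_rtl("\u0633\u06cc\u0628.\n===")
# The "===" line now starts with a strong RTL character.
rlm_class = unicodedata.bidirectional(RLM)
```

The cost is exactly the one raised in this thread: the author has to insert invisible characters by hand or run the file through a tool.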

which shows as

I think this is your browser (most likely Chrome) being silly. Here's how it shows up in FF 104.0.1: https://user-images.githubusercontent.com/875962/188677333-38256fdd-d3d0-4d1c-9c0d-dd8b13aa5d10.png

ebraminio commented 1 year ago

Oh, so this has been thought about; let's close this. Thanks :)

ebraminio commented 1 year ago

This is what happens in Firefox:

data:text/plain;charset=utf8,%D9%86%D9%85%D8%AA%D9%85%0A%D9%85%D9%86%D8%AA%D9%85%D8%AA%0A===%0A%D9%85%D9%86%D8%AA%D9%85%D9%86%D8%AA

image

I wish something more could be done here than relying on Unicode marks.

shervinafshar commented 1 year ago

I'm interpreting your comment as "Why is that === aligned on the left rather than the right?". This is expected behavior per UAX #9: = (U+003D) is bidi-neutral. Therefore, according to my reading of the UAX #9 algorithm, the line just falls back to the default direction of the canvas, which is left-to-right. It gets even sillier:
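The fallback can be checked with Python's `unicodedata` module. The sketch below is a simplified take on the first-strong rule in UAX #9 (it ignores isolates and embeddings), and `first_strong_direction` is an illustrative name, not an API of any browser or library.

```python
import unicodedata


def first_strong_direction(line):
    """Return "rtl" or "ltr" from the first strong character, or None if there is none."""
    for ch in line:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-type and Arabic-type letters
            return "rtl"
        if bidi_class == "L":  # Latin and other left-to-right letters
            return "ltr"
    return None  # only neutral or weak characters: the caller must pick a default


# "=" is class "ON" (other neutral), so a line of "===" has no direction of its own.
equals_class = unicodedata.bidirectional("=")
separator = first_strong_direction("===")
mixed = first_strong_direction("This \u06cc\u0639\u0646\u06cc")
persian = first_strong_direction("\u062e\u0637 \u0633\u0648\u0645")
```

Here `separator` is `None`, so the line takes whatever default the viewer uses, while the mixed line resolves to left-to-right because its first strong character is Latin.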

data:text/plain;charset=utf8,This %DB%8C%D8%B9%D9%86%DB%8C %D8%A7%DB%8C%D9%86%0A%0A%D8%AE%D8%B7 %D8%B3%D9%88%D9%85%0A===%0A%D8%AE%D8%B7 %DA%86%D9%87%D8%A7%D8%B1%D9%85

image

But this is an issue for this esoteric usage, which relies on the browser to render RTL text correctly with almost zero directional context, and that makes the browser quite unfit for such usage. Sorry if that doesn't do much for your enthusiasm.