Closed SafaAlfulaij closed 4 years ago
On Jul 22, 2020, at 22:24, Safa Alfulaij notifications@github.com wrote:
https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ubidi_8h.html It's massive, but maybe?
Indeed, but it could be done. If you had python wrappers around this API, how would you use it ? In particular, it would help me understand how this API is used. Could you please propose some python pseudo-code using this API that I could then reuse for writing tests or an example ?
Thanks ! Andi..
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.
Sorry for the late reply.
I believe we can use something similar to what node-icu-bidi does.
layout = BidiLayout() # ubidi_open and init
layout.setPara("input text goes here") # ubidi_setPara, options at a later stage
level = layout.getParaLevel() # ubidi_getParaLevel
direction = layout.getDirection() # ubidi_getDirection
charLevels = []
for x in range(layout.getProcessedLength()): # ubidi_getProcessedLength
charLevels.append( layout.getLevelAt(x) ) # ubidi_getLevelAt
# Or charLevels = layout.getProcessedArray()
runs = []
for x in range(layout.countRuns()): # ubidi_countRuns
runs.append(layout.getVisualRun()) # ubidi_getVisualRun
This is the core functionality.
Great, thank you, this is useful !
On Jul 25, 2020, at 22:11, Safa Alfulaij notifications@github.com wrote:
Sorry for the late reply.
I believe we can use something similar to what node-icu-bidi does.
layout = BidiLayout() # ubidi_open and init layout.setPara("input text goes here") # ubidi_setPara, options at a later stage
level = layout.getParaLevel() # ubidi_getParaLevel direction = layout.getDirection() # ubidi_getDirection
charLevels = [] for x in range(layout.getProcessedLength()): # ubidi_getProcessedLength charLevels.append( layout.getLevelAt(x) ) # ubidi_getLevelAt
Or charLevels = layout.getProcessedArray()
runs = [] for x in range(layout.countRuns()): # ubidi_countRuns runs.append(layout.getVisualRun()) # ubidi_getVisualRun This is the core functionality.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.
I just checked in a first pass at this that covers the entire API except for the subclassing support. Here are the differences between what you proposed and what I ended up implementing:
layout = BidiLayout() # ubidi_open and init layout.setPara("input text goes here") # ubidi_setPara, options at a later stage
layout = Bidi() layout.setPara(UnicodeString("input text goes here")) // I made using UnicodeString a requirement here because the Bidi object // needs it for its entire lifespan and I didn't want to make repeated // copies of the para text. The layout Bidi instance maintains a reference // to it. You can access also it via layout.text. // Note that line = layout.setLine(start, limit) also has a text property // that shares the characters of layout's text. A line Bidi instance keeps // track of its parent and maintains a reference to it. You can access it // via line.parent.
level = layout.getParaLevel() # ubidi_getParaLevel direction = layout.getDirection() # ubidi_getDirection
// Unchanged.
charLevels = [] for x in range(layout.getProcessedLength()): # ubidi_getProcessedLength charLevels.append( layout.getLevelAt(x) ) # ubidi_getLevelAt
Or charLevels = layout.getProcessedArray()
levels = layout.getLevels() // levels is a tuple of 'processed length' level values. // You can also run the for loop but it's going to be vastly slower.
runs = [] for x in range(layout.countRuns()): # ubidi_countRuns runs.append(layout.getVisualRun()) # ubidi_getVisualRun
logicalStart, length, direction = layout.getVisualRun(x) // runs are returns as triples // You should be able to use layout.text[logicalStart:logicalStart+length] // to get the implied text.
It would be nice if you could propose some unit tests for this new code and the functionality covered by the API. There are a bunch of interesting functions in there. Please, take a look at the new bidi.cpp source file for all the API possibilities.
Thanks !
Andi..
You are awsome man! You are the C++-tar!
I will provide you with some unit tests once I take a good look on (and check) the API and the whole class.
Thanks a lot! You are truly awsome :)
On Tue, 28 Jul 2020, Safa Alfulaij wrote:
You are awsome man! You are the C++-tar!
I will provide you with some unit tests once I take a good look on (and check) the API and the whole class.
Thanks a lot! You are truly awsome :)
You're welcome ! I'm hoping you can propose some tests that use some "real" text that has several kinds of characters in them, not just plain placeholder ascii text. Thanks !
Andi..
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/ovalhub/pyicu/issues/134#issuecomment-665255716
I believe this is a good start (for now).
# ====================================================================
# Copyright (c) 2020 Open Source Applications Foundation.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
# ====================================================================
#
import sys, os
from unittest import TestCase, main
from icu import *
input_text = "\u0643\u0627\u0646 \u0057\u0069\u006e\u0064\u006f\u0077\u0073 \u0031\u0030 \u0623\u0648\u0644 \u0646\u0638\u0627\u0645 \u062a\u0634\u063a\u064a\u0644 \u0628\u062a\u0642\u0646\u064a\u0629 \u0034\u004b\u002e"
input_text_2 = "\u0643\u0627\u0646\u062a \u004d\u0069\u0063\u0072\u006f\u0073\u006f\u0066\u0074 \u0057\u0069\u006e\u0064\u006f\u0077\u0073 \u0627\u0644\u0645\u0627\u0644\u0643 \u0627\u0644\u0623\u0648\u062d\u062f \u0644\u0639\u0644\u0627\u0645\u0629 \u0058\u0050\u002e"
class TestBidi(TestCase):
def testDefault(self):
Locale.setDefault(Locale.getUS())
layout = Bidi()
layout.setPara(UnicodeString(input_text))
self.assertTrue( layout.getParaLevel() == 1 )
self.assertTrue( layout.getDirection() == UBiDiDirection.MIXED )
self.assertTrue( layout.getLevels() == (1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1) )
self.assertTrue( layout.getVisualRun(0) == (39, 1, 1) )
self.assertTrue( layout.getVisualRun(3) == (4, 10, 0) )
def testForcedDirection(self):
Locale.setDefault(Locale.getUS())
layout = Bidi()
layout.setPara(UnicodeString(input_text), 0)
self.assertTrue( layout.getParaLevel() == 0 )
self.assertTrue( layout.getDirection() == UBiDiDirection.MIXED )
self.assertTrue( layout.getLevels() == (1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 0, 0) )
self.assertTrue( layout.getVisualRun(0) == (0, 3, 1) )
self.assertTrue( layout.getVisualRun(3) == (15, 22, 1) )
def testLines(self):
Locale.setDefault(Locale.getUS())
layout = Bidi()
layout.setPara(UnicodeString(input_text_2))
line_layout = layout.setLine(0, 15)
self.assertTrue( line_layout.getLevels() == (1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1) )
self.assertTrue( line_layout.getVisualRun(0) == (14, 1, 1) )
self.assertTrue( line_layout.getVisualRun(2) == (0, 5, 1) )
line_layout_2 = layout.setLine(15, 47)
self.assertTrue( line_layout_2.getLevels() == (2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1) )
self.assertTrue( line_layout_2.getVisualRun(0) == (31, 1, 1) )
self.assertTrue( line_layout_2.getVisualRun(2) == (7, 22, 1) )
if __name__ == "__main__":
main()
And I tried using writeReverse
but couldn't. Checking the code I need to pass a str argument that will be discarded?
I tried this after setPara
at testLines
:
print(layout.writeReverse('', Bidi.DO_MIRRORING))
with no result...
On Thu, 30 Jul 2020, Safa Alfulaij wrote:
I believe this is a good start (for now).
... snip ...
Thank you, I integrated it into PyICU (not yet checked in).
And I tried using
writeReverse
but couldn't. Checking the code I need to pass a str argument that will be discarded?
writeReverse is a type method on the Bidi type, you can call it like that:
from icu import * Bidi.writeReverse("this is text") <UnicodeString: 'txet si siht'>
As a class method it doesn't need an instance or a para. It takes any kind of string, UnicodeString, python string, whatever that can be converted into a UnicodeString (like most PyICU wrappers taking a string)
I tried this after
setPara
attestLines
:print(layout.writeReverse('', Bidi.DO_MIRRORING))
with no result...
The inverse of an empty string might be just that.
Andi..
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/ovalhub/pyicu/issues/134#issuecomment-666642757
I now added and checked in your unit tests for the new Bidi wrappers. Thank you for the contribution !
On Fri, 31 Jul 2020, Safa Alfulaij wrote:
Closed #134.
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/ovalhub/pyicu/issues/134#event-3607614869
On Fri, 31 Jul 2020, Safa Alfulaij wrote:
Closed #134.
I also now added wrappers for the UBiDiTransform API.
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/ovalhub/pyicu/issues/134#event-3607614869
https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ubidi_8h.html It's massive, but maybe?