aisingapore / TagUI

Free RPA tool by AI Singapore
Apache License 2.0

SikuliX computer vision, OCR, low level keyboard mouse - KIV, less portable #928

Closed kensoh closed 3 years ago

kensoh commented 3 years ago

There is an existing issue, https://github.com/kelaberetiv/TagUI/issues/916, about a specific SikuliX setup question on Linux, so I raised an issue over at Raimund's project, https://github.com/RaiMan/SikuliX1/issues/417. Raimund has maintained SikuliX since quite early on; the project was originally created at MIT.

TagUI comes packaged with SikuliX for its computer vision (OpenCV), OCR (Tesseract), and low-level keyboard and mouse capabilities. Without the SikuliX project, TagUI would not be so powerful. Raimund asked how the integration works, so I created this issue to put all the information in one place.

Link to a video going through the topics below, with demos - https://www.youtube.com/watch?v=_b9YYH-zYCY

kensoh commented 3 years ago

Flow chart of TagUI architecture

flowchart

TagUI's SikuliX Jython script https://github.com/kelaberetiv/TagUI/blob/master/src/tagui.sikuli/tagui.py

Entry point to invoke SikuliX process

macOS/Linux - https://github.com/kelaberetiv/TagUI/blob/5ab3f7c9e0f8ab2c917f09e6a711f636785c6c25/src/tagui#L292
Windows - https://github.com/kelaberetiv/TagUI/blob/5ab3f7c9e0f8ab2c917f09e6a711f636785c6c25/src/tagui.cmd#L954

Packaging for distribution with TagUI

The following files come with the TagUI packaged zips here - https://tagui.readthedocs.io/en/latest/setup.html

tagui/src/sikulix/LICENSE.txt
tagui/src/sikulix/sikulix.jar (v1.1.4)
tagui/src/sikulix/jython-standalone-2.7.1.jar

Credits to SikuliX on TagUI project page https://github.com/kelaberetiv/TagUI#credits
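
To make the entry point above more concrete, here is a minimal sketch of how the packaged jar could be launched against the `tagui.sikuli` script folder. It is not copied from TagUI's launcher scripts: the function name `build_sikulix_command` is hypothetical, the paths assume the packaged layout listed above, and it assumes SikuliX's `-r` (run script) command-line option is what starts the script. The linked lines in `src/tagui` and `src/tagui.cmd` remain the actual reference.

```python
import os


def build_sikulix_command(tagui_src_dir, java="java"):
    """Build a command line that asks SikuliX to run the tagui.sikuli script.

    Assumptions (not taken from the TagUI launcher itself):
    - sikulix.jar sits in <src>/sikulix/, as in the packaged zips
    - the Jython script folder is <src>/tagui.sikuli
    - SikuliX is started with its -r (run script) option
    """
    jar = os.path.join(tagui_src_dir, "sikulix", "sikulix.jar")
    script = os.path.join(tagui_src_dir, "tagui.sikuli")
    return [java, "-jar", jar, "-r", script]


if __name__ == "__main__":
    cmd = build_sikulix_command(os.path.join("tagui", "src"))
    # Print rather than execute, so the sketch runs without Java installed.
    # Passing cmd to subprocess.Popen would start the SikuliX process.
    print(" ".join(cmd))
```

Returning the command as a list keeps it safe to pass to `subprocess` without shell quoting issues when the TagUI folder path contains spaces.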

kensoh commented 3 years ago

Adding on here and closing #905

OCR using the read step can give strange-looking characters like ï¬ and â. To do: check why the extended character set is part of the OCR engine in SikuliX, and perhaps use rule-based replacements to map commonly occurring OCR text back to the usual English characters.

For example, the image below -

ocr

is interpreted as follows -

Abstract. A purely peer-to-peer version of electronic cash would allow online
payments to be sent directly from one party to another without going through a
financial institution. Digital signatures provide part of the solution, but the main
beneï¬ts are lost if a trusted third party is still required to prevent double-spending.
We propose a solution to the double-spending problem using a peer-to-peer network.
The network timestamps transactions by hashing them into an ongoing chain of
hash-based proof-of-work, forming a record that cannot be changed without redoing
the proof-of-work. The longest chain not only serves as proof of the sequence of
events witnessed, but proof that it came from the largest pool of CPU power. As
long as a majority of CPU power is controlled by nodes that are not cooperating to
attack the network, theyâll generate the longest chain and outpace attackers. The
network itself requires minimal structure. Messages are broadcast on a best effort
basis, and nodes can leave and rejoin the network at will, accepting the longest
proof-of-woxk chain as proof of what happened while they were gone.
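
The rule-based replacement idea could be sketched roughly as follows. This is plain Python and has not been tried inside TagUI's Jython script; the names `OCR_REPLACEMENTS`, `repair_mojibake` and `normalize_ocr_text` are hypothetical. It also assumes that sequences like the ones above are UTF-8 bytes of typographic characters (the "fi" ligature, a curly apostrophe) that were displayed as Latin-1, which is a guess rather than something confirmed in SikuliX.

```python
# Hypothetical rule table: typographic characters an OCR engine may emit,
# mapped back to plain ASCII. Extend it as new cases show up.
OCR_REPLACEMENTS = {
    u"\ufb01": "fi",  # LATIN SMALL LIGATURE FI
    u"\ufb02": "fl",  # LATIN SMALL LIGATURE FL
    u"\u2018": "'",   # left single quotation mark
    u"\u2019": "'",   # right single quotation mark
    u"\u201c": '"',   # left double quotation mark
    u"\u201d": '"',   # right double quotation mark
}


def repair_mojibake(text):
    """Undo UTF-8 bytes that were decoded as Latin-1, if that is what happened."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake (or already correct); leave it unchanged.
        return text


def normalize_ocr_text(text):
    """Repair the assumed encoding mix-up, then apply the replacement rules."""
    text = repair_mojibake(text)
    for strange, plain in OCR_REPLACEMENTS.items():
        text = text.replace(strange, plain)
    return text


if __name__ == "__main__":
    print(normalize_ocr_text(u"bene\ufb01ts are lost"))
    print(normalize_ocr_text(u"they\u2019ll generate the longest chain"))
```

Genuine recognition errors such as proof-of-woxk are outside what a character table like this can fix; they would need dictionary-based or context-based correction.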

kensoh commented 3 years ago

The main gain of the newer SikuliX is its use of newer OpenCV and Tesseract versions - https://github.com/RaiMan/SikuliX1/wiki/About-actual-release-version

However, the newer SikuliX no longer supplies Tesseract on macOS, so it requires additional setup - https://github.com/RaiMan/SikuliX1/wiki/macOS-Linux:-Support-libraries-for-Tess4J-Tesseract-4-OCR

In the interest of portability and easier distribution, I will KIV (keep in view) migrating to the newer SikuliX for now.

If users run into OCR accuracy issues, I will look into them further when they are raised.