aisingapore / TagUI

Free RPA tool by AI Singapore
Apache License 2.0
5.65k stars 585 forks

Hello, may I ask if the desktop application can extract data? Thank you! - yes can #839

Closed pythonLiNM closed 3 years ago

kensoh commented 3 years ago

Yes, it is possible, but not directly. The general method is to use the read step with OCR to read from a region on the screen, or to double-click on the data and then use the keyboard step to copy the data to the clipboard, before accessing it with the clipboard() function. Can you share more about the automation scenario so that I can add more specific tips?

kensoh commented 3 years ago

Closing the issue for the time being, but please let me know if you have any input or questions.

darld commented 3 years ago

click 1.png
read (300,300)-(700,700) to region
region = clipboard()
show region

I am trying to use OCR to read a region of text and show it in the terminal. Are my steps correct?

kensoh commented 3 years ago

Hi @darld, you can do something like the below. After reading the OCR text into the region variable, you can use the echo step -

click 1.png
read (300,300)-(700,700) to region
echo `region`

darld commented 3 years ago

read (300,300)-(700,700) to region

My step freezes (for more than 10 minutes) when doing OCR, but I can hear GPU noise while the OCR is running. I have already installed SikuliX and OpenCV. May I know what's going on?

kensoh commented 3 years ago

Are you running on Linux OS? Can you check the tagui.log file in tagui\src\tagui.sikuli, to see if there's any error from the OCR provided by the SikuliX engine?
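As a rough illustration of that check, the Python sketch below filters a log for lines carrying the `[error]` marker. The helper name `find_error_lines` and the sample text are hypothetical; the sample only mirrors the log format seen in this thread, and reading the real tagui.log from the tagui.sikuli folder is left to the reader.

```python
def find_error_lines(log_text):
    """Return the lines of a log that contain the [error] marker."""
    return [line.strip() for line in log_text.splitlines() if "[error]" in line]


# Hypothetical sample in the same format as the SikuliX log for TagUI.
sample = """[tagui] START - listening for inputs
[tagui] INPUT - [1] read (160,200)-(380,300) to property_address
[error] script [ tagui ] stopped with error in line 582"""

for line in find_error_lines(sample):
    print(line)
```

To run this against a real log, the file contents could be passed in with something like `find_error_lines(open(path).read())`, where `path` points to wherever tagui.log lives on your machine.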

darld commented 3 years ago

[tagui] START - listening for inputs

[tagui] INPUT - [1] read (160,200)-(380,300) to property_address
[tagui] ACTION - read (160,200)-(380,300) to property_address
[error] script [ tagui ] stopped with error in line 582
[error] java.lang.UnsatisfiedLinkError ( java.lang.UnsatisfiedLinkError: Error looking up function 'TessPDFRendererCreateTextonly': /usr/lib/x86_64-linux-gnu/libtesseract.so.4.0.0: undefined symbol: TessPDFRendererCreateTextonly )
[error] --- Traceback --- error source first line: module ( function ) statement
354: main ( text_read ) temp_text = region_layer.text()
323: main ( read_intent ) return text_read(raw_intent)
520: main ( parse_intent ) return read_intent(script_line)
[error] --- Traceback --- end --------------
IDE terminated: returned: 1

Yes, I am running on Ubuntu 18.04.
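The log above reports an undefined symbol in the installed libtesseract. One way to confirm this independently of SikuliX is to load the shared library and look the symbol up directly. The sketch below is only an assumption-laden illustration using Python's standard `ctypes` module; the helper name `has_symbol` is hypothetical, and the library path and symbol name would be the ones shown in the log.

```python
import ctypes


def has_symbol(library_path, symbol):
    """Check whether a shared library exports a symbol.

    Returns True or False if the library loads, or None if it cannot be
    loaded at all (for example, the path does not exist).
    """
    try:
        lib = ctypes.CDLL(library_path)
    except OSError:
        return None
    # ctypes raises AttributeError on lookup of a missing symbol,
    # which hasattr turns into False.
    return hasattr(lib, symbol)
```

For the case in this thread, the call would be `has_symbol("/usr/lib/x86_64-linux-gnu/libtesseract.so.4.0.0", "TessPDFRendererCreateTextonly")`. A result of False would agree with the UnsatisfiedLinkError in the log, suggesting a mismatch between the installed Tesseract library and what the SikuliX jar expects.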

darld commented 3 years ago

Referring to https://github.com/kelaberetiv/TagUI/issues/783, I managed to get the OCR to work. OS: Ubuntu 18.04.5. Here is a short guide for users who have the same problem:

1. Download the SikuliX jar from https://launchpad.net/sikuli/sikulix/2.0.4/+download/sikulixapi-2.0.4.jar
2. Move it to the folder tagui/src/sikulix and replace the original "sikulix.jar".
3. If you haven't installed tesseract-ocr, a webpage will pop up automatically (https://github.com/RaiMan/SikuliX1/wiki/macOS-Linux:-Support-libraries-for-Tess4J-Tesseract-4-OCR), and you need to follow it to install Tesseract.
4. sudo apt-get install python3-opencv

@kensoh correct me if something is wrong

kensoh commented 3 years ago

Hi @darld, your steps above are for a newer SikuliX. Can you try the below to see if it works? From your error log above, it looks like the SikuliX engine and the OCR engine are not working correctly on your computer.

You can try following the steps below for the SikuliX version that TagUI is using - https://sikulix-2014.readthedocs.io/en/latest/newslinux.html#version-1-1-4-special-for-linux-people

Also, replace sikulix.jar with the one from the attached release, which is the version TagUI is developed to run on - https://github.com/kelaberetiv/TagUI/releases/download/v6.0.0/TagUI_Linux.zip