aisingapore / TagUI

Free RPA tool by AI Singapore
Apache License 2.0

Data Entry from XML and PDF/A-3 - may require using Python XML Parser #682

Closed mbsryu closed 4 years ago

mbsryu commented 4 years ago

I've been thinking about using TagUI to automate data entry into a legacy system (AS400/DB2) with a Windows GUI client front-end. The data that needs entering is presented as XML files generated by a data entry front-end or a DMS.

Despite TagUI being described as a web and desktop automation tool, would I be correct in understanding that the current iteration is already capable of taking an XML file, locating data in it using Xpath, and then translating them to keyboard entries into GUI fields?

I realise the traditional/better way to do this would be via back-end integration but for many reasons (cost/resources/changing requirements, etc) this is difficult.


On a separate but related note, I've recently been involved in writing a specification document (using PDF/A-3 with embedded XML metadata) for data interchange from external systems into a DMS used by a government agency. Being able to read XML attachments in PDF/A-3 files (as in ZUGFeRD) would be an interesting additional twist...

Many systems, such as those used for invoicing, data extraction, and document management, use XML output as a means of providing open-standards-based data interchange options. EDI is another, more traditional example. More recent examples involve hybrid interchange files being adopted as national standards for electronic invoicing. ZUGFeRD is one such example from the German and French governments. ZUGFeRD uses XML attachments in PDF/A-3 files to transmit invoicing data between entities so that the documents are both machine and human readable.

Support for specific variants of data interchange standards would be impractical and unnecessary but explicitly extending the use case scenarios to include reading from XML might be an idea. This is especially the case as document and invoice processing are areas where deployment of RPA tools would be particularly attractive.

kensoh commented 4 years ago

Thanks for raising this interesting scenario and sharing the context!

Despite TagUI being described as a web and desktop automation tool, would I be correct in understanding that the current iteration is already capable of taking an XML file, locating data in it using Xpath, and then translating them to keyboard entries into GUI fields?

For the above, I've not tried loading XML as a webpage and accessing it via XPath using TagUI. The design looks for a webpage URL to load (https:// and http://), and perhaps localhost. But it does not support loading by file://, so I'm not sure if that is workable out of the box. Even if a hack is done to allow loading file://, the XPath support is tested with webpages and has not been tested with XML files.

The JS engine is PhantomJS, so there isn't an XMLParser in its API. And because the engine is PhantomJS and not Node.js, access to XML parsing packages is not possible right now. One roundabout way is to use the Python integration to process the XML using a Python XML parsing package - https://github.com/kelaberetiv/TagUI#python-integration

Or if your scenario is quite straightforward and you are comfortable with Python, you can use a side project I did that has the UI automation features of TagUI to use Python XML parser directly - https://github.com/tebelorg/TagUI-Python
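
As a rough illustration of the Python route, the sketch below uses the standard library's xml.etree.ElementTree to pull a single value out of an XML document with an XPath-style expression. The XML layout, the sample value, and the helper name read_value are assumptions made for illustration; the structure of the actual file is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real file's structure is not shown in this thread.
SAMPLE_XML = """<?xml version="1.0"?>
<Case>
  <Header>
    <CaseNo>2020-00123</CaseNo>
  </Header>
</Case>"""


def read_value(xml_text, path):
    """Return the stripped text of the first element matching path, or None."""
    root = ET.fromstring(xml_text)
    node = root.find(path)
    if node is None or node.text is None:
        return None
    return node.text.strip()


# ".//CaseNo" finds the first CaseNo element at any depth below the root.
case_no = read_value(SAMPLE_XML, ".//CaseNo")
print(case_no)
```

To read from disk instead of a string, ET.parse(path).getroot() gives the same kind of root element. ElementTree only supports a limited subset of XPath, so more elaborate expressions may need a third-party parser.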

For entry into the legacy terminal app, a combination of visual automation, keyboard combinations and keyboard entries may be able to do the work.

kensoh commented 4 years ago

Lastly, in pure JavaScript, writing an XMLParser from scratch should be doable, but that is probably too much work if there is an easier alternative.
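
To give a feel for what a from-scratch approach involves, here is a deliberately naive sketch, written in Python rather than JavaScript to match the Python route suggested earlier. It only recognises leaf elements without attributes and ignores entities, CDATA, comments, and namespaces, which is a large part of why a full parser is much more work. The sample document and the helper name leaf_values are hypothetical.

```python
import re

# Matches only leaf elements without attributes, e.g. <Name01>Jane</Name01>.
LEAF = re.compile(r"<([A-Za-z_][\w.-]*)>([^<]*)</\1>")


def leaf_values(xml_text):
    """Map each leaf tag name to its text; later duplicates overwrite earlier ones."""
    return {tag: text.strip() for tag, text in LEAF.findall(xml_text)}


# Hypothetical flat document for illustration.
sample = "<Return><Tax_Year>2019</Tax_Year><Name01> Jane Doe </Name01></Return>"
values = leaf_values(sample)
print(values)
```

This is enough for flat, predictable files, but anything beyond that quickly calls for a real parser.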

mbsryu commented 4 years ago

I just tried running this Flow:

//XML Flow
keyboard [ctrl]a
keyboard [delete]
keyboard file:///C:/tagui/thfx.xml
keyboard [enter]
//read text
read CaseNo to ttxt
echo ttxt

In both PhantomJS and headless Chrome modes, as you noted, the flow fails because the file cannot be opened. The error is:

"C:\tagui\src>^Afile:C:/tagui/thfx.xml
The filename, directory name, or volume label syntax is incorrect."

The flow works in visible Chrome mode, where the read step references the item "CaseNo" by name rather than as an XPath expression.

kensoh commented 4 years ago

I see.. Interesting!

mbsryu commented 4 years ago

I find it curious that Chrome Headless is behaving differently from Chrome visible... Is it expected that Xpath would not work on Chrome visible mode?

kensoh commented 4 years ago

Yes, headless Chrome does not behave exactly like visible Chrome. There should be some differences beyond not rendering a browser window.

XPath works in visible Chrome, which is why for different TagUI steps you can use XPath to access the web elements.

mbsryu commented 4 years ago

So I've managed to get a prototype running that does the following:

  1. Opens an XML file in Chrome (visible),
  2. extracts data from it using read steps to variables,
  3. opens the target application,
  4. uses Visual Automation to find the fields in the GUI and
  5. type steps to enter the values.

Observations:

a. Selection of visual elements and the simulated mouse pointer movements feel slow. I could probably use keyboard shortcuts but would this speed things up?

b. Likewise the entry of text and/or variable values into the fields also appear to be slow. Is this meant to mimic human key entry? Is there a way to speed this up?

c. Is PhantomJS going to be deprecated? You mentioned Node.js above. How would this affect things in the future?

d. As mentioned in my first post in this thread, (continued/expanded) support for xml in this manner would be attractive if it doesn't require too much heavy lifting.

Flow

Any suggestions would be much appreciated.

`//XML Flow. Open with Chrome visible option.

//Using keys to enter file name/path into the browser
keyboard [ctrl]a
keyboard [delete]
keyboard file:C:/tagui/thfx.xml
keyboard [enter]

//Reading items in the file
read Tax_Year to taxyear
read TIN01 to tin
echo "Tax Year: "+taxyear
echo "TIN: " +tin
read Name01 to name
echo "Name: " +name
read S01C01L01 to S01C01L01
echo "Employer: " +S01C01L01
read S01C01L02 to S01C01L02
echo "Income: " +S01C01L02
read S01C01L03 to S01C01L03
echo "Director: " +S01C01L03
read S01C01L04 to S01C01L04
echo "Dir Fee: " +S01C01L04
read S01C01L05 to S01C01L05
echo "Other: " +S01C01L05
read S01C01L06 to S01C01L06
echo "Other Fee: " +S01C01L06

//Opening target application (fmp12 used for prototype)
keyboard [win]
keyboard RPA_Demo_FM.fmp12
keyboard [down]
keyboard [down]
keyboard [down]
keyboard [enter]
wait 2

//Entering data into the target app
click C:\tagui\TAX_RPA_Demo\rpademoimg\fmnew.png
type C:\tagui\TAX_RPA_Demo\rpademoimg\fmTaxYear.png as taxyear
type C:\tagui\TAX_RPA_Demo\rpademoimg\fmTIN01.png as tin
type C:\tagui\TAX_RPA_Demo\rpademoimg\fmName01.png as name
type C:\tagui\TAX_RPA_Demo\rpademoimg\S01C01L01.png as S01C01L01
type C:\tagui\TAX_RPA_Demo\rpademoimg\S01C01L02.png as S01C01L02
type C:\tagui\TAX_RPA_Demo\rpademoimg\S01C01L03.png as S01C01L03
type C:\tagui\TAX_RPA_Demo\rpademoimg\S01C01L04.png as S01C01L04
type C:\tagui\TAX_RPA_Demo\rpademoimg\S01C01L05.png as S01C01L05
type C:\tagui\TAX_RPA_Demo\rpademoimg\S01C01L06.png as S01C01L06
keyboard [enter]

`
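
For comparison, the reading part of the flow above could be sketched in Python with the standard library's xml.etree.ElementTree, in case the Python route is preferred over opening the file in Chrome. The element names and labels are taken from the flow; the XML layout, the sample values, and the helper name extract_fields are hypothetical, and only the first few fields are shown.

```python
import xml.etree.ElementTree as ET

# Element names and labels come from the flow; the XML layout is hypothetical.
FIELDS = {
    "Tax_Year": "Tax Year",
    "TIN01": "TIN",
    "Name01": "Name",
    "S01C01L01": "Employer",
    "S01C01L02": "Income",
}

# Made-up sample values for illustration only.
SAMPLE_XML = """<Return>
  <Tax_Year>2019</Tax_Year>
  <TIN01>123456789</TIN01>
  <Name01>Jane Doe</Name01>
  <S01C01L01>Example Co</S01C01L01>
  <S01C01L02>50000</S01C01L02>
</Return>"""


def extract_fields(xml_text, fields):
    """Return {element name: text} for each requested element; missing ones map to ''."""
    root = ET.fromstring(xml_text)
    result = {}
    for name in fields:
        node = root.find(".//" + name)
        result[name] = (node.text or "").strip() if node is not None else ""
    return result


values = extract_fields(SAMPLE_XML, FIELDS)
for name, label in FIELDS.items():
    print(label + ": " + values[name])
```

Keeping the element-to-label mapping in one dictionary means that adding a field is a one-line change, instead of a new read and echo pair.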

kensoh commented 4 years ago

Very nice, thanks for sharing!

a. Selection of visual elements and the simulated mouse pointer movements feel slow. I could probably use keyboard shortcuts but would this speed things up?

Yes, you can use keyboard shortcuts. You can also lower the scan_period = 0.5 setting in the tagui\src\tagui.sikuli\tagui.py file. The default communication interval from TagUI to SikuliX is 0.5 seconds, but it can be reduced.

b. Likewise the entry of text and/or variable values into the fields also appear to be slow. Is this meant to mimic human key entry? Is there a way to speed this up?

You can use js clipboard('some long text') to copy long text to the clipboard, and then use a keyboard step to paste it, which is faster than typing.

c. Is Phantom.JS (going to be ) deprecated? You mentioned Node.JS above. How would this affect things in the future?

The PhantomJS (PJS) maintainer has recently started actively developing the project. A migration to Node.js should be transparent to users. However, in v6 we are making a number of breaking changes to improve the user experience, and those will impact users. I'll post a new issue with the list of changes.

d. As mentioned in my first post in this thread, (continued/expanded) support for xml in this manner would be attractive if it doesn't require too much heavy lifting.

This is not easy to implement in the current architecture, as it would need a parser written from scratch. After a migration to Node.js, it may become viable by leveraging an npm XML parsing package.

kensoh commented 4 years ago

CC @siowyisheng

kensoh commented 4 years ago

Closing issue for now till further inputs

mbsryu commented 4 years ago

I'm having a little difficulty in using the js clipboard('some long text') suggestion you mentioned above.

I want to do something along the lines below:

read Name01 to name
js clipboard(name) 
keyboard [tab]
keyboard [ctrl]v

Is it possible to use variables in place of text?

kensoh commented 4 years ago

Yes, it's possible. Something like the flow below works on Mac (make sure Chrome is in the foreground so that the text gets pasted into the browser and not into the OS or some other window):

https://ca.yahoo.com
type search-box as github
read search-box to query
js clipboard(query)
click chrome_icon.png
keyboard [cmd]v