tebelorg / RPA-Python

Python package for doing RPA
Apache License 2.0
4.93k stars 666 forks

Use in air-gap environments with network restriction and no internet [done] #36

Closed ycx91 closed 4 years ago

ycx91 commented 5 years ago

I now understand from tagui.py that the download happens in setup():

def setup():
    """function to setup TagUI to user home folder on Linux / macOS / Windows"""

    # get user home folder location to setup tagui
    if platform.system() == 'Windows':
        home_directory = os.environ['APPDATA']

Would it be good to change the code so that it checks whether the tagui folder exists and uses the existing folder rather than downloading a new one?

I ask because I'm working in an environment where downloading from a remote host is not allowed (yay web isolation, no curl either). From inspecting the code, I know that this folder is required, so I've downloaded it manually from https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Windows.zip

kensoh commented 5 years ago

Thanks for sharing this use scenario!

Actually, init() already checks for an existing installation, and setup() is called to download TagUI only if an existing installation is not found.

# if tagui executable is not found, initiate setup() to install tagui
if not os.path.isfile(tagui_executable):

However, besides that, _tagui_delta() will be called to sync updated TagUI files through the internet from a GitHub https:// endpoint, for the particular TagUI for Python version. This will be called once whenever you install a new version of TagUI for Python.

For Windows, there is also automated downloading of vcredist_x86.exe, a supporting library from Microsoft, if it is not already installed on the computer. For macOS, there is automated installation of Python 3 SSL certs and OpenSSL.

I'm very interested in supporting usage of TagUI for Python in air-gapped environments or local installations without internet access. I'll think through what changes can be made to the package to reduce user friction in such environments.

In the meantime, do post here more of the restrictions, constraints and what is possible in your implementation environment, so that I can figure out which areas can be optimised without increasing user friction for the primary use case of most users with internet access on their laptops.
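To make the idea of "use the existing folder rather than download" concrete, here is a minimal sketch. It is not the package's actual code: tagui_home() and needs_download() are hypothetical names, the folder locations are the ones mentioned in this thread (%APPDATA%\tagui on Windows, ~/.tagui on macOS / Linux), and it only checks that the folder exists and is not empty, whereas the real init() checks for the TagUI executable.

```python
import os
import platform


def tagui_home(system=None, env=None):
    """Return the folder expected to hold TagUI (hypothetical helper)."""
    system = system or platform.system()
    env = os.environ if env is None else env
    if system == 'Windows':
        # %APPDATA%\tagui on Windows
        return os.path.join(env['APPDATA'], 'tagui')
    # ~/.tagui on macOS / Linux
    return os.path.join(os.path.expanduser('~'), '.tagui')


def needs_download(folder):
    """True when the folder is missing or empty, so nothing is set up yet."""
    return not (os.path.isdir(folder) and os.listdir(folder))
```

A caller could then skip any download when needs_download(tagui_home()) is False, and prompt for a manual copy of the files otherwise.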

ycx91 commented 5 years ago

Hi Ken, I've dropped you a message on LinkedIn regarding further collaboration. Ideally, we should be able to package the dependency files in one place that also allows manual setup/downloads, so that when init and setup hit the error, we can prompt the user to perform the installation manually. Would this be possible in the current setup, or is there a better approach?

bfgoh commented 5 years ago

Hi Ken and ycx91. I have actually messaged Ken on LinkedIn too, regarding installing TagUI on computers that are completely sealed off from the internet. I was wondering if it is also possible to install TagUI in %temp% on the Windows operating system. I was thinking of implementing RPA for some of the repeated operations in my company. However, due to strict IT policy, I have been unable to introduce TagUI to my company.

kensoh commented 5 years ago

The manual way will be for users to -

  1. download and unzip TagUI zip file for their OS https://github.com/tebelorg/Tump/releases
  2. download delta files from https://github.com/tebelorg/Tump/tree/master/TagUI-Python (delta files are those that have remarks 'stable delta files', including tagui.sikuli/tagui.py)
  3. download Java JDK for their OS from Oracle website and Windows PHP dependency
  4. put the files to folder %APPDATA%\tagui for Windows, ~/.tagui for macOS / Linux
  5. for macOS / Linux, chmod -R 755 on the ~/.tagui folder for execute permissions
  6. create a tagui_python_version file in above folder, eg tagui_python_1.9.1 for tracking delta
  7. copy tagui.py and license file to the folder where TagUI for Python will be used
  8. import tagui as t; t.init(); t.setup(); should work without problems
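
Steps 5 and 6 above could be scripted roughly as follows. This is a sketch under assumptions, not part of the package: finish_manual_setup() is a hypothetical helper, the marker file is assumed to be empty (the thread only gives its name, eg tagui_python_1.9.1), and the folder is assumed to be already populated by steps 1 to 4.

```python
import os
import platform


def finish_manual_setup(tagui_folder, version):
    """Apply steps 5 and 6 to an already populated TagUI folder (hypothetical helper)."""
    # step 6: marker file named after the package version, eg tagui_python_1.9.1
    # (assumption: an empty file is enough for tracking)
    marker = os.path.join(tagui_folder, 'tagui_python_' + version)
    open(marker, 'a').close()

    # step 5: rough equivalent of chmod -R 755, skipped on Windows
    if platform.system() != 'Windows':
        os.chmod(tagui_folder, 0o755)
        for root, dirs, files in os.walk(tagui_folder):
            for name in dirs + files:
                path = os.path.join(root, name)
                if not os.path.islink(path):  # leave symlinks untouched
                    os.chmod(path, 0o755)
    return marker
```

For example, finish_manual_setup(os.path.expanduser('~/.tagui'), '1.9.1') would be run once after copying the files into place on macOS / Linux.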

The question will be how much of the above steps to automate or have taken care of by the package. Beyond a certain point, the setup can't be automated, for example downloading the appropriate Java JDK.

What will be good to know is, can you guys share how you normally copy files to laptops in your restricted environment without internet? What OS do you use in such environments? This will be helpful to shape the approach to make TagUI for Python work offline, including installing without using pip.

ycx91 commented 5 years ago

Thanks for the above steps. I'll try that out soon.
As you might have already guessed, the environment I'm working on is not fully air-gapped. I am still able to download files but certain sites are restricted and scripts that directly try to do a connection without going through web isolation browsers are forcibly closed.
This means that I am unable to pip install directly but I can pip install from pypi.org by downloading the package and its dependencies first into a folder and then pip installing from that folder.
The typical OS used is Windows.
As for JDK, as it requires admin access to install, we have a list of pre-approved software that can be installed from a central repo.
So generally speaking, anything that requires admin access is a no-go.
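
The "download first, then pip install from a folder" workflow described above can be sketched with standard pip options (pip download --dest, and pip install --no-index --find-links). The two helper function names are hypothetical and only build the commands; the package name and folder are whatever applies in your environment.

```python
import sys


def pip_download_command(package, folder):
    """Command for a machine that can reach pypi.org: fetch the package and its dependencies."""
    return [sys.executable, '-m', 'pip', 'download', package, '--dest', folder]


def pip_offline_install_command(package, folder):
    """Command for the restricted machine: install using only files in the local folder."""
    return [sys.executable, '-m', 'pip', 'install', '--no-index',
            '--find-links', folder, package]
```

Each list can be passed to subprocess.check_call(). Note that pip download picks files for the platform and Python version of the machine it runs on, so the two machines should be similar, or the package and its dependencies should be pure Python.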

kensoh commented 5 years ago

I've tried doing the above steps on my macOS and it works without internet.

Refreshing feeling to switch off laptop wifi for the first time in many years.

For me, the high user friction parts will be steps 2 and 3. The delta files are not conveniently downloadable unless you download the whole tebelorg/Tump repo, which is large. Otherwise you have to click to view each raw file and copy it out. Step 3, going to the Oracle website with so many options and possibly needing to create an ID to download, is troublesome but can't be avoided.

Will look out for @ycx91 and @bfgoh inputs on where the user friction is in setting up, so the package can be refined accordingly to minimise user friction in setting up for usage in air-gap envs (without compromising or inconveniencing the primary users with internet, of course).

kensoh commented 5 years ago

Adding on to above comment, @bfgoh setting up TagUI in %temp% folder for Windows is not reliable. Because Windows will periodically delete files from the temp folder and some files in tagui folders will be deleted randomly. That will cause TagUI operation to fail randomly. Not deterministic. Not good.

alexdyysp commented 5 years ago

I believe this is an offline init problem for tagui. I have tried many ways, including overwriting the download function to stop it downloading and just unzip a local file. But it is too much work, because running t.init() downloads many files. So I gave up; maybe it is impossible for me to use it in my work environment. What's more, I think the problem is serious. Many company computer environments do not allow downloads in the background, but I have to develop RPA in the same environment as my company so that I can easily share a ready-to-run rpa.exe with my colleagues. In my opinion, the offline init problem needs to be solved if TagUI wants to be more popular than other RPA development tools.

kensoh commented 5 years ago

How do you normally install programs and their files into laptops in your company network?

TagUI for Python tries to automate the process of downloading the required files instead of users manually downloading and doing the setup. Assuming there is a way to copy files (eg through thumbdrive or some medium), does the following work for your environments?

  1. pip install tagui on normal computer with internet, run import tagui as t; t.init() (after this step there will be a ready-to-use folder for TagUI program files to be distributed)

  2. copy folder to company computer - Windows %APPDATA%\tagui, macOS / Linux ~/.tagui

  3. copy tagui.py & LICENSE.txt to where you want to use the tool (assuming pip can't be used)

This will be the easiest way to use in air-gap environments: use the existing TagUI for Python t.init() (which calls t.setup()) to prepare the files on a computer with internet access, then copy the folder over to the computer with restricted network access.

If above works for all the restricted network use cases for 3 of you, then I can create something like t.offline(). Which after running on a computer with internet access, will create a folder or zip folder that you can copy to your restricted computer to use.
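
As a rough illustration of the "create a folder or zip to copy over" idea, the sketch below zips an already prepared TagUI folder and unzips it on the other side, using only the standard library. It is not the package's implementation; zip_tagui_folder() and unzip_tagui_folder() are hypothetical names.

```python
import os
import shutil


def zip_tagui_folder(tagui_folder, output_base):
    """Zip a ready-to-use TagUI folder; returns the path of the created .zip file."""
    parent, name = os.path.split(os.path.abspath(tagui_folder))
    # keep the top-level folder name inside the archive so unzipping recreates it
    return shutil.make_archive(output_base, 'zip', root_dir=parent, base_dir=name)


def unzip_tagui_folder(zip_path, target_parent):
    """Unzip the archive under target_parent, eg %APPDATA% or the home folder."""
    shutil.unpack_archive(zip_path, target_parent, 'zip')
```

On macOS / Linux, unpacking a zip this way may not restore execute permissions, so the chmod from the manual steps would likely still be needed after unzipping.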

alexdyysp commented 5 years ago

How do you normally install programs and their files into laptops in your company network?

The company has its own IT center, like a cloud center, and we can only install software that it has stored. It is a way to keep company data and the local area network safe, especially in data-sensitive companies like mine. Moreover, your main customers are in data-sensitive industries, such as finance.

So I think it is necessary for you to add an .offline() function. Both my boss and I think your tagui is cool and full-featured.

kensoh commented 5 years ago

Thanks for your feedback! Oh I don't have customers, TagUI and TagUI for Python are both personal projects that I started to make it easier to do digital process automation. So there is no cost to use or unlock full-features. TagUI is now maintained by AI Singapore, while I maintain TagUI for Python. But yes, a lot of RPA use cases will be in banking & finance.

Hi @ycx91 and @bfgoh , let me know if above method (https://github.com/tebelorg/TagUI-Python/issues/36#issuecomment-515378131) works. Then I'll create a new function, maybe call it t.offline() or something. This function will generate a folder or zip file on your computer with internet access. This folder / zip file can then be copied to your company restricted network laptop to use TagUI for Python.

kensoh commented 5 years ago

Adding on that I'm pending inputs from @ycx91 and @bfgoh on whether the 3 steps in https://github.com/tebelorg/TagUI-Python/issues/36#issuecomment-515378131 work for your environments. As I can't replicate your air-gap environments to test, I only tested by turning off wifi. But I'll need confirmation that the steps work in your environments before I publish a new release having logic that has not been tested to work correctly.

kensoh commented 5 years ago

Nice seeing you Stamford at Python meetup yesterday! Gotta leave to take care of baby right after that didn't get a chance to talk further with you. (for privacy, no need to reply)

kensoh commented 5 years ago

Made commit above to add link to license in tagui.py, as part of facilitating air-gap deployments (without internet access to pip).

This will make step 3 above easier by just needing to copy tagui.py - https://github.com/tebelorg/TagUI-Python/issues/36#issuecomment-515378131

kensoh commented 5 years ago

Added to readme a link to this issue. Making deployments in air-gapped environments (no internet access) have almost 0 user friction is a major goal for this project.

The other major goal is attracting developers in other programming languages to port this project to their favourite language (~1k lines of code with many helpful comments).

Current iteration of best deployment method -

  1. pip install tagui on computer with internet
  2. import tagui as t; t.pack() in Python env
  3. copy tagui.py & tagui_python.zip to target PC
  4. import tagui as t; t.unpack() to deploy

I'll take care of the magic and the implementation details. I want to find out, based on your environment restrictions, whether the above plan is the best way to deploy. Or is there another best practice your team uses to make such deployments as pain-free as possible?

alexdyysp commented 5 years ago

Great API, I will try it in my company environment and tell you the good news later :)

kensoh commented 5 years ago

Thanks @dyywinner! Let me know, if above is a good idea to deploy on environments without internet, or is there an easier way, then I'll go code the pack() and unpack() from feedback.

alexdyysp commented 5 years ago

Sorry, I cannot find t.pack(), so I guess you are about to code it... Also, I cannot find tagui.py & tagui_python.zip on my computer, so I cannot copy them. If possible, could you give me a way to contact you personally, Facebook or WeChat?

kensoh commented 5 years ago

Yes, I haven't coded it yet. I'm trying to get feedback on the best practice for deploying software to air-gapped environments without internet. It seems like the below should be the easiest for users. Let me see if I can work on this over the weekend!

  1. import tagui as t; t.pack() on computer with internet
  2. copy tagui.py & tagui_python.zip to the target computer
  3. import tagui as t; t.unpack() to deploy TagUI for Python

I don't have wechat and don't use facebook, but you can add me on linkedin if you use it - https://www.linkedin.com/in/kensoh

kensoh commented 5 years ago

Note - if you cannot or do not want to use pack(), you can instead manually download dependencies with these steps


Ok implemented in v1.14! Available with pip install rpa --upgrade

To deploy on computers without access to internet -

  1. import rpa as r; r.pack() on computer with internet
  2. copy rpa.py and rpa_python.zip to the target computer
  3. import rpa as r; r.init() to deploy and use RPA for Python

This feature is still in beta (then in 2019), I'm open to hearing constraints faced by users so that I can improve the feature to work on their environments as best as possible. The goal of this feature is to make deployment as zero friction as possible on air-gapped computers without internet access.

There may be some OS-specific dependencies that need separate installation, for example OpenSSL on macOS, MSVCR110.dll on Windows, or PHP on Ubuntu. If users can suggest ways to install them offline in their environments, I can also look into automating that setup.

Note - after deploying to the target computer, if you want to update RPA for Python to a newer version in future, you can use r.update() on computer with internet. It will generate an update.py which you can copy or email to the target computer to perform the update. The benefit of using update() is the generated file is ~100kb instead of using pack() to move a 150MB file over, which is required during the initial installation. More on update() here.
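
Before running step 3 on the target computer, it can help to confirm that both files from step 2 actually arrived in the working folder. A minimal sketch, where missing_offline_files() is a hypothetical helper and not part of the package:

```python
import os

# file names taken from step 2 of the deployment steps above
REQUIRED_FILES = ('rpa.py', 'rpa_python.zip')


def missing_offline_files(folder='.'):
    """Return the required file names that are not present in the folder."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(folder, name))]
```

If the returned list is empty, import rpa as r; r.init() can be tried from that folder; otherwise the list shows which file still needs to be copied over.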

kensoh commented 4 years ago

Adding the above commit, which includes vcredist_x86.exe as part of pack() to streamline deployment on Windows without internet. This lets users enjoy automated setup on the air-gapped computer without having to download vcredist_x86.exe elsewhere and transfer it to that computer to install.

This is now available in RPA for Python v1.21 with pip install rpa --upgrade

kensoh commented 4 years ago

Closing issue as there are no issues reported for the past 3 months since this new feature.

ZixunWang commented 4 years ago

Hi, I have met a problem. When I run the following code, it comes up with an error.

import rpa as r
r.init(visual_automation = True)
r.type(600,300,'open source')
r.click(900,300)

The error looks like this:

[RPA][INFO] - setting up TagUI for use in your Python environment
[RPA][INFO] - downloading TagUI (~200MB) and unzipping to below folder...
[RPA][INFO] - C:\Users\dell\AppData\Roaming
[RPA][ERROR] - failed downloading from https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Windows.zip...
<urlopen error [WinError 10060] The connection attempt failed because the connected party did not properly respond after a period of time, or the connected host did not respond.>
[RPA][ERROR] - use init() before using type()
[RPA][ERROR] - use init() before using click()

As you can see, I am Chinese, which means I am not able to reach some websites in other countries, like 'https://www.google.com'. I wonder if that is the problem?

kensoh commented 4 years ago

Hi Crabby, thanks for posting this issue. Are you using your work laptop or personal laptop? The package will automatically download required files hosted on GitHub. For example, https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Windows.zip

Are you able to try on your personal laptop or use VPN to see if your network can have access to let the package download files from GitHub?

kensoh commented 4 years ago

As long as your personal laptop or VPN can have network access for the package to download files, you can use r.pack() function to create a zip file for you to copy to another computer to use there without Internet access.

ZixunWang commented 4 years ago

Thanks for your response! I am using my personal laptop. And I see that I actually cannot access https://github.com/tebelorg/Tump/releases/download/v1.0.0/TagUI_Windows.zip

kensoh commented 4 years ago

I see. I'm afraid the dependency files are hosted on GitHub, so if there is no way for you to access the files, the package can't be used. Can you check with your friends, or try another laptop or a VPN, to see if you can get access to these files? On a laptop with access, you can use r.pack() to create a zip to copy to your computer to use.

How do you and your friends or colleagues share large files to each other through internet?
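
A quick way to tell whether a given laptop or VPN can reach the host at all is a plain TCP connection test with the standard library. This is only a sketch: can_connect() is a hypothetical helper, and a successful connection to port 443 does not guarantee the zip download itself will work, since the download may be served from other hosts.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers timeouts, refused connections and failed name lookups
        return False
```

For example, can_connect('github.com') returning False on every network you try would suggest asking someone with access to run r.pack() for you.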

soorejmg commented 4 years ago

Hi Ken, We get this error when trying to pip install. Is it not possible to deploy in conda-forge channel as well?

WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', NewConnectionError('<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x000001DA6F016100>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))': /simple/rpa/

Thanks Soorej

kensoh commented 4 years ago

Hi Soorej,

From the message, it looks like there is a firewall or connection issue when you try to use pip to install the package. That has to be fixed on your computer network, by asking your IT administrator to open up access to pypi.org and to where it hosts package files. That way, you have a permanent fix for all Python packages.

But if you are only concerned about RPA for Python, then you can use the pack() and update() functions to install this package without internet access and bypass your network firewall restriction - https://github.com/tebelorg/RPA-Python#core-functions

Copy the file to the same folder as where you launch jupyter notebook so that the working directory is where the file is, for you to do import and use.

beAngler commented 4 years ago

The manual way will be for users to -

  1. download and unzip TagUI zip file for their OS https://github.com/tebelorg/Tump/releases
  2. download delta files from https://github.com/tebelorg/Tump/tree/master/TagUI-Python (delta files are those that have remarks 'stable delta files', including tagui.sikuli/tagui.py)
  3. download Java JDK for their OS from Oracle website and Windows PHP dependency
  4. put the files to folder %APPDATA%\tagui for Windows, ~/.tagui for macOS / Linux
  5. for macOS / Linux, chmod -R 755 on the ~/.tagui folder for execute permissions
  6. create a tagui_python_version file in above folder, eg tagui_python_1.9.1 for tracking delta
  7. copy tagui.py and license file to the folder where TagUI for Python will be used
  8. import tagui as t; t.init(); t.setup(); should work without problems

The question will be how much of the above steps to automate or have taken care of by the package. Beyond a certain point, the setup can't be automated, for example downloading the appropriate Java JDK.

What will be good to know is, can you guys share how you normally copy files to laptops in your restricted environment without internet? What OS do you use in such environments? This will be helpful to shape the approach to make TagUI for Python work offline, including installing without using pip.

Hi Ken, we hit a problem when running 'sample.py'. It just opens a blank page in the Chrome browser and does not go on to the next step. When I interrupt the process from the keyboard, the following appears:

Traceback (most recent call last):
  File "/Users/hm/develop/git/RPA-Python/sample.py", line 11, in <module>
    r.init(visual_automation = False, chrome_browser = True)
  File "/Users/hm/develop/git/RPA-Python/tagui.py", line 550, in init
    tagui_out = _tagui_read()
  File "/Users/hm/develop/git/RPA-Python/tagui.py", line 118, in _tagui_read
    global _process; return _py23_decode(_process.stdout.readline())
KeyboardInterrupt

Process finished with exit code 130 (interrupted by signal 2: SIGINT)

Thanks hm

beAngler commented 4 years ago
jituyadav47 commented 3 years ago

I have used the RPA Python package on Ubuntu 18 LTS but I am facing an issue when calling this code:

import rpa as r
r.init() # it gets stuck here, with no progress and no error shown, please help me

I have installed Python 3.6.9, Java 8 and PHP on my Ubuntu Linux environment.
@kensoh please respond ASAP

fisher8962 commented 2 years ago

What does "the folder where TagUI for Python will be used" mean? Thanks.

kensoh commented 2 years ago

You can use in any folder. Just make sure the rpa.py file is in the same folder as your python script.

kensoh commented 2 years ago

@beAngler and @jituyadav47 sorry I just saw your posts! Did you manage to solve the problems already?