jaheyns / CfdOF

Computational Fluid Dynamics (CFD) for FreeCAD based on OpenFOAM solver
GNU Lesser General Public License v3.0

Proposed modifications to CfdOF to utilize a remote computer for meshing and simulation tasks. #70

Closed linuxguy123 closed 1 year ago

linuxguy123 commented 1 year ago

I'm thinking of modifying CfdOF so that the user can run meshes and simulations locally or on a remote computer. I'm looking for input before I do anything.

Generating large meshes and simulating them can take considerable time on even the fastest desktop CPU, in some cases hours or even days. Meanwhile the workstation is tied up and often not very responsive because the CPU core loads are high and the memory bandwidth is almost entirely consumed. The storage device will also be accessed frequently, further slowing down the local machine.

A better way to handle large meshing and simulation tasks is to offload them to a second (remote) computer, leaving the workstation almost unhindered while they run. Furthermore the remote computer can be optimized to run mesh and simulation loads much faster than a desktop machine might be.

I presently use a remote computer to generate meshes and run OpenFOAM simulations. I do this by editing the mesh and OF case files on my workstation, then copying them to the remote machine, logging into the remote machine with ssh and then running the tasks in the ssh session. I'm doing most of this without CfdOF, manually.

While this approach works, it is cumbersome. One is always moving case files back and forth between the workstation and the remote machine as well as the results from the remote machine back to the workstation. More than once I thought I was looking at a mesh or simulation result that just ran when, in fact, it wasn't the current output. Ugghhh.

I'd like to streamline the use of a remote computer in CfdOF, so that it happens automagically from the CfdOF GUI.

I envision doing this as follows:

1) Add a Remote tab to the CfdOF Preferences GUI for the remote connection parameters and set up. The Remote tab would have a checkbox to allow the use of a remote computer and fields for the remote hostname and username on the remote computer, as well as a way to test the connection, i.e. can one successfully invoke an ssh session with the remote computer ?

The Remote tab will also allow setting up OpenFOAM, gmsh, cfmesh and Hisa on the remote machine, just as how it happens on the local computer right now.

I want the remote computer to be set up exactly as the local computer is set up as far as the tools go. That way anything that can be done in a worker thread on the local machine could also be done in a worker thread on the remote machine in a ssh session. With a little tweaking, of course.

For now I will not handle setting up Docker on the remote machine, just because I am not terribly familiar with Docker. I have used virtual environments, but not Docker. I may ask for assistance on this part of the task or leave it unimplemented.

2) In the CFD Mesh GUI I will change the "Write mesh case" button to "Write mesh case locally" and add a "Write mesh case remotely" button. The former will do as it does now. The latter will write the mesh case to the working directory on the remote computer.

Likewise the Check Mesh button will also have a local and remote version.

I will also change the "Run mesher" button to "Run mesher locally" and add a "Run mesher remotely" button. The former will do as it does now. The latter will mesh the mesh case that has been written on the remote machine and display the progress results in the Status box, just as it does now.

Having local and remote versions of these buttons will allow the user to use CfdOF exactly as it is presently used or with a remote computer as the user decides when working on the project. If the remote machine is busy running a mesh or simulation for another FreeCAD instance, the user may want to run meshes or simulations locally. Having both versions of the buttons allows the user the option of doing either.

Upon completion of the remote meshing task, the results will automatically be copied back to the local machine so they can be viewed (Load surface mesh) just as if they were meshed locally.

3) In the CFD Solver GUI the Write button will have a local and remote version, just as the Mesh GUI did.

I haven't figured out how to elegantly handle editing the remote files. My best solution so far is to write the local files, allow the user to edit them locally and then have a "Copy to remote" button that will copy the entire (edited) case to the remote machine.

The Run button will have a local and remote version. The results of the remote run will display in the Status box and the Reporting Functions just as they do now.

Upon completion of a remote run the results will be copied back to the local machine, just as if it ran locally. Thus the Paraview button will work exactly as it does now, by using Paraview locally on locally stored results.

The availability of all the remote buttons will be controlled by the "Allow remote computations" checkbox on the Remote tab. If the checkbox isn't checked, all the remote buttons will be disabled, leaving the user only the local buttons, so CfdOF will behave exactly as it does now.

SSH will require a password every time a new session is invoked. I'll ask the local user for the remote user's password the first time a remote operation (via ssh) is needed, but not store it. The ssh session should stay available forever unless something happens to one of the computers, ie gets powered down or goes to sleep (timeout). I'll check that the ssh session is still alive every time before using it. If the ssh session has ended, I'll invoke a new one, in which case I'll have to ask the user for the password again.

I also want to (eventually) handle the case of the workstation going to sleep, or being shut down or the user quitting FreeCAD while a remote task is running. I think I can do this by opening a new ssh connection when CfdOF is restarted and reattaching to the existing task. I'm not sure how I'll handle the data that will be missing while the connection is lost. I may have to buffer it on the remote side and redisplay it on the local side when the connection is resumed.

It seems like a lot of work to allow the workstation FreeCAD session to disconnect and resume, but it allows the user the flexibility to reboot the workstation, shut it off overnight, etc., while a task is running on the remote machine.

I am not calling the local machine a host and the remote machine a server because this isn't a true client/server relationship in the traditional sense. Right now I'm using the terminology of local and remote. I'm open to suggestions on the terminology aspect of it.

Theoretically the remote machine will allow multiple FreeCAD "clients" to connect to it, as long as each FreeCAD connection uses a different directory to store the results in. By allocating each client a subset of the total number of cores, each may have a decent experience, ie realtime but slower compared to having exclusive access to the remote machine. For now the allocation of cores will be done manually by the local user, not automatically.

The highest incarnation of using a remote computer for CfdOF processing would be the use of an AWS server or similar. I'm not saying that the work I would do would achieve this. But I do think it would be a step in the right direction. Right now it is quite cumbersome to use an AWS server for OpenFOAM work, for the same reasons I experienced and mentioned. If the remote functionality of the proposed enhancement were done right it could really streamline working with an AWS server.

Not everyone can afford a dedicated EPYC server beside their desk to run their OpenFOAM tasks. But just about everyone can afford to buy some time on an AWS server or make use of a virtual instance granted to them on a server at their local university. Adding remote processing capabilities to CfdOF would really streamline this workflow.

One of the advantages of using an ssh session to do the work on the remote computer is that just about every OS available already has ssh on it. While we could write a special server to run on the remote computer to interface with CfdOF, it would require special setup on the remote machine and I don't see that it would add a lot of value. If one uses ssh sessions, all one needs to do is ensure that ssh is set up on the remote computer.

I'm hoping to add my code to the existing worker thread routines, adding cases for the remote functionality. The alternative would be to duplicate the existing worker thread routines and create local and remote versions of each. I think there will be a lot of code commonality and having duplicates of each would increase maintenance effort in the future. However, I understand that there is a case for leaving the present routines untouched and developing stand alone remote routines. I await your input on this.

At this point I'm doing this just for me to test the concept. These changes may never be published if I find it too cumbersome in use. I'm asking for input now so that if remote processing does make it to the public that what I'm doing has a good chance of being accepted without making a lot of changes. I'm also looking for better ideas on how it might be done.

Thanks for your attention. Thoughts ? Ideas ? Advice ? Feedback ?

luzpaz commented 1 year ago

CC @howetuft (RenderWB dev) may be interested in this idea as well

linuxguy123 commented 1 year ago

BTW, I don't do Windows. I'll test remote processing on Linux workstations and remote computers but I won't do any testing on Windows devices. I don't even own a Windows machine. Someone else will have to test and troubleshoot on Windows and macOS.

howetuft commented 1 year ago

Hello, Yes, it could be inspiring for the rendering of some very complex scenes (although I think that, for Render WB, these will be borderline cases). Good luck, I will follow the code propositions with attention!

linuxguy123 commented 1 year ago

I've cloned the current CfdOF Master onto my computer. I have not created a fork.

Unless someone tells me otherwise, I'll work on things locally before sharing them publicly.

linuxguy123 commented 1 year ago

> Yes, it could be inspiring for the rendering of some very complex scenes (although I think that, for Render WB, these will be borderline cases). Good luck, I will follow the code propositions with attention!

How is rendering related to meshing and OpenFOAM ? You'd like to develop something similar (remote processing) for the Rendering WB ?

howetuft commented 1 year ago

Yes, remote processing could be a nice-to-have for rendering, as rendering can be quite CPU/GPU consuming, especially with recent emerging needs in terms of animation (see Movie workbench). Therefore if you explore solutions to export the load on remote workstations in FreeCAD context, that might be interesting.

linuxguy123 commented 1 year ago

I think I will use Paramiko for the SSH connection.

https://www.devdungeon.com/content/python-ssh-tutorial https://www.geeksforgeeks.org/how-to-execute-shell-commands-in-a-remote-machine-using-python-paramiko/

Ignore the part about having to generate the ssh key. We'll have the user establish an ssh connection with the remote computer from the local computer before using ssh from CfdOF. On the first connection, ssh records the remote host's key in known_hosts once the user confirms it. After that, the ssh connection can be opened with username and password only.

linuxguy123 commented 1 year ago

> Yes, remote processing could be a nice-to-have for rendering, as rendering can be quite CPU/GPU consuming, especially with recent emerging needs in terms of animation (see Movie workbench). Therefore if you explore solutions to export the load on remote workstations in FreeCAD context, that might be interesting.

Interesting.

What is really interesting to me is that "nobody" has done this already for the rendering crowd. I'm not a rendering guy. But I know that rendering is CPU intensive as evidenced by the plethora of Blender benchmarks. I just assumed that an application like Blender would have the ability to offload some of the workload to another machine or a rendering cluster.

OpenFOAM is a great tool. But as a stand alone application, it requires a lot of patience to use. The worst part, for me, is setting up the mesh case.

CfdOF changes all that. It really streamlines the OpenFOAM workflow. Everything gets done in one place in one application with a good GUI. Object-> mesh->simulation->visualizing results. CfdOF even displays residuals as the simulation is running and has a couple reporting functions. Of course one will still use Paraview to get into the nitty gritty of the results, but CfdOF handles the processing aspect really well.

Unfortunately, serious CFD work is often computationally intensive. While a lot of work can be done on a workstation, there reaches a point where it is beneficial to use remote processing to get the job done. As of now, https://cfd.direct/cloud/remote-cfd-openfoam/ is the only remote processing UI that I am aware of.

IMHO the next step in streamlining the OpenFOAM experience is to enable remote processing in CfdOF.

linuxguy123 commented 1 year ago

I'll use rsync to move the files between the local and remote computer. https://www.atlantic.net/vps-hosting/how-to-use-rsync-copy-sync-files-servers/

Nothing other than ssh needs to be set up on either machine for it to work.
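
As a rough illustration of the copy step, here is a minimal sketch of how the rsync calls could be assembled from Python. The function names, host and paths are hypothetical and not part of CfdOF; `-a`, `-z`, `--delete` and `-e ssh` are standard rsync options. It assumes rsync is installed on both machines and that ssh login already works.

```python
import subprocess


def build_rsync_command(src, dest, delete=False):
    """Return an rsync argument list that mirrors src into dest over ssh."""
    cmd = ["rsync", "-az", "-e", "ssh"]
    if delete:
        # Remove files on the destination that no longer exist in the source
        cmd.append("--delete")
    # A trailing slash on the source means "copy the directory's contents"
    cmd += [src.rstrip("/") + "/", dest]
    return cmd


def push_case(local_dir, user, host, remote_dir):
    """Copy a locally written case to the remote machine (hypothetical helper)."""
    cmd = build_rsync_command(local_dir, f"{user}@{host}:{remote_dir}", delete=True)
    return subprocess.run(cmd, check=True)


def pull_results(user, host, remote_dir, local_dir):
    """Copy remote results back to the local case directory (hypothetical helper)."""
    cmd = build_rsync_command(f"{user}@{host}:{remote_dir}", local_dir)
    return subprocess.run(cmd, check=True)
```

Because rsync only transfers files that changed, calling `pull_results` after every run would also reduce the risk of viewing stale output on the workstation.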

Rsync may be problematic on Windows machines. https://www.ubackup.com/windows-10/rsync-windows-10-1021.html

In Windows, rsync will require the computer be using WSL. However OpenFOAM doesn't run on Windows natively either. It also requires WSL. So I don't see this being an issue for either the local computer or the remote computer as they'll both need WSL in order to run OpenFOAM.

FYI, OpenFOAM typically runs much slower under WSL than it does natively on Linux. That might be another reason for CfdOF users to want to do their processing on a remote computer running Linux instead of Windows.

As I said previously, I'm not a Windows guy.

luzpaz commented 1 year ago

@linuxguy123 What is the benefit of introducing paramiko as an extra 3rd party dependency ? (I'm asking because dependencies create more work for maintainers)

linuxguy123 commented 1 year ago

> @linuxguy123 What is the benefit of introducing paramiko as an extra 3rd party dependency ? (I'm asking because dependencies create more work for maintainers)

It essentially makes the ssh connection transparent by handling all the complexity. I could do the project without it.

I understand your concern. Let me test a homemade ssh session wrapper and I'll get back to you on that.

luzpaz commented 1 year ago

I mean, if you want to use it as a proof of concept and then later use a homemade ssh wrapper...that would be cool. The less external dependencies the better.

luzpaz commented 1 year ago

https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code

linuxguy123 commented 1 year ago

> https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#quoting-code

And when I use triple ticks it still interprets the # char and reformats the code. How do I avoid this ?

luzpaz commented 1 year ago

Test:

#32

linuxguy123 commented 1 year ago

> I mean, if you want to use it as a proof of concept and then later use a homemade ssh wrapper...that would be cool. The less external dependencies the better.

From what I understand, paramiko implements an ssh client for a python app to establish a permanent ssh session with the remote computer. One then sends commands to the ssh session to execute on the remote computer.

I do not think that functionality can be accomplished with os.system() or the subprocess module. I was hoping that I could run an ssh command with subprocess and it would remain persistent as long as the ssh command was running. I was hoping to interact with the ssh session via the subprocess's stdin, stdout and stderr.

However, it appears that the ssh process blocks subprocess from receiving or sending text to the remote computer.

Here is my test code.

#!/usr/bin/python
import subprocess
import os

# Say Hi.
print("\nStarting ssh-test.py")

#Test os.system locally
#print("Testing os.system locally")
#os.system('uname -a')

# Start an ssh session with the remote computer
# The remote computer is goliath
# The remote user is me

print('Starting ssh session with the remote computer')

# Approach 1
#subprocess.run(["/usr/bin/ssh","me@goliath"], input="password\n", text=True)
# result: Pseudo-terminal will not be allocated because stdin is not a terminal.

# Approach 1A
# subprocess.run(["/usr/bin/ssh","me@goliath"], input=b"password\n")
# result: same as above.

# Approach 2
#sshSession = subprocess.Popen(["/usr/bin/ssh me@goliath"],  # or ["/usr/bin/ssh", "me@goliath"] if shell=False
#                                stdin = subprocess.PIPE,
#                                stdout = subprocess.PIPE,
#                                stderr = subprocess.PIPE,
#                                shell=True) # also test with shell=False

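# Note: readlines() waits for end-of-file, so these two calls do not return until ssh exits.
#print ("stdout:", sshSession.stdout.readlines())
#print ("stderr:", sshSession.stderr.readlines())
#sshSession.stdin.write(b"uname -a\n")  # pipes are binary by default, so write bytes
#sshSession.stdin.flush()               # stdin is buffered; flush to actually send
#print (sshSession.stderr.readline())
#print (sshSession.stdout.readline())

#sshSession.stdin.write(b"password\n")
#sshSession.stdin.write(b"uname -a\n")
#sshSession.stdin.flush()
#print (sshSession.stderr.readline())
#print (sshSession.stdout.readline())
#result: ssh seems to block input from subprocess.stdin.write()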

# Approach 3
#sshSession = subprocess.Popen(['/usr/bin/ssh', "me@goliath"], stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
#reply = sshSession.communicate(b'password\n')[0] # all variants of this ignore password as an input to the ssh login.
#reply = sshSession.communicate(b'')[0]
#print(reply)
# result: this accepts the password from the command line.  Can't pass it into the subprocess ?  subprocess also quits as soon as the
# login is successful
# Note: ssh reads the password from the controlling terminal, not from stdin, and communicate() waits for the process to exit.

print("Exiting...")
print("Bye!\n")

Resources: https://docs.python.org/3/library/subprocess.html https://stackoverflow.com/questions/8980050/persistent-python-subprocess https://stackoverflow.com/questions/48752152/how-do-i-pass-a-string-in-to-subprocess-run-using-stdin-in-python-3 https://stackoverflow.com/questions/8475290/how-do-i-write-to-a-python-subprocess-stdin

https://www.bogotobogo.com/python/python_ssh_remote_run.php <- Note that in all the examples the user has to input the password on the command line. ssh no longer allows the password to be passed in on the command line as an argument. And from what I can tell ssh blocks sending the password in using stdin. This is probably by design to prevent people from hacking into a computer programmatically via ssh.

In the absence of getting some version of the above code working, ie creating a persistent ssh session, I will need to use paramiko or a library called fabric.
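
For comparison, here is a minimal sketch of one way to read a child process's output line by line while it is still running. It iterates over the pipe instead of calling readlines() or communicate(), both of which wait for end-of-file. The function name is hypothetical, and the sketch assumes key-based login so that no password prompt is involved. Whether it behaves the same with ssh in every setup is not something this thread confirms.

```python
import subprocess


def stream_command(argv, on_line):
    """Run argv and call on_line(text) for each output line as it arrives.

    Returns the exit code of the process.
    """
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,   # nothing to type, so the child never waits on stdin
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into the same stream
        text=True,                  # decode bytes to str
        bufsize=1,                  # line-buffered on the Python side
    )
    # Iterating over the pipe yields each line as soon as the child flushes it
    for line in proc.stdout:
        on_line(line.rstrip("\n"))
    return proc.wait()
```

With the host from the test script, a remote call might look like `stream_command(["ssh", "me@goliath", "uname -a"], print)`. Note that the remote program may buffer its own output when it is not attached to a terminal, which can delay lines regardless of how the local side reads them.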

luzpaz commented 1 year ago

Thanks for doing due diligence. Onward

linuxguy123 commented 1 year ago

What is the easiest way for me to run my version of CfdOF in my current FreeCAD installation ? And debug it using VS Code ?

Is there a "best practices" doc somewhere for FreeCAD workbench developers ?

linuxguy123 commented 1 year ago

> Thanks for doing due diligence. Onward

Does that mean you agree that I'll need to use paramiko or fabric ?

linuxguy123 commented 1 year ago

To add a second tab to CfdOF preferences for the remote computing settings, does one call addPreferencePage() with the second tab (page) ? Or how is another tab added to preferences ?

"Make sure the addPreferencePage() method is called only once, otherwise your pref page will be added several times" https://wiki.freecadweb.org/Workbench_creation

linuxguy123 commented 1 year ago

I figured out a way to run commands on a remote computer via ssh using subprocess(). I'll update this thread with the details in the near future.

linuxguy123 commented 1 year ago

It's going to be interesting doing this all in QProcess !

But there is this: https://github.com/sandsmark/QSsh

oliveroxtoby commented 1 year ago

I like the idea. We have always had something like this in mind to support compute clusters on which most serious CFD ends up being done, in my experience. In that case you have the additional factor of schedulers to interact with, but that can be a future addition. However, in that vein I would prefer it if we interact with the program running on the remote machine 'at arms length', with the output data obtained by looking at its log files and output files rather than the controlling process directly owning the processes it creates via ssh - because on compute clusters you don't get to create/own the process directly.

> 1. Add a Remote tab to the CfdOF Preferences GUI for the remote connection parameters and set up. The Remote tab would have a checkbox to allow the use of a remote computer and fields for the remote hostname and username on the remote computer, as well as a way to test the connection, i.e. can one successfully invoke an ssh session with the remote computer ?

Sounds good.

> The Remote tab will also allow setting up OpenFOAM, gmsh, cfmesh and Hisa on the remote machine, just as how it happens on the local computer right now.

Great, but perhaps more of a 'nice to have' since the remote machine may not be fully under your control. Presently in CfdOF we write the output so that it can run on any supported OpenFOAM version (i.e. it is not tailored to the version currently selected), with this sort of situation in mind (you might want to copy the case and run it on a cluster with a different version installed).

> I also want to (eventually) handle the case of the workstation going to sleep, or being shut down or the user quitting FreeCAD while a remote task is running. I think I can do this by opening a new ssh connection when CfdOF is restarted and reattaching to the existing task. I'm not sure how I'll handle the data that will be missing while the connection is lost. I may have to buffer it on the remote side and redisplay it on the local side when the connection is resumed.

This would be good. For the report functions and residuals etc, the missed data can be harvested from the log and postprocessing files.
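
A minimal sketch of what harvesting missed residuals from a solver log could look like. It assumes the usual OpenFOAM log layout, with `Time = ...` lines followed by `Solving for <field>, Initial residual = ...` lines. The function name is hypothetical and this is not CfdOF's actual parsing code.

```python
import re

# "Time = 12" or "Time = 12s" depending on the OpenFOAM version
TIME_RE = re.compile(r"^Time = ([0-9.eE+-]+)")
RESIDUAL_RE = re.compile(r"Solving for (\w+), Initial residual = ([0-9.eE+-]+)")


def parse_residuals(log_text):
    """Return {field: [(time, initial_residual), ...]} from solver log text."""
    residuals = {}
    current_time = None
    for line in log_text.splitlines():
        match = TIME_RE.match(line)
        if match:
            current_time = float(match.group(1))
            continue
        match = RESIDUAL_RE.search(line)
        if match and current_time is not None:
            field = match.group(1)
            value = float(match.group(2))
            # A field may be solved several times per step; keep every entry
            residuals.setdefault(field, []).append((current_time, value))
    return residuals
```

After a reconnect, the whole log could be re-read through a function like this to rebuild the residual plot from the start of the run.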

Thanks for your interest in taking this on!

adrianinsaval commented 1 year ago

this would be a seriously good addition to the project!

> @linuxguy123 What is the benefit of introducing paramiko as an extra 3rd party dependency ? (I'm asking because dependencies create more work for maintainers)

I think paramiko can be installed with pip so it shouldn't really be a problem, the addon manager can offer to install it when you install the workbench so there isn't any additional work for freecad packagers and cfdof itself doesn't need to package it either, just declare it as a dependency so that the addon manager can take care of it.

linuxguy123 commented 1 year ago

Interesting comments. I'm glad to see others feel it would be a worthwhile addition.

I'll keep the cluster and interacting with the remote process 'at arm's length' ideas in mind moving forward. Personally, I like watching the residuals and outputs in FreeCAD. Makes it easy to kill the simulation if you see something obviously not working.

Thinking about this further, one could take the "at arms length" to the point that FreeCAD starts the process on the remote machine, launches a little stand alone monitoring app and detaches from the remote process completely. The stand alone app would essentially be the residuals and reporting functions which are matplotlib more than anything. Hmmmm.... It would also be nice to watch the core loads in such an app. Baby steps...

Having taken a closer look at the existing code, I'm going to try to do the ssh calls in QProcess like the existing code does. I'll run a test of this aspect of the project in the near future and report back with my findings.

Another change I'm thinking about making is to add the desired number of cores used as a field in both the local and remote meshing preferences and the CfdOF Mesh window.

When remote processing is used, the cores used will probably be quite different between the local machine and the remote machine. I see that the "Number of Processes" and "Number of Threads" properties are available in the Mesh properties, which has worked fine until now. But these parameters will be constantly changing going forward so I feel they deserve both preferences and edit boxes on the Mesh window.

Keep the comments coming.

linuxguy123 commented 1 year ago

FYI... there are two aspects of running a remote process in ssh that make things challenging. I'm posting this now so people who work on something similar can learn from my experience and understand my implementation decisions.

The first is the permissions aspect. At first I was going to get the user to input a password whenever an ssh session was needed. This doesn't work because ssh itself doesn't allow the password to be passed in on the command line. And if it did, it would appear in the logs on the remote machine.

The way around the ssh permission problem is to set up an ssh key, which is pretty easy.
On the local machine: ssh-keygen to generate a key. Then run ssh-copy-id user@remote to copy it to the remote machine. Unfortunately this will need to be done outside of FreeCAD. But if the user is knowledgeable enough to set up ssh on a remote computer, s/he can also do this step.
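
Once the key has been copied, a "test connection" button could check that password-less login works. A minimal sketch follows, with hypothetical function names. `BatchMode=yes` and `ConnectTimeout` are standard OpenSSH client options. With BatchMode enabled, ssh fails instead of prompting when key authentication is not set up.

```python
import subprocess


def build_key_check_command(user, host, timeout_s=5):
    """Return an ssh argument list that succeeds only if key login works."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt for a password
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly if unreachable
        f"{user}@{host}",
        "true",                               # trivial remote command
    ]


def key_login_works(user, host):
    """Return True if ssh can log in without any user interaction."""
    result = subprocess.run(
        build_key_check_command(user, host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

If `key_login_works("me", "goliath")` returns False, the GUI could point the user back to the ssh-keygen and ssh-copy-id steps.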

The other aspect is interacting with the remote process. I originally thought I'd be able to open an ssh session with subprocess and utilize the session repeatedly by sending it commands and receiving the output via the stdin, stdout and stderr attached to the subprocess. However, it turns out that ssh blocks the process that is running within the ssh shell from interacting with the subprocess's stdin, stdout and stderr while the process is running. One gets all the output after the process (and ssh) has exited.

I have not found a way around this particular behaviour, even when using paramiko.

However, there is a way to attain the desired functionality, ie having stdout and stderr available to the local machine, while the process runs in ssh. One can achieve this by piping the output of the remote process to files on the remote machine and then periodically reading said files from the local machine. It's not as nice as how things run locally, but will still work.
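
A minimal sketch of the polling idea, assuming the remote command was started with its output redirected to a log file (for example `simpleFoam > log 2>&1`). The class and function names are hypothetical. `LogFollower` tracks a byte offset in a file so each poll returns only what was appended, and `remote_tail_command` builds an ssh call that fetches the remote file from a given offset using the standard `tail -c +N` form.

```python
import os
import shlex


class LogFollower:
    """Return only the text appended to a file since the previous poll."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # number of bytes already handed to the caller

    def poll(self):
        if not os.path.exists(self.path):
            return ""
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            chunk = handle.read()
        # Consume whole lines only, so a half-written line is retried next poll
        end = chunk.rfind(b"\n") + 1
        self.offset += end
        return chunk[:end].decode("utf-8", errors="replace")


def remote_tail_command(host, remote_path, offset):
    """Return an ssh argument list that prints remote_path from byte offset on."""
    # tail -c +N starts output at byte N, counting from 1
    return ["ssh", host, f"tail -c +{offset + 1} {shlex.quote(remote_path)}"]
```

A timer in the GUI could call something like this every few seconds and pass the new text to the same hooks that handle local stdout today.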

Having said all this, I have not tested running ssh within QProcess. I'm assuming that QProcess has no magical unblocking properties when using ssh and will have the same behavior.

The present code launches all external processes (meshing and OpenFOAM simulations) with QProcess. I will endeavor to do this as well because 1) I want to keep the code the same as much as possible and 2) it appears to work well. If QProcess acts as I think it will with ssh, I may write a wrapper class around QProcess (remoteQProcess?) that hides the complexity of the remote stdout and stderr files and makes it act just like the files are running locally.

Thoughts ?

linuxguy123 commented 1 year ago

The other way to achieve remote processing from FreeCAD would be to write a small server app that sat on the remote machine. This app would call the mesher and OpenFOAM from the remote machine and echo the results back to the local machine. In fact, this server could be installed on the local machine and used to run things there as well.

I suspect that writing that app would be non-trivial due to the number of platforms (Linux, Windows, macOS, etc.) that people will have on their remote machines, plus the work of installing it, setting up the firewall, etc. on the remote machine.

ssh is universally available on all platforms, or nearly so. And it's secure and easy to set up.

Having said all this, I'm open to discussion on the remote approach.

mmcker commented 1 year ago

Have you considered if mosh will help? (keep alive an ssh session)

https://github.com/mobile-shell/mosh

Also, I wouldn't immediately discard using wsl based on speed. I found wsl to be very fast - however, you must ensure you understand how to access the filesystems. https://learn.microsoft.com/en-us/windows/wsl/filesystems. My docker experiments (which uses wsl) seem to have the same speed for wsl as native performance.

linuxguy123 commented 1 year ago

> Have you considered if mosh will help? (keep alive an ssh session)
>
> https://github.com/mobile-shell/mosh

The problem isn't keeping the session alive. The issue is interacting with the session through stdout and stderr while a command is running in the session. For example, OpenFOAM may run for hours in an ssh session. We need/want the output of stdout and stderr for OpenFOAM while it is running. So far I've found that subprocess will only give stdout and stderr after the process (OpenFOAM in this case) has completed.

Furthermore, I have not been able to send commands to an ssh session run in subprocess using stdin. Others on the Internet have experienced the same as I have. If anyone knows how to do this, please chime in.

> Also, I wouldn't immediately discard using wsl based on speed. I found wsl to be very fast - however, you must ensure you understand how to access the filesystems. https://learn.microsoft.com/en-us/windows/wsl/filesystems. My docker experiments (which uses wsl) seem to have the same speed for wsl as native performance.

I'll stick to plain Linux, thank you. I don't run Windows and have no need for WSL.

oliveroxtoby commented 1 year ago

> I'll keep the cluster and interacting with the remote process 'at arm's length' ideas in mind moving forward. Personally, I like watching the residuals and outputs in FreeCAD. Makes it easy to kill the simulation if you see something obviously not working.

I agree, and I wasn't suggesting to lose this functionality, but (as you also mentioned) to monitor the log files rather than remaining attached to the process stdout. This is already how the 'probes' reporting function works - by watching the relevant files in postProcessing/... . I appreciate there is an elegance in keeping everything the same save for the process running through ssh. However, every shared compute resource I have used has been a machine with a scheduler, where you don't have the option of running heavyweight apps directly. Of course, you are free to take whatever approach works for you, but I do think there would be a lot of benefit in keeping that use case in mind.

linuxguy123 commented 1 year ago

Update: I tested QProcess today running commands within ssh. It appears to work perfectly.

Running the mesh and OpenFOAM processes remotely might be as simple as copying the files over and then running the same commands prefixed with "ssh". With QProcess, the stdout and stderr of the command being run are available to the calling program in the same way whether the command is run locally or under ssh. This is very impressive. I was not able to achieve this with subprocess.
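
A minimal sketch of the "prefix with ssh" idea, with a hypothetical helper that turns a local command into the program and argument list for `QProcess.start(program, arguments)`. The host and case path are placeholders. It assumes key-based login and that the OpenFOAM environment is available to non-interactive shells on the remote machine, which may need extra setup there. `shlex.join` requires Python 3.8 or newer.

```python
import shlex


def remote_invocation(host, case_dir, program, args):
    """Return (program, arguments) that run the command in case_dir on host."""
    # Quote each part so spaces in paths survive the remote shell
    remote_cmd = "cd {} && {}".format(
        shlex.quote(case_dir),
        shlex.join([program] + list(args)),
    )
    return "ssh", [host, remote_cmd]
```

For example, `remote_invocation("me@goliath", "/home/me/case", "simpleFoam", ["-parallel"])` gives `("ssh", ["me@goliath", "cd /home/me/case && simpleFoam -parallel"])`, which could be passed to the existing QProcess wrapper unchanged.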

linuxguy123 commented 1 year ago

> I agree, and I wasn't suggesting to lose this functionality, but (as you also mentioned) to monitor the log files rather than remaining attached to the process stdout. This is already how the 'probes' reporting function works - by watching the relevant files in postProcessing/... .

I haven't looked closely at that code. That approach is interesting because as per my testing today, QProcess is very good about passing stdout and stderr to the calling program. Both when processes run locally and when they run remotely under ssh.

So why is the probe reporting done like that ? Why aren't stdout and stderr from QProcess used ?

I appreciate there is an elegance in keeping everything the same save for the process running through ssh. However every experience I have had of shared compute resources are machines with schedulers where you don't have the option of running heavyweight apps directly. Of course, you are free to take whatever approach works for you, but I do think there would be a lot of benefit in keeping that use case in mind.

Could you give me more information on this use case ? Not saying I will or will not handle it but the more I understand the better.

As far as I understand with cluster computing a single call starts the cluster processing the desired command and the results from the command are passed back via stdout and stderr as if the command was being run on a single machine. In fact, the residuals for each iteration in OF need to be passed back to the calling code so they can be examined before the next iteration is started.

If this is the case, the code I'm working on should work with a cluster setup as well.

linuxguy123 commented 1 year ago

Here is the code I tested with. Change mainwindow-cleaned.py to mainwindow.py before running.

QProcess Test.tar.gz

linuxguy123 commented 1 year ago

If you look in CfdConsoleProcess.py, the process is set up the same as how I've set up the process in my test, ie there are handlers for stdout and stderr.

def __init__(self, finished_hook=None, stdout_hook=None, stderr_hook=None):
        self.process = QProcess()
        self.finishedHook = finished_hook
        self.stdoutHook = stdout_hook
        self.stderrHook = stderr_hook
        self.process.finished.connect(self.finished)
        self.process.readyReadStandardOutput.connect(self.readStdout)
        self.process.readyReadStandardError.connect(self.readStderr)
        self.print_next_error_lines = 0
        self.print_next_error_file = False

There is no evidence of peeking into a locally cached stdout or stderr file on the computer.

Running a process remotely may be as easy as copying the files to the remote machine and prefixing the actual call with "ssh".

linuxguy123 commented 1 year ago

def start(self, cmd, env_vars=None, working_dir=None):
        """ Start process and return immediately """
        self.print_next_error_lines = 0
        self.print_next_error_file = False
        env = QtCore.QProcessEnvironment.systemEnvironment()
        if env_vars:
            for key in env_vars:
                env.insert(key, env_vars[key])
        removeAppimageEnvironment(env)
        self.process.setProcessEnvironment(env)
        if working_dir:
            self.process.setWorkingDirectory(working_dir)
        if platform.system() == "Windows":
            # Run through a wrapper process to allow clean termination
            cmd = [os.path.join(FreeCAD.getHomePath(), "bin", "python.exe"),
                   '-u',  # Prevent python from buffering stdout
                   os.path.join(os.path.dirname(__file__), "WindowsRunWrapper.py")] + cmd
        FreeCAD.Console.PrintLog("CfdConsoleProcess running command: {}\n".format(cmd))
        self.process.start(cmd[0], cmd[1:])

For a remote run, change self.process.start(cmd[0], cmd[1:]) to:

remoteCmd = ["ssh", "me@server"]
remoteCmd.extend(cmd)  # extend, not append: cmd is itself a list of arguments
self.process.start(remoteCmd[0], remoteCmd[1:])

Am I missing something ?

I will test this with OpenFOAM shortly.
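
Two details are worth watching with the ssh prefix. ssh joins everything after the host into one string and hands it to the remote shell, so arguments containing spaces need quoting. Also, the working directory and environment set on QProcess apply only to the local ssh client. A minimal sketch of a helper that covers both, assuming a hypothetical `build_remote_command()` (not part of CfdOF), a placeholder host `me@server` and a made-up case directory:

```python
import shlex


def build_remote_command(host, cmd, working_dir=None):
    """Return an argv list that runs cmd on host through ssh.

    Hypothetical helper, not part of CfdOF. ssh concatenates its trailing
    arguments with spaces and passes the result to the remote shell, so
    each argument is quoted to survive that second round of parsing.
    """
    remote = " ".join(shlex.quote(arg) for arg in cmd)
    if working_dir:
        # setWorkingDirectory() only affects the local ssh client,
        # so change directory on the remote side explicitly.
        remote = "cd {} && {}".format(shlex.quote(working_dir), remote)
    return ["ssh", host, remote]


remote_cmd = build_remote_command(
    "me@server", ["bash", "-c", "echo hello world"],
    working_dir="/home/me/case 1")
print(remote_cmd)
```

The result could then be started the same way as the local command, with `self.process.start(remote_cmd[0], remote_cmd[1:])`.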

oliveroxtoby commented 1 year ago

I agree, and I wasn't suggesting to lose this functionality, but (as you also mentioned) to monitor the log files rather than remaining attached to the process stdout. This is already how the 'probes' reporting function works - by watching the relevant files in postProcessing/... .

I haven't looked closely at that code. That approach is interesting because as per my testing today, QProcess is very good about passing stdout and stderr to the calling program. Both when processes run locally and when they run remotely under ssh.

So why is the probe reporting done like that ? Why aren't stdout and stderr from QProcess used ?

The code is in CfdRunnableFoam.py, in process_output()

It is done this way because that's where the output is written by OpenFOAM. It doesn't go into the log on stdout. There will likely be other monitoring functions in future that will be the same.
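
Watching a file rather than stdout can be done by remembering how far the file has been read and picking up only what has been appended since. A minimal sketch of that idea, assuming a hypothetical `FileFollower` class and a temporary file standing in for an output under postProcessing/ (this is not the code in CfdRunnableFoam.py):

```python
import os
import tempfile


class FileFollower:
    """Return the complete lines appended to a file since the last poll."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # Byte position consumed so far

    def poll(self):
        if not os.path.exists(self.path):
            return []  # The solver may not have created the file yet
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        # Consume complete lines only; a partial last line waits for next poll
        end = data.rfind(b"\n") + 1
        self.offset += end
        return data[:end].decode("utf-8", errors="replace").splitlines()


# Demonstration with made-up probe data
path = os.path.join(tempfile.mkdtemp(), "probe.dat")
follower = FileFollower(path)
first = follower.poll()            # File does not exist yet
with open(path, "w") as f:
    f.write("0.1 1.0\n0.2 1.5\n0.3")
second = follower.poll()           # Two complete lines, one partial
with open(path, "a") as f:
    f.write(" 2.0\n")
third = follower.poll()            # The partial line is now complete
```

Polling like this from a timer works the same whether the file is written by a local process or synchronised back from a remote machine.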

I appreciate there is an elegance in keeping everything the same save for the process running through ssh. However every experience I have had of shared compute resources are machines with schedulers where you don't have the option of running heavyweight apps directly. Of course, you are free to take whatever approach works for you, but I do think there would be a lot of benefit in keeping that use case in mind.

Could you give me more information on this use case ? Not saying I will or will not handle it but the more I understand the better.

A command is issued on the cluster to queue the program for future execution. When it reaches the top of the queue, it is run entirely detached, with stdout and stderr being redirected to files. Other commands can be issued to kill the job or query its running status. An example of such a scheduler is Slurm.
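
As a rough illustration of that workflow, the three interactions (queue, query, kill) map onto Slurm's `sbatch`, `squeue` and `scancel` commands. The helper names, host and paths below are hypothetical, and the sketch assumes paths without spaces since ssh re-parses its arguments on the remote side:

```python
def slurm_submit_command(host, script_path, log_path):
    # --parsable makes sbatch print only the job id (optionally ";cluster")
    # --output redirects the job's stdout and stderr to a file
    return ["ssh", host, "sbatch", "--parsable",
            "--output={}".format(log_path), script_path]


def parse_job_id(sbatch_output):
    # With --parsable the output is "jobid" or "jobid;clustername"
    return sbatch_output.strip().split(";")[0]


def slurm_status_command(host, job_id):
    # -h suppresses the header, -o %T prints just the job state
    return ["ssh", host, "squeue", "-h", "-j", job_id, "-o", "%T"]


def slurm_cancel_command(host, job_id):
    return ["ssh", host, "scancel", job_id]


submit = slurm_submit_command("me@cluster", "run_case.sh", "log.slurm")
job_id = parse_job_id("12345;mycluster\n")
status = slurm_status_command("me@cluster", job_id)
cancel = slurm_cancel_command("me@cluster", job_id)
```

In this model nothing stays attached to the job's stdout; progress would be read from the log file named in `--output`.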

As far as I understand with cluster computing a single call starts the cluster processing the desired command and the results from the command are passed back via stdout and stderr as if the command was being run on a single machine. In fact, the residuals for each iteration in OF need to be passed back to the calling code so they can be examined before the next iteration is started.

Not sure if I'm misunderstanding ... but they don't need to be examined by calling code before the next iteration is started. The output on stdout is just for informational purposes and the computation continues regardless.

oliveroxtoby commented 1 year ago

For a remote run, change self.process.start(cmd[0], cmd[1:]) to self.process.start("ssh", cmd)

Am I missing something ?

I would just worry that the remote process would get taken out if the connection went down.

linuxguy123 commented 1 year ago

The code is in CfdRunnableFoam.py, in process_output()

Thanks for the reply. I'll look at the code and comment later.

oliveroxtoby commented 1 year ago

What is the easiest way for me to run my version of CfdOF in my current FreeCAD installation ? And debug it using VS Code ?

Is there a "best practices" doc somewhere for FreeCAD workbench developers ?

You may find https://gitlab.com/opensimproject/cfdof/-/blob/master/CONTRIBUTING.md of some help.

linuxguy123 commented 1 year ago

I want to set up a tabbed Preference page for the CfdOF workbench. I'm unsure how to do this. I'm working in Python with Qt-Creator.

My best guess is to set up the main preference page as an empty QTabWidget and install that page with FreeCADGui.addPreferencePage() in InitGui.py.

Then create separate pages for each of the tabs in the Preference section for my workbench. Then add the pages to the QTabWidget in the init() call for the main Preference page.

Am I on the right track ?

Is there a workbench that has a tabbed Preference page that I could use as a reference ?

Approaches that don't work:

1) Making the Preference page a QTabWidget. When I did this, I got the error message: CfdPreferencePage is not a preference page. No other error messages. From this I assume that the preference page must be a QWidget, ie form.

2) Adding the preference pages by calling FreeCADGui.addPreferencePage("/path/to/myUiFile.ui","CfdOF") twice with 2 different preference page UI files. When I did this, only the first page appeared.

3) Adding a Tab Widget to a QWidget Preference Page. When I did this, I got this:

[Screenshot: Preference Page with tabs within tabs]

While I could make that work, it isn't the desired intent.

So how does one set up a tabbed Preference page for a workbench ?

linuxguy123 commented 1 year ago

I did a little digging and found this in src/Gui/DlgPreferencesImp.cpp:

/**
 * Create a new preference page called \a pageName on the group tab \a tabWidget.
 */
void DlgPreferencesImp::createPageInGroup(QTabWidget *tabWidget, const std::string &pageName)
{
    PreferencePage* page = WidgetFactory().createPreferencePage(pageName.c_str());
    if (page) {
        tabWidget->addTab(page, page->windowTitle());
        page->loadSettings();
        page->setProperty("GroupName", tabWidget->property("GroupName"));
        page->setProperty("PageName", QVariant(QString::fromStdString(pageName)));
    }
    else {
        Base::Console().Warning("%s is not a preference page\n", pageName.c_str());
    }
}

It appears that the only object that can be used as a preference page is a page, ie QWidget.

It also appears that this routine will add the page to the tab but will not add a second page to another tab.

It would be really nice if a workbench could have multiple preference pages and if this routine would add each of them as tabs. Or if the routine could accept a QTabWidget in place of its own Tab Widget and let the user then connect his pages to the tabs on his Tab Widget.

In the meantime, the workaround is to add a preference page that has tabs on it and implement them on that page. Such an implementation will appear as in the window attached above.

linuxguy123 commented 1 year ago

This is the code that calls createPageInGroup:

/**
 * If the dialog is currently showing and the static variable _pages changed, this function 
 * will rescan that list of pages and add any that are new to the current dialog. It will not
 * remove any pages that are no longer in the list, and will not change the user's current
 * active page.
 */
void DlgPreferencesImp::reloadPages()
{
    // Make sure that pages are ready to create
    GetWidgetFactorySupplier();

    for (const auto &group : _pages) {
        QString groupName = QString::fromStdString(group.first);

        // First, does this group already exist?
        QTabWidget* tabWidget = nullptr;
        for (int tabNumber = 0; tabNumber < ui->tabWidgetStack->count(); ++tabNumber) {
            auto thisTabWidget = qobject_cast<QTabWidget*>(ui->tabWidgetStack->widget(tabNumber));
            if (thisTabWidget->property("GroupName").toString() == groupName) {
                tabWidget = thisTabWidget;
                break;
            }
        }

        // This is a new tab that wasn't there when we started this instance of the dialog: 
        if (!tabWidget) {
            tabWidget = createTabForGroup(group.first);
        }

        // Move on to the pages in the group to see if we need to add any
        for (const auto& page : group.second) {

            // Does this page already exist?
            QString pageName = QString::fromStdString(page);
            bool pageExists = false;
            for (int pageNumber = 0; pageNumber < tabWidget->count(); ++pageNumber) {
                PreferencePage* prefPage = qobject_cast<PreferencePage*>(tabWidget->widget(pageNumber));
                if (prefPage && prefPage->property("PageName").toString() == pageName) {
                    pageExists = true;
                    break;
                }
            }

            // This is a new page that wasn't there when we started this instance of the dialog:
            if (!pageExists) {
                createPageInGroup(tabWidget, page);
            }
        }
    }
}

It appears that this code should add another page to the group's (CfdOF) tabWidget:

// This is a new page that wasn't there when we started this instance of the dialog:
            if (!pageExists) {
                createPageInGroup(tabWidget, page);

But it checks if the page already exists based not on the filename of the page (ie preferencePage1.ui, preferencePage2.ui) but on the preferencePage.PageName property.

// Does this page already exist?
            QString pageName = QString::fromStdString(page);
            bool pageExists = false;
            for (int pageNumber = 0; pageNumber < tabWidget->count(); ++pageNumber) {
                PreferencePage* prefPage = qobject_cast<PreferencePage*>(tabWidget->widget(pageNumber));
                if (prefPage && prefPage->property("PageName").toString() == pageName) {
                    pageExists = true;
                    break;
                }
            }

If one uses the same page name for the pages being added, they will not be added.

I presume that pages are added by calling FreeCADGui.addPreferencePage("/path/to/myUiFile.ui","CfdOF") with each page.
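
The effect of that existence check can be seen in a small Python model of the loop. This is only a sketch of the logic quoted above, not FreeCAD code; `CfdRemotePreferencePage` is a made-up page name:

```python
def reload_pages(registered, dialog):
    """Mimic the existence check in DlgPreferencesImp::reloadPages().

    registered: dict mapping group name -> list of registered page names
    dialog:     dict mapping group name -> list of page names already shown
    """
    for group, page_names in registered.items():
        tab = dialog.setdefault(group, [])  # createTabForGroup() if missing
        for name in page_names:
            if name not in tab:             # The PageName comparison
                tab.append(name)            # createPageInGroup()
    return dialog


same_name = reload_pages(
    {"CfdOF": ["CfdPreferencePage", "CfdPreferencePage"]}, {})
distinct = reload_pages(
    {"CfdOF": ["CfdPreferencePage", "CfdRemotePreferencePage"]}, {})
print(same_name)
print(distinct)
```

With a repeated name only one page ends up in the group, while two distinct names give two tabs.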

linuxguy123 commented 1 year ago

I figured out how to add a second tab to a workbench's preference page. I'll update this post with how tomorrow.

linuxguy123 commented 1 year ago

Comments ?

[Screenshot: Remote Processing Tab]

linuxguy123 commented 1 year ago

I'm just starting to write the code behind the Remote Preferences Page. My plan is to heavily reuse the code behind the local Preferences Page but keep 2 completely separate code bases, even though a lot of the functionality will be the same. I'm doing this for a few reasons:

1) I don't want to pollute the code that already works.
2) There will be enough differences with the remote code that a separate implementation may be justified.
3) I'm learning as I go.

When I have everything working we can do a diff between the two code bases and see if it justifies merging them into one.

Speak now or forever hold your peace.

Large parts of CfdTools.py are going to get rewritten as well, for 2 reasons:

1) It uses Python calls that can be used locally but will not be available on the remote host.
2) It uses subprocess to do things instead of QProcess. As previously mentioned in this thread, I could not get subprocess to work with ssh commands, whereas QProcess does.

So I'll probably have a separate source for the remote CfdTools.py as well, ie CfdRemoteTools.py.

linuxguy123 commented 1 year ago

I have 2 questions:

1) Can I assume that all remote hosts (and all local hosts) would run POSIX compliant commands ? MinGW, Windows with WSL, Linux, MacOS ? I'm asking because I think I'll be rewriting a lot of the Python os. stuff that runs locally with POSIX commands so that they'll run on the remote hosts.
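
To make the first question concrete, here is a sketch of how a couple of local `os` calls might map onto POSIX commands sent over ssh. The function names, host and path are hypothetical; `test -e` and `mkdir -p` are standard POSIX utilities:

```python
import shlex


def remote_exists_command(host, path):
    # Local equivalent: os.path.exists(path)
    # The exit status of the ssh process is 0 if the path exists
    return ["ssh", host, "test -e {}".format(shlex.quote(path))]


def remote_makedirs_command(host, path):
    # Local equivalent: os.makedirs(path, exist_ok=True)
    return ["ssh", host, "mkdir -p {}".format(shlex.quote(path))]


exists_cmd = remote_exists_command("me@goliath", "/tmp/my case/system")
mkdir_cmd = remote_makedirs_command("me@goliath", "/tmp/my case/system")
print(exists_cmd)
print(mkdir_cmd)
```

Each list can be passed to QProcess as program plus arguments; the quoting keeps paths with spaces intact on the remote shell.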

2) What IDE and debugger are people using for Python, and how does one set a breakpoint in the workbench code when running it from within a precompiled FreeCAD release, ie 0.20-1 ?

Can one attach the debugger to the FreeCAD process and is it smart enough to know when it is at a breakpoint within the CfdOF code being worked on ? In which IDE can this be done and how ?

Thanks

linuxguy123 commented 1 year ago

The About Remote Processing button will display a document for the user to read and use. Understanding remote processing and setting up a remote computer will be a new thing for some CfdOF users. Without good user documentation and testing tools, remote processing will be a nightmare to support and troubleshoot.

This is the start of the document I intend to display. I am writing this document as I write the code for remote processing. Your input is welcome.

Background

Generating large meshes and simulating them in OpenFOAM can take considerable processing time on even the fastest desktop computer. In some cases hours or even days. Meanwhile the FreeCAD workstation machine is tied up and often not very responsive because the CPU core loads are high and the memory bandwidth is almost entirely consumed. The storage device will also be accessed frequently, further slowing down the local machine.

A better way to handle large meshing and simulation tasks is to off load them to a second (remote) computer, leaving the FreeCAD workstation almost unhindered while they run. Furthermore the remote computer can be optimized to run mesh and simulation loads much faster than a desktop machine might be.

CfdOF can now use a remote computer to process meshes and run OpenFOAM simulations. It does this by using the Secure SHell (ssh) command to run the mesher and OpenFOAM on the remote computer. A few things are required to accomplish this:

1) The remote computer must be accessible to the local computer via a network connection.
2) ssh must be installed on the remote computer.
3) The FreeCAD user must have access to a user account on the remote computer.
4) An ssh key must be generated and stored on the remote computer such that the FreeCAD user can connect to the remote computer using ssh without using a password.
5) All the relevant meshing and simulation software must be installed on the remote computer.
6) The remote computer must provide a POSIX-style command line along with tools such as ssh and rsync.

The purpose of the Remote Computing tab in CfdOF preferences is to help the user set up and test each of the various components listed above.

Terminology

For the purpose of this discussion the FreeCAD workstation will be referred to as the local computer. The terms "workstation", "FreeCAD workstation" and "local computer" can be used interchangeably.

The terms computer and host will also be used interchangeably. Strictly speaking a computer is any computing device whereas a host is a computer connected to a network that can be reached by its IP address or hostname. In this discussion we assume that all computers are connected to a network and have an IP address and a hostname.

About ssh

"The Secure Shell Protocol (SSH) is a cryptographic network protocol for operating network services securely over an unsecured network." (https://en.wikipedia.org/wiki/Secure_Shell) As such it will allow a FreeCAD user to run meshing and OpenFOAM applications on the remote computer from across the network, without actually being at the remote computer.

The remote computer can be any computer that supports POSIX filesystem commands and is capable of running ssh and the meshing and OpenFOAM applications. Generally this is any computer running Linux, MacOS or Windows with WSL or MinGW installed. Hardware-wise the remote computer could be almost anything: a laptop, a desktop PC, a server or a cloud computer such as an AWS instance.

Remote Computer Setup

In order to allow CfdOF to use the remote computer for remote processing, it must be set up as follows:

Step 1) Log into the remote computer and verify that ssh is installed and running. On a linux computer one can do this with:

$ ps aux | grep sshd
root 1240 0.0 0.0 14024 8436 ? Ss 07:38 0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups

Step 2) Log into the remote computer via ssh from the local computer. On a linux computer one can do this from a console on the workstation with the following command:

$ssh <username>@<IP address>

Where username is the name of the user on the remote computer, not the FreeCAD workstation and IP address is the IP address of the remote computer. One can also use the hostname if the DNS server used on your network supports hostname resolution, ie dnsmasq is enabled. For example, if the remote computer's hostname is goliath and goliath's IP address is 192.168.2.159, one can do either of the following:

$ssh me@192.168.2.159
$ssh me@goliath

If an ssh key is not set up for the user on the remote computer, ssh will ask for the password for the user account on the remote machine. This password is not to be confused with the password used on the FreeCAD workstation.

In either case the remote computer should respond with a shell in which you can issue commands to run on the remote computer.

me@goliath$ ls -l
total 0
drwxr-xr-x. 1 me me 20 Feb 14 2022 Desktop
drwxr-xr-x. 1 me me 0 Feb 14 2022 Documents
drwxr-xr-x. 1 me me 42 Feb 14 2022 Downloads
drwxr-xr-x. 1 me me 0 Feb 14 2022 Music
drwxr-xr-x. 1 me me 28 Feb 14 2022 openfoam
drwxr-xr-x. 1 me me 0 Feb 14 2022 Pictures
drwxr-xr-x. 1 me me 0 Feb 14 2022 Public
drwxr-xr-x. 1 me me 0 Feb 14 2022 Templates
drwxr-xr-x. 1 me me 0 Feb 14 2022 Videos

To exit an ssh session, type exit:

$exit

Step 3) Set up ssh keys.

ssh keys are used to allow the use of ssh without prompting the user for a password. Remote computing in CfdOF requires that ssh be used with keys. Each workstation needs its own key, and that key must be copied to every remote computer the workstation uses.

To generate an ssh key for the local computer, use the following command on a local console shell on the workstation computer: (Do not run this from within an ssh session !)

$ssh-keygen

Do not use a passphrase. TODO: explain this.

To copy the key over to the remote server, run this command on a local console on the workstation computer:

$ssh-copy-id <username>@<remotecomputer>

It will prompt you for the password of your user account on the remote machine. It will then copy the ssh key you generated on the workstation over to the remote computer.

You should now be able to login to an ssh session on the remote computer without being prompted for a password.

$ssh <username>@<remotecomputer>

If you are prompted for a password when you log in your ssh key is not set up. You'll need to remedy this before proceeding further.
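
The same check can be scripted. A minimal sketch, assuming hypothetical helper names: OpenSSH's `BatchMode=yes` option disables password prompts, so the command fails instead of waiting for input when the key is not set up.

```python
import subprocess


def ssh_key_check_command(username, host, timeout_s=5):
    return ["ssh",
            "-o", "BatchMode=yes",  # Never prompt; fail if a password is needed
            "-o", "ConnectTimeout={}".format(timeout_s),
            "{}@{}".format(username, host),
            "true"]                 # Trivial remote command


def ssh_key_works(username, host):
    """Return True if a passwordless login succeeds (requires ssh locally)."""
    result = subprocess.run(ssh_key_check_command(username, host),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


check_cmd = ssh_key_check_command("me", "goliath")
print(check_cmd)
```

Calling `ssh_key_works("me", "goliath")` returns False both when the key is missing and when the host cannot be reached, so a separate reachability test is still useful.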

Note: an ssh key pair identifies the user on the local computer. It is not tied to a particular pair of hosts, and the same public key can be copied to more than one remote computer. What ssh does record per host is the remote computer's own host key, stored under the hostname or IP address you connected with. If the remote computer's IP address changes, ssh will ask you to confirm the host again before connecting, which can stall an unattended connection. To avoid this problem, use fixed IP addresses whenever possible or use hostnames instead of IP addresses.

Likewise if you change the remote computer you use you'll have to copy the generated ssh key to the new remote computer with:

$ssh-copy-id <username>@<remotecomputer>

Once you can reliably log into an ssh session on the remote computer from the local computer using an ssh key the remote computer is ready to be used by CfdOF for remote processing.

CfdOF Remote Processing Setup

Once ssh is set up on the remote computer, one can set up the rest of remote computing from the CfdOF Remote Computing preferences page. (FreeCAD->Edit->Preferences->CfdOF->Remote Computing)

Step 1) Click "Enable Remote Computing".

Step 2) Input the host name or IP address of the remote computer to be used for remote processing.

Step 3) Run "Ping Remote Host" to verify that the hostname or IP address is correct and that the local computer can reach the remote computer over the network. You should see a message ending in "pinged successfully" in the Output field below.
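
A sketch of the command such a button might run, assuming a hypothetical `ping_command()` helper. The count flag differs between platforms: `-c` on Linux and MacOS, `-n` on Windows.

```python
import platform


def ping_command(host, count=1, system=None):
    """Build a ping command for the local platform."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


linux_ping = ping_command("goliath", system="Linux")
windows_ping = ping_command("192.168.2.159", count=2, system="Windows")
print(linux_ping)
print(windows_ping)
```

An exit status of 0 from the resulting process indicates that the host replied.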

Step 4) Input the username of the user account to be used on the remote computer.

Step 5) Run "Test SSH" to verify that ssh has been set up correctly on the remote computer. You should see a message beginning with "ssh verified with" in the Output field below.

Step 6) Run remote host dependency checker.

The output of this process will show the user what meshing and OpenFOAM applications have and haven't been installed on the remote computer and where they are if they have been installed.
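
One way such a check could work is to ask the remote shell where each program lives using the POSIX `command -v` builtin. The sketch below uses hypothetical helper names, with `blockMesh` and `gmsh` as example programs. It assumes the OpenFOAM environment is already loaded for non-interactive ssh shells, which may not be the case on a given machine.

```python
import shlex


def dependency_check_command(host, programs):
    # Print one "name=path" line per program; path is empty if not found
    script = "; ".join(
        "printf '%s=%s\\n' {0} \"$(command -v {0})\"".format(shlex.quote(p))
        for p in programs)
    return ["ssh", host, script]


def parse_dependency_output(output):
    found = {}
    for line in output.splitlines():
        name, sep, path = line.partition("=")
        if sep:
            found[name] = path or None  # None marks a missing program
    return found


dep_cmd = dependency_check_command("me@goliath", ["blockMesh"])
parsed = parse_dependency_output(
    "blockMesh=/opt/openfoam/bin/blockMesh\ngmsh=\n")
print(dep_cmd)
print(parsed)
```

The sample output passed to `parse_dependency_output()` is made up for the demonstration; real paths depend on the installation.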

To be continued...

linuxguy123 commented 1 year ago

Is there a way to reliably reload a workbench in FreeCAD without restarting FreeCAD?

When I make a change to my code, I'm currently having to restart FreeCAD to load the changed code and test it. Is there a way to reload the workbench such that I don't have to restart FreeCAD ?

linuxguy123 commented 1 year ago

Not sure if I'm misunderstanding ... but they don't need to be examined by calling code before the next iteration is started.

By "calling code", I mean the code that started the process on the computer that is running the process.

In the cluster situation the individual cluster host results are reassembled back to the "calling code", wherein the residual is examined before the next iteration.

linuxguy123 commented 1 year ago

I would just worry that the remote process would get taken out if the connection went down.

In Linux one can set up the remote process to continue running even if the ssh session shuts down. One can then reconnect to the remote process with a new ssh session. I'll update this post with details on how this is done later. I've done this with my own OpenFOAM remote processing.

We can use this technique to allow CfdOF to reconnect to the remote process if it loses connection or if the FreeCAD user exits FreeCAD. It will be a bit tricky to figure out where FreeCAD was at when it issued the process.
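
The exact technique is not given above. One common approach, shown here purely as an assumption, is to start the job under `nohup` with all output redirected to a log file, then follow that log from a later ssh session. The helper names, the `./Allrun` script and the log file name are placeholders:

```python
import shlex


def detached_start_command(host, case_dir, cmd, log_name="log.remote"):
    """Start cmd on host so that it survives the ssh session ending.

    The remote shell prints the process id, which can be stored and
    used later to check on or kill the job.
    """
    remote = ("cd " + shlex.quote(case_dir) + " && { nohup "
              + " ".join(shlex.quote(arg) for arg in cmd)
              + " > " + shlex.quote(log_name)
              + " 2>&1 < /dev/null & echo $!; }")
    return ["ssh", host, remote]


def follow_log_command(host, case_dir, log_name="log.remote"):
    # tail -n +1 -f prints the whole log, then keeps following it
    log_path = case_dir.rstrip("/") + "/" + log_name
    return ["ssh", host, "tail -n +1 -f " + shlex.quote(log_path)]


start_cmd = detached_start_command("me@goliath", "/home/me/case", ["./Allrun"])
follow_cmd = follow_log_command("me@goliath", "/home/me/case")
print(start_cmd)
print(follow_cmd)
```

If the connection drops, only the `tail` process is lost; running `follow_log_command()` again picks the output back up from the start of the log.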