Proposed modifications to CfdOF to utilize a remote computer for meshing and simulation tasks.

linuxguy123 commented 1 year ago

I'm thinking of modifying CfdOF so that the user can run meshes and simulations locally or on a remote computer. I'm looking for input before I do anything.

Generating large meshes and simulating them can take considerable time on even the fastest desktop CPU, in some cases hours or even days. Meanwhile the workstation is tied up and often not very responsive because the CPU core loads are high and the memory bandwidth is almost entirely consumed. The storage device is also accessed frequently, further slowing down the local machine.

A better way to handle large meshing and simulation tasks is to offload them to a second (remote) computer, leaving the workstation almost unhindered while they run. Furthermore, the remote computer can be optimized to run mesh and simulation loads much faster than a desktop machine might.

I presently use a remote computer to generate meshes and run OpenFOAM simulations. I do this by editing the mesh and OF case files on my workstation, then copying them to the remote machine, logging into the remote machine with ssh and then running the tasks in the ssh session. I'm doing most of this without CfdOF, manually.

While this approach works, it is cumbersome. One is always moving case files back and forth between the workstation and the remote machine as well as the results from the remote machine back to the workstation. More than once I thought I was looking at a mesh or simulation result that just ran when, in fact, it wasn't the current output. Ugghhh.

I'd like to streamline the use of a remote computer in CfdOF, so that it happens automagically from the CfdOF GUI.

I envision doing this as follows:

1) Add a Remote tab to the CfdOF Preferences GUI for the remote connection parameters and setup. The Remote tab would have a checkbox to allow the use of a remote computer and fields for the remote hostname and the username on the remote computer, as well as a way to test the connection, i.e. whether one can successfully invoke an ssh session with the remote computer.

The Remote tab will also allow setting up OpenFOAM, gmsh, cfMesh and HiSA on the remote machine, just as it happens on the local computer right now.

I want the remote computer to be set up exactly as the local computer is set up as far as the tools go. That way anything that can be done in a worker thread on the local machine could also be done in a worker thread on the remote machine in a ssh session. With a little tweaking, of course.

For now I will not handle setting up Docker on the remote machine, just because I am not terribly familiar with Docker. I have used virtual environments, but not Docker. I may ask for assistance on this part of the task or leave it unimplemented.

2) In the CFD Mesh GUI I will change the "Write mesh case" button to "Write mesh case locally" and add a "Write mesh case remotely" button. The former will do as it does now. The latter will write the mesh case to the working directory on the remote computer.

Likewise the Check Mesh button will also have a local and remote version.

I will also change the "Run mesher" button to "Run mesher locally" and add a "Run mesher remotely" button. The former will do as it does now. The latter will mesh the mesh case that has been written on the remote machine and display the progress results in the Status box, just as it does now.

Having local and remote versions of these buttons will allow the user to use CfdOF exactly as it is presently used or with a remote computer as the user decides when working on the project. If the remote machine is busy running a mesh or simulation for another FreeCAD instance, the user may want to run meshes or simulations locally. Having both versions of the buttons allows the user the option of doing either.

Upon completion of the remote meshing task, the results will automatically be copied back to the local machine so they can be viewed (Load surface mesh) just as if they were meshed locally.

3) In the CFD Solver GUI the Write button will have a local and remote version, just as the Mesh GUI did.

I haven't figured out how to elegantly handle editing the remote files. My best solution so far is to write the local files, allow the user to edit them locally and then have a "Copy to remote" button that will copy the entire (edited) case to the remote machine.

The Run button will have a local and remote version. The results of the remote run will display in the Status box and the Reporting Functions just as they do now.

Upon completion of a remote run the results will be copied back to the local machine, just as if it ran locally. Thus the Paraview button will work exactly as it does now, by using Paraview locally on locally stored results.

The availability of all the remote buttons will be controlled by the "Allow remote computations" checkbox on the Remote tab. If the checkbox isn't checked, all the remote buttons will be disabled, leaving the user only the local buttons, and CfdOF will behave exactly as it does now.
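The copy steps in 2) and 3) could be as simple as wrapping `scp -r`. This is a minimal sketch, not CfdOF code: the function names, the `meshCase` directory and the paths are placeholders, and `me@david` is simply the user and host that show up later in this thread.

```python
def copy_to_remote_cmd(local_dir, user, host, remote_parent):
    """Build an scp command that copies a whole case directory to the remote machine."""
    return ["scp", "-r", local_dir, f"{user}@{host}:{remote_parent}"]


def copy_back_cmd(user, host, remote_dir, local_parent):
    """Build an scp command that copies a remote case directory back to the workstation."""
    return ["scp", "-r", f"{user}@{host}:{remote_dir}", local_parent]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(copy_to_remote_cmd("/tmp/meshCase", "me", "david", "/home/me/cfd"))
    print(copy_back_cmd("me", "david", "/home/me/cfd/meshCase", "/tmp"))
```

Returning the argument list rather than running it keeps the sketch independent of whatever process class ends up executing the command.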

SSH will require a password every time a new session is invoked. I'll ask the local user for the remote user's password the first time a remote operation (via ssh) is needed, but not store it. The ssh session should stay available indefinitely unless something happens to one of the computers, i.e. it gets powered down or goes to sleep (timeout). I'll check that the ssh session is still alive every time before using it. If the ssh session has ended, I'll invoke a new one, in which case I'll have to ask the user for the password again.
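One possible shape for the "test the connection" and "is it still alive" checks is to run a trivial remote command and look at the exit status. The sketch below is an assumption about how that might be done, not the implementation: `BatchMode` and `ConnectTimeout` are standard OpenSSH client options, but `BatchMode=yes` disables password prompts, so this check only succeeds when login works without one (keys, an agent, or an already open multiplexed connection). The function names are hypothetical.

```python
import subprocess


def ssh_check_cmd(user, host, timeout_s=5):
    """Build a non-interactive ssh command that just runs `true` on the remote host."""
    return ["ssh", "-o", "BatchMode=yes", "-o", f"ConnectTimeout={timeout_s}",
            f"{user}@{host}", "true"]


def remote_is_reachable(user, host, timeout_s=5, runner=subprocess.run):
    """Return True if the remote command exits with status 0, False otherwise.

    `runner` is injectable so the logic can be exercised without a real host.
    """
    try:
        result = runner(ssh_check_cmd(user, host, timeout_s),
                        capture_output=True, timeout=timeout_s + 5)
    except (OSError, subprocess.TimeoutExpired):
        # ssh is missing, or the attempt hung past the timeout.
        return False
    return result.returncode == 0
```

Calling `remote_is_reachable("me", "david")` before each remote operation would match the "check every time before using it" idea above.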

I also want to (eventually) handle the case of the workstation going to sleep, or being shut down or the user quitting FreeCAD while a remote task is running. I think I can do this by opening a new ssh connection when CfdOF is restarted and reattaching to the existing task. I'm not sure how I'll handle the data that will be missing while the connection is lost. I may have to buffer it on the remote side and redisplay it on the local side when the connection is resumed.
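A common way to get this kind of buffering is to start the remote task detached with its output redirected to a log file, and to replay that log with `tail` after reconnecting. The sketch below only builds the remote shell lines. It assumes the case is started by a script called `./Allrun` and that a log file name like `remote_run.log` is acceptable; both are placeholders, as are the function names.

```python
import shlex


def start_detached_cmd(case_dir, script="./Allrun", log_name="remote_run.log"):
    """Remote shell line: start the task in the background, buffer all output
    in a log file, and print the PID of the background job."""
    return (f"cd {shlex.quote(case_dir)} && "
            f"nohup {shlex.quote(script)} > {shlex.quote(log_name)} 2>&1 < /dev/null & "
            "echo $!")


def reattach_cmd(case_dir, log_name="remote_run.log", from_line=1):
    """Remote shell line: replay the log from `from_line` and keep following it."""
    return (f"cd {shlex.quote(case_dir)} && "
            f"tail -n +{int(from_line)} -f {shlex.quote(log_name)}")
```

On reconnect, passing the number of lines already displayed plus one as `from_line` would redisplay only the output that was missed while the connection was down.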

It seems like a lot of work to allow the workstation FreeCAD session to disconnect and resume, but it allows the user the flexibility to reboot the workstation, shut it off overnight, etc., while a task is running on the remote machine.

I am not calling the local machine a host and the remote machine a server because this isn't a true client/server relationship in the traditional sense. Right now I'm using the terminology of local and remote. I'm open to suggestions on the terminology aspect of it.

Theoretically the remote machine will allow multiple FreeCAD "clients" to connect to it, as long as each FreeCAD connection uses a different directory to store the results in. By allocating each client a subset of the total number of cores, each may have a decent experience, ie realtime but slower compared to having exclusive access to the remote machine. For now the allocation of cores will be done manually by the local user, not automatically.

The highest incarnation of using a remote computer for CfdOF processing would be the use of an AWS server or similar. I'm not saying that the work I would do would achieve this. But I do think it would be a step in the right direction. Right now it is quite cumbersome to use an AWS server for OpenFOAM work, for the same reasons I experienced and mentioned. If the remote functionality of the proposed enhancement were done right it could really streamline working with an AWS server.

Not everyone can afford a dedicated EPYC server beside their desk to run their OpenFOAM tasks. But just about everyone can afford to buy some time on an AWS server or make use of a virtual instance granted to them on a server at their local university. Adding remote processing capabilities to CfdOF would really streamline this workflow.

One of the advantages of using an ssh session to do the work on the remote computer is that just about every OS available already has ssh on it. While we could write a special server to run on the remote computer to interface with CfdOF, it would require special setup on the remote machine and I don't see that it would add a lot of value. If one uses ssh sessions, all one needs to do is ensure that ssh is set up on the remote computer.

I'm hoping to add my code to the existing worker thread routines, adding cases for the remote functionality. The alternative would be to duplicate the existing worker thread routines and create local and remote versions of each. I think there will be a lot of code commonality and having duplicates of each would increase maintenance effort in the future. However, I understand that there is a case for leaving the present routines untouched and developing stand alone remote routines. I await your input on this.
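To illustrate the "add a remote case to the existing routine" option, here is a rough sketch of a single command builder that serves both paths. It is not taken from CfdOF: the function name and arguments are hypothetical, and the `-t` flag is used only because a later comment in this thread reports that `ssh -t` behaved correctly.

```python
import shlex


def build_run_command(local_cmd, remote_host=None, remote_user=None):
    """Return the argv to execute.

    With no host, the command runs locally and is returned unchanged.
    With a host, the same command is quoted into one shell line and
    wrapped in an ssh invocation.
    """
    if remote_host is None:
        return list(local_cmd)
    target = f"{remote_user}@{remote_host}" if remote_user else remote_host
    remote_line = " ".join(shlex.quote(part) for part in local_cmd)
    return ["ssh", "-t", target, remote_line]
```

With this shape the worker code that starts the process and reads its output stays the same for both cases; only the argv differs.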

At this point I'm doing this just for me, to test the concept. These changes may never be published if I find it too cumbersome in use. I'm asking for input now so that, if remote processing does make it to the public, what I'm doing has a good chance of being accepted without a lot of changes. I'm also looking for better ideas on how it might be done.

Thanks for your attention. Thoughts ? Ideas ? Advice ? Feedback ?

linuxguy123 commented 1 year ago

Right now snappyHexMesh is only running with one core.

Here is how that can be changed. https://www.cfd-online.com/Forums/openfoam-meshing/168852-parallelize-snappyhexmesh-optimally.html
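For reference, the usual OpenFOAM pattern for running snappyHexMesh in parallel is decompose, run under MPI, then reconstruct the mesh. The sketch below is not taken from the linked thread and is not how CfdOF generates its scripts. It assumes the background mesh already exists and that `system/decomposeParDict` requests the same number of subdomains as `n_procs`. The function name is hypothetical.

```python
def parallel_snappy_commands(n_procs):
    """Return the sequence of commands for a serial or parallel snappyHexMesh run."""
    if n_procs < 2:
        return [["snappyHexMesh", "-overwrite"]]
    return [
        # Split the background mesh into n_procs processor directories.
        ["decomposePar"],
        # Run the mesher on all processors at once.
        ["mpirun", "-np", str(n_procs), "snappyHexMesh", "-overwrite", "-parallel"],
        # Merge the per-processor meshes back into constant/polyMesh.
        ["reconstructParMesh", "-constant"],
    ]
```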

linuxguy123 commented 1 year ago

process.finished isn't firing because ~~the batch file on the host isn't calling $end~~ the bash shell isn't terminating. The problem is on the host side, nothing to do with my CfdOF code.

If I ssh into the remote host and kill the ssh session process.finished gets called in CfdOF. It took me 4 hours to get to this point. Part of the problem is that I have to run a complete solve in order to test it.

Solved it. Was invoking ssh with ssh -tt me@david... It works properly with ssh -t me@david... Ssh trickery.

This also fixed the issue with processes on the remote host not stopping when the process was killed on the workstation. Two birds with one stone.

Upwards and onwards.

linuxguy123 commented 1 year ago

I'm adding "Copy local mesh case to host" button to the mesh control panel.

Meshing with some meshers is much faster to do on a Ryzen 7 or 9 rather than an Epyc processor because the single core speed of a Ryzen 7/9 is faster than an Epyc processor.

This button, when used with copy_back, will allow the user to mesh on one host which is fast at meshing, have the result copied back to the workstation and then copy it to another host to run the solver on. My Epyc 7601 is 3x faster than a Ryzen 7/9 at running the OpenFOAM solver.

The reason I'm adding functionality is because it improves productivity for my use case. I suspect my use case isn't unique. For people purchasing time on an AWS server, they can do all the meshes on their workstation and then upload them to AWS and run the solver there.

It is also useful if the remote host doesn't have a mesher installed.

[Image: New Mesh Panel]

linuxguy123 commented 1 year ago

Just did another push to my repo. Fixed a few things and added "Copy local mesh to host" and "Delete mesh case."

"Copy local mesh to host" does exactly that - copies a mesh to a host. So the user can create a mesh locally or remotely and have it copied back to the workstation and then copy it to another host for use in a solving run.

"Delete mesh case" removes the mesh case from whatever host is currently selected. Handy for cleaning up a remote host after a run. Does not remove it from the local host unless the local host is selected.

At some point I'll have to update the user guide with an explanation of how these work in conjunction with "copy back" and "delete remote results".

I will implement "Add filename to path" soon, maybe tomorrow.

I'm using CfdOF for a project I'm working on and these changes are making a big improvement in my productivity. The code is messy right now, but things work pretty well. I'll clean things up once I get the functionality the way I want. The more I use it the more I tweak. That is why the two new buttons appeared in the mesh panel today.

I can't express how much better it is using CfdOF now. I'm usually running 2 or 3 FreeCAD instances at once. I work on refining the model design in one instance while meshing and solving on remote hosts in the other two. It really speeds up the iterations when you can move processes quickly from one host to another and keep the workstation CPU free for other tasks.

Some people go all out when specifying OpenFOAM hardware, so they can have fast cycle times. I can't ~~afford~~ justify a dual EPYC 9004 setup so I'm improvising with 2 mediocre machines. Older servers don't make good workstations because their single core performance is much poorer than a modern desktop CPU. But desktop CPUs aren't good at OpenFOAM workloads where memory bandwidth is king. The solution is to run OpenFOAM processes remotely, as CfdOF now allows.

It is also cheaper to build 2 mediocre machines than it is to build a really high performance machine. CFD work is usually a process of tweaking and comparing results. 2 remote hosts are more than enough to keep me busy, with smaller and 2D models anyway.

[Image: New Mesh Panel]

linuxguy123 commented 1 year ago

Add filename to output path is now done. I pushed the code to my repo.
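The thread doesn't spell out exactly how the filename is added, so the following is only a guess at the idea: append the FreeCAD document's base name to the configured output directory so that each document writes to its own folder. The function name and example paths are hypothetical.

```python
from pathlib import Path


def case_output_dir(base_output_dir, document_path, add_filename=True):
    """Return the directory to write cases to.

    With add_filename enabled, the document name (without extension) is
    appended, so /tmp/cfd + wing.FCStd gives /tmp/cfd/wing.
    """
    base = Path(base_output_dir)
    if not add_filename:
        return base
    return base / Path(document_path).stem
```

Deriving the directory from the current document path each time it is needed, rather than caching it, would also avoid writing to a stale directory after the document is saved under a new name.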

I think this concludes the functional changes I'm going to make (for now anyway.) Changes for the next while will be fixes, clean up and polishing.

Here is a general TODO list for the project:

- ~~cfmesh doesn't install on remote hosts~~
- some global variable use instead of mesh object parameters
- buttons don't enable/disable reliably
- these vars should be default object vars and editable -> copy back, delete remote results, add filename to output
- number of threads isn't needed for OpenFOAM
- 1p/n threads hangs cfMesh
- all the run command methods should be grouped together
- better programmer doc on what the run commands expect for input (explain ssh magic, etc.)
- better (actual) checking for status of mesh cases, mesh results, foam case, foam results, locally and remotely
- some of my contributions won't work on Windows workstations
- enable Docker use on the remote hosts
- "path" is used in some places, "dir" used in others
- Write a more in-depth user guide
- Add tooltips to things
- notification bell doesn't make sound on my computer
- Lots of TODOs in the code

Overall I am very happy with the way it operates. The code has barnacles at the moment but they are easily cleaned up now that the functionality is in place. I strove to make as few changes to the existing code as possible. Everything should mostly work just as it did before if Enable Remote Processing is turned off.

I await feedback.

oliveroxtoby commented 1 year ago

> This also fixed the issue with processes on the remote host not stopping when the process was killed on the workstation. Two birds with one stone.

Would you mind closing the other issue?

oliveroxtoby commented 1 year ago

> Right now snappyHexMesh is only running with one core.
>
> Here is how that can be changed. https://www.cfd-online.com/Forums/openfoam-meshing/168852-parallelize-snappyhexmesh-optimally.html

Are you just referring to remote mode here? Because it can be run in multiple processes in CfdOF by setting the number of processes in the mesh data.

oliveroxtoby commented 1 year ago

> I fixed the conflict and the push worked. My repo now has all my changes and Oliver's too. https://github.com/linuxguy123/CfdOF
>
> I need to check that this was Oliver's intended change in TaskPanelCfdMesh.py:
>
> ```python
> FreeCADGui.doCommand("if proxy.running_from_macro:\n" +
>                      "  mesh_process = CfdConsoleProcess.CfdConsoleProcess()\n" +
>                      "  mesh_process.start(cmd, env_vars=env_vars)\n" +
>                      "  mesh_process.waitForFinished()\n" +
>                      "  proxy.check_mesh_process = CfdConsoleProcess()\n" +
>                      "  proxy.check_mesh_process.start(cmd, env_vars=env_vars)\n" +
>                      "  proxy.check_mesh_process.waitForFinished()\n" +
>                      "else:\n" +
>                      "  proxy.check_mesh_process.start(cmd, env_vars=env_vars)")
> ```

I think something might have gone wrong in the merge here. I don't have these lines in the latest version (see here and here).

oliveroxtoby commented 1 year ago

> I await feedback.

Were you planning to create a pull request? I haven't seen one yet.

linuxguy123 commented 1 year ago

> Are you just referring to remote mode here? Because it can be run in multiple processes in CfdOF by setting the number of processes in the mesh data.

It can run multiple processes for me too, locally and remotely. The problem is/was the combination of process and thread values that caused the issue.

linuxguy123 commented 1 year ago

> Would you mind closing the other issue?

Done.

linuxguy123 commented 1 year ago

> I think something might have gone wrong in the merge here. I don't have these lines in the latest version (see here and here).

I'll rectify this.

linuxguy123 commented 1 year ago

> Were you planning to create a pull request? I haven't seen one yet.

I think people should run/test from my repo for a bit until we do this. I'll keep pulling from Master on the main repo and doing the merges. That way we should be able to eventually merge them again.

luzpaz commented 1 year ago

@linuxguy123 in general you should clone upstream, create a branch, make changes to the branch and then submit said branch as a PR. You made your changes to the master branch of your cloned repo.

linuxguy123 commented 1 year ago

> @linuxguy123 in general you should clone upstream, create a branch, make changes to the branch and then submit said branch as a PR. You made your changes to the master branch of your cloned repo.

My bad. But Master had the latest development in it.

I have no intention of forking CfdOF. I just wanted my own place to play. I intend to merge once people have used it and are happy. I can branch and merge if people really want that.

[Image: Active Branches CfdOF All branches]

luzpaz commented 1 year ago

Cloning is a form of forking. It's just more hygienic because it's easier to rebase upstream master to your local master. Then you can resolve conflicts against local master from the branch you're working on.

linuxguy123 commented 1 year ago

I need push permission in order to create a branch. If granted, I'll create a branch for my work and move it over.

adrianinsaval commented 1 year ago

The branch is meant to be on your repo. You then make a PR and it can be merged here. Doing your work on your master branch is not bad per se; it's just good practice to make a separate branch (based on master) with a descriptive name.

> enable Docker use on the remote hosts

Isn't this a matter of the host configuring a specific port for Docker (so you would ssh into the Docker container)? I don't think this needs changes in CfdOF.

> I think people should run/test from my repo for a bit until we do this. I'll keep pulling from Master on the main repo and doing the merges. That way we should be able to eventually merge them again.

This does not stop you from creating a pull request. You can keep updating your branch and the updates will be included in the pull request. The pull request makes it easier for others to review and comment on the code, so I suggest making a PR anyway. Testing can continue and the merge can be held back until you and Oliver are happy with it.

mmcker commented 1 year ago

> > enable Docker use on the remote hosts
>
> isn't this a matter of the host configuring a specific port for docker? (so you would ssh into the docker container) I don't think this needs changes on cfdof.

A very good point! The entire docker implementation could be deleted and replaced by people adding the docker container as a remote host. Previously I recommended removing docker from the remote host code as it would not add value (maybe my email was lost as I can't see it above).

linuxguy123 commented 1 year ago

> A very good point! The entire docker implementation could be deleted and replaced by people adding the docker container as a remote host. Previously I recommended removing docker from the remote host code as it would not add value (maybe my email was lost as I can't see it above).

This would need to be tested ! Of course you know that.

linuxguy123 commented 1 year ago

> the branch is meant to be on your repo, then you make a PR

Oh ! I thought you wanted me to make a branch in the existing repo. OK, makes sense.

linuxguy123 commented 1 year ago

> I fixed the conflict and the push worked. My repo now has all my changes and Oliver's too.
>
> I think something might have gone wrong in the merge here.

You are correct. I'm fixing it now.

There was an error in one of your lines. Missing a "+".

```python
FreeCADGui.doCommand("cart_mesh = "
                     "    CfdMeshTools.CfdMeshTools(FreeCAD.ActiveDocument." + self.mesh_obj.Name + ")")
```

linuxguy123 commented 1 year ago

PR: https://github.com/jaheyns/CfdOF/pull/76/files

oliveroxtoby commented 1 year ago

> Update: I reinstalled cfmesh and now it works. I spent a couple hours troubleshooting this !

This happens when the stock version of cfmesh is used instead of the modified CfdOF one. The dependency checker should tell you straight away.

linuxguy123 commented 1 year ago

I changed the name of my repo. It can now be loaded into FreeCAD using the github link in Addon Manager -> Settings

[Image: CfdOF in Addon Manager]

~~I think it can co exist with the plain CfdOF, but am not 100% sure. If add filename to path is selected the RP version will save files in a different location than the plain version.~~

The two versions of CfdOF cannot coexist on one installation. I could make them coexist, but it isn't worth it.

linuxguy123 commented 1 year ago

How is one supposed to run TestCfdOF.py ?
Is this file up to date ?
I ran it in the CfdOF. Is that correct ?

```
$ ./TestCfdOF.py
./TestCfdOF.py: line 29: from: command not found
./TestCfdOF.py: line 30: from: command not found
./TestCfdOF.py: line 31: from: command not found
./TestCfdOF.py: line 32: from: command not found
./TestCfdOF.py: line 33: from: command not found
./TestCfdOF.py: line 34: from: command not found
./TestCfdOF.py: line 35: from: command not found
./TestCfdOF.py: line 36: from: command not found
./TestCfdOF.py: line 37: from: command not found
./TestCfdOF.py: line 38: from: command not found
./TestCfdOF.py: line 39: from: command not found
./TestCfdOF.py: line 57: syntax error near unexpected token `('
./TestCfdOF.py: line 57: `home_path = CfdTools.getModulePath()'

$ python --version
Python 3.11.2
```

linuxguy123 commented 1 year ago

What is the rule for setting the Base Element Size in OpenFOAM ?

linuxguy123 commented 1 year ago

Another question... will CfdOF work with a cut made with Part Design workbench or does it have to be made with the Part workbench ?

One can make a "cut" of a wing in a wind tunnel 2 ways:

1) In Part Design, one can Pad the wind tunnel sketch and then Pocket the wing sketch to make the part to simulate.

2) In Part, one can Extrude the wind tunnel and wing sketches and then Cut to produce the part.

Will they both work equally well or is one preferred over the other ?

oliveroxtoby commented 1 year ago

> How is one supposed to run TestCfdOF.py ? Is this file up to date ? I ran it in the CfdOF. Is that correct ?

Please see https://github.com/jaheyns/CfdOF/blob/master/CONTRIBUTING.md#testing

linuxguy123 commented 1 year ago

Update

I found a bug. If one does a Save As to save the FreeCAD file to a new name, the UI doesn't update the filename and the mesh and solver results get saved to the wrong (old) directory. I'll fix this in the near future.

I've spent the last few days running cases with CfdOF-RP. The workflow from editing objects to meshing to solving to viewing results and running things on multiple computers is fantastic. Very fluid, very little wasted effort.

Meshing in particular is excellent. So easy to set up refinement volumes, run the mesh and then view the results in Paraview. I used to hate meshing. Always figuring out indices and such to put in FOAM files. Figuring out inlet, outlet, wall areas. So much nicer to click on a surface or object.

Some of my meshes are large and are taking 15 minutes to run. With a project like this it is really nice to have the remote computers.

I'm biased, but I'm thrilled with how this turned out.

linuxguy123 commented 1 year ago

Enhancement ideas:

In CFD reporting function:

1) fill the Free-stream flow speed object from the Inlet object as a default

2) Calculate reference pressure as a default. (1/2 rho x V^2)

3) In the Solver Panel:

Allow Paraview to view results from a partially completed solution

4) In the Mesh and Solver Panel:

Add notification via sounds for successful completion and error for meshing and solving.
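The default in idea 2) is the dynamic pressure, 1/2 rho V^2. A tiny sketch of that calculation follows. The function name is hypothetical and SI units are assumed (density in kg/m^3, speed in m/s, result in Pa).

```python
def dynamic_pressure(rho, speed):
    """Return 0.5 * rho * speed**2, the dynamic pressure of the free stream."""
    return 0.5 * rho * speed ** 2


if __name__ == "__main__":
    # Example values only: air at roughly sea-level density, 10 m/s free stream.
    print(dynamic_pressure(1.2, 10.0))
```

Combined with idea 1), the `speed` argument could be prefilled from the inlet boundary condition, with the user still free to override it.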

linuxguy123 commented 1 year ago

I am thinking of adding Save Mesh and Open Mesh buttons to the mesher control panel.

I want these functions so that I can mesh a geometry and then save the mesh for use in multiple projects without having to remesh it.

Thoughts ?

I'm not sure how it would interact with the mesh object. Things are getting a bit complicated with remote hosts and importing meshes.

oliveroxtoby commented 1 year ago

Closing this issue as a pull request has been created.

jaheyns / CfdOF

Proposed modifications to CfdOF to utilize a remote computer for meshing and simulation tasks. #70