USEPA / EPA_MOVES_Model

Estimating emissions for mobile sources

Master and Workers #45

Closed: JiaoyanHuang closed this issue 2 years ago

JiaoyanHuang commented 2 years ago

Hey,

I am reading the sections on master and worker runs in this document: https://github.com/USEPA/EPA_MOVES_Model/blob/master/docs/CommandLineMOVES.md

start cmd /k ant 3workers -Dnoshutdown=1

I am just wondering: this will add 3 worker CPUs on the same computer that runs the master, right?

I remember that in MOVES2014 we could add worker CPUs from another computer. Is that still allowed in MOVES3, and how do I set up setup.bat to call different computers?

thanks

danielbizercox commented 2 years ago

Hi Jiaoyan,

Yes, that command will start 3 worker processes running on the local computer.

Networking multiple computers to perform a MOVES run works the same in MOVES3 as MOVES2014. You'll just need to make sure that MOVES3 is installed on all computers, then set the sharedDistributedFolderPath directive to be a shared drive location in the following configuration files:

On one computer, launch the run as normal (either via the GUI or the command line). Launch workers on all the other computers with the command you referenced in your message. Note that you'll need to manually terminate the worker processes once the MOVES run completes on the master side.

JiaoyanHuang commented 2 years ago

Daniel,

Thanks for the quick response. I still have a few questions:

  1. Should I install the same minor MOVES3 version (e.g., 3.0.x) on all of those computers?
  2. If I call CPUs over the network:
    • For sharedDistributedFolderPath, does this mean the folder that holds all jobs, whether in progress or done? It should be something like \Mastercomputerlocation\Users\Public\EPA\MOVES\MOVES3.0\sharedwork
    • Could you please be more specific about "Launch workers on all the other computers with the command you referenced in your message"? How should I change this command? Should I give the IP address of each worker computer?

start cmd /k ant 3workers -Dnoshutdown=1

Thank you!

Joey

danielbizercox commented 2 years ago

Yes, you will want the same minor version installed.

The sharedDistributedFolderPath does not need to be in the MOVES directory, but it could be if MOVES was installed on a shared network drive. It could also be something as simple as \\sharedlocation\MOVESSharedWork.

MOVES can take advantage of distributed computing resources simply through this shared directory. Essentially, this is what happens during a MOVES run (please excuse the outdated terminology; I'm just using these terms to be consistent with our other documentation):

  1. When you start a run, MOVES master reads the runspec and generates bundles of work. These bundles are named with the randomly determined master ID (this changes for every run), along with the bundle number and the label "TODO". They are saved in the sharedDistributedFolderPath.
  2. Any MOVES worker that is monitoring the sharedDistributedFolderPath will pick up "TODO" bundles. While it is working on the bundle, it will rename it to "InProgress" so that other workers don't accidentally pick up the same bundle. When it is done, the worker will rename it to "DONE" and then look for more bundles.
  3. The MOVES master will look for "DONE" bundles with its master ID, and then delete the bundle when it is done absorbing the data from it.
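The three steps above amount to a claim-by-rename protocol over a shared folder. Here is a minimal Python sketch of that idea, not MOVES's actual implementation: the file naming scheme (`<masterID>_<n>_TODO`), the function names, and the master IDs are all hypothetical, and it assumes a rename in the shared folder either fully succeeds or fails, which may not hold on every network share.

```python
import os
import tempfile
from pathlib import Path
from typing import List, Optional


def _with_label(bundle: Path, old: str, new: str) -> Path:
    """Return the bundle path with its trailing label swapped (old -> new)."""
    return bundle.with_name(bundle.name[: -len(old)] + new)


def claim_next_bundle(shared: Path) -> Optional[Path]:
    """Worker side: claim one TODO bundle by renaming it to InProgress."""
    for todo in sorted(shared.glob("*_TODO")):
        claimed = _with_label(todo, "TODO", "InProgress")
        try:
            os.rename(todo, claimed)
        except OSError:
            # Another worker got there first (or the file vanished); try the next one.
            continue
        return claimed
    return None


def finish_bundle(claimed: Path) -> Path:
    """Worker side: mark a claimed bundle as DONE so the master can absorb it."""
    done = _with_label(claimed, "InProgress", "DONE")
    os.rename(claimed, done)
    return done


def collect_done(shared: Path, master_id: str) -> List[str]:
    """Master side: absorb and delete only the DONE bundles carrying this master's ID."""
    collected = []
    for done in sorted(shared.glob(f"{master_id}_*_DONE")):
        done.read_text()  # stand-in for absorbing the bundle's results
        collected.append(done.name)
        done.unlink()
    return collected


def demo() -> List[str]:
    """Run one master's bundles through a single worker in a temporary folder."""
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp)
        for n in range(3):
            (shared / f"M1234_{n}_TODO").write_text(f"work {n}")
        # A bundle from a different master sharing the same folder.
        (shared / "M9999_0_TODO").write_text("someone else's work")
        while True:
            claimed = claim_next_bundle(shared)
            if claimed is None:
                break
            # A real worker would do the calculations here before finishing.
            finish_bundle(claimed)
        return collect_done(shared, "M1234")


if __name__ == "__main__":
    print(demo())
```

Because the worker only ever touches files by renaming them, nothing in the sketch needs to know which computer a worker runs on; the shared folder is the only point of contact, which matches the description above.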

So all you need to do to get workers running on different computers is to configure them to use the same sharedDistributedFolderPath as your MOVES master is using, and then start them with start cmd /k ant 3workers -Dnoshutdown=1. You can check to make sure it is working because each worker that you start will create an ID file in the sharedDistributedFolderPath, and the timestamp on this file will update every minute.
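If you would rather not eyeball the timestamps, a small script can do the check. This is a rough sketch under stated assumptions: the thread does not say how the worker ID files are named, so the glob pattern is a placeholder you would need to adapt, and the two-minute threshold is simply a margin chosen around the one-minute update interval mentioned above.

```python
import os
import time
from pathlib import Path
from typing import List, Optional


def recently_updated(
    folder: Path,
    pattern: str = "*",  # placeholder: adapt to however the ID files are named
    max_age_seconds: float = 120.0,
    now: Optional[float] = None,
) -> List[str]:
    """Return names of files in `folder` modified within the last `max_age_seconds`."""
    now = time.time() if now is None else now
    fresh = []
    for path in sorted(folder.glob(pattern)):
        if path.is_file() and now - path.stat().st_mtime <= max_age_seconds:
            fresh.append(path.name)
    return fresh
```

Calling `recently_updated(Path(r"\\sharedlocation\MOVESSharedWork"))` from any computer that can see the share would list the files touched in the last two minutes; a worker whose ID file is missing from that list has probably stopped.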

JiaoyanHuang commented 2 years ago

Perfect! Thank you!