svalinn / condorht_tools

Tools for the launching, control, and consolidation of MCNP jobs on the HTCondor system at UW
BSD 2-Clause "Simplified" License

Code Refactor #31

Open makeclean opened 11 years ago

makeclean commented 11 years ago

When the alpha version is released for use in the group, it would be beneficial to have an experienced Python user refactor and tidy the code from its current state.

makeclean commented 11 years ago

The refactored code can now launch Fluka jobs correctly. Due to the limitations of the CHTC system, the refactoring was based on the following assumptions, which led to these design changes:

  1. Preprocessing (as much as it exists) should be done by the user away from Condor; i.e. if the MCNP job must be split, this must be done away from Condor, for several reasons:
    • We cannot transfer large amounts of cross-section data; it is unwieldy.
    • Due to the advanced tally methodology in place, the output filename from the mesh tally must be unique, otherwise data returned from one Condor job will be overwritten by another. This means the preprocessing stage must also set a unique filename for advanced tallies (see the sketch after this list).
    • This preprocessing must produce unique input and output data for each calculation to allow recombination.
    • This is problematic for very large files, which can take several minutes (to hours) to initialize the runtpe files for running.
  2. Since large files must be transferred via squid/wget, the input data and other ancillary files should be bundled and transferred together. This has implications for the recombination of the data.
  3. Since large data must be sent and received using squid/wget, we cannot flow directly from running all the calculations into producing the collected output data; the production of the averaged output dataset will therefore be done as a separate post-processing step.
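
As an illustration of the unique-filename point above, here is a minimal sketch of such a preprocessing step. It is not the actual tool: the `out=` token and the file naming are assumptions made purely for illustration.

```python
# Hypothetical preprocessing sketch: split one input deck into n_jobs copies,
# each pointing its advanced tally at a uniquely named output file, so that
# results returned from separate Condor jobs cannot overwrite each other.
import os

def split_input(input_deck, n_jobs, out_dir="input", tally_token="out="):
    """Write n_jobs copies of input_deck, rewriting any 'out=<name>' token
    (assumed here to carry the mesh tally output filename) so each copy
    produces a uniquely named result file."""
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    base = os.path.splitext(os.path.basename(input_deck))[0]
    with open(input_deck) as f:
        lines = f.readlines()
    for job in range(n_jobs):
        copy = []
        for line in lines:
            if tally_token in line:
                tail = line.split(tally_token, 1)[1]
                if tail.split():
                    # e.g. 'out=meshtal.h5m' -> 'out=meshtal_0007.h5m'
                    name = tail.split()[0]
                    stem, ext = os.path.splitext(name)
                    line = line.replace(name, "%s_%04d%s" % (stem, job, ext))
            copy.append(line)
        with open(os.path.join(out_dir, "%s_%04d.i" % (base, job)), "w") as f:
            f.writelines(copy)

split_input("fng_str.i", 10)
```
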
makeclean commented 11 years ago

The script expects a certain directory structure within the run directory; it looks for:

  1. input (contains all input decks to run)
  2. geometry (containing the h5m of the geometry to be run)
  3. mesh (containing the h5m of the advanced tally to use)
  4. ancillary (wwinp files, etc.)

Based on what is passed, the script looks in the directories provided to determine what calculation should be performed. For example:

/home/davisa/condorht_tools/chtc_sub/submit_job.py 
--path /home/davisa/fng_str/ --job FLUKA --batch 10 

This tells the script to look in /home/davisa/fng_str/input for the input decks, that it's a Fluka calculation, and that each calculation should be run 10 times. The script tar.gz's everything within /home/davisa/fng_str and copies it to /squid/davisa, where the precompiled tar.gz's of the gcc and Fluka compilers already exist.
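
For concreteness, here is a rough sketch of that submit-side flow (scan the run directory, tar.gz it, stage it on squid). The function names are illustrative; this is not the real submit_job.py.

```python
# Hypothetical sketch of the submit-side flow described above: read the run
# directory layout, decide what the calculation needs, then bundle the run
# directory and stage it on the squid server for wget transfer to the
# execute nodes.
import argparse
import os
import shutil
import tarfile

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--path", required=True, help="run directory")
    parser.add_argument("--job", required=True, choices=["MCNP", "FLUKA"])
    parser.add_argument("--batch", type=int, default=1,
                        help="number of times to run each input deck")
    return parser.parse_args()

def scan_run_dir(path):
    """Return which of the expected subdirectories are present."""
    expected = ["input", "geometry", "mesh", "ancillary"]
    return dict((d, os.path.isdir(os.path.join(path, d))) for d in expected)

def bundle_to_squid(path, squid_dir="/squid/davisa"):
    """tar.gz the whole run directory and copy it next to the precompiled
    compiler/code tarballs on squid."""
    name = os.path.basename(path.rstrip("/"))
    tarball = name + ".tar.gz"
    with tarfile.open(tarball, "w:gz") as tar:
        tar.add(path, arcname=name)
    shutil.copy(tarball, os.path.join(squid_dir, tarball))
    return tarball

if __name__ == "__main__":
    args = parse_args()
    present = scan_run_dir(args.path)
    if not present["input"]:
        raise SystemExit("no input/ directory found in " + args.path)
    bundle_to_squid(args.path)
```
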

The script then builds the DAG to control the tasks using the dag_manager. (Since we can no longer tag on the post-processing as a child of this run, the only benefit of using dag_manager is the resubmission of failed runs.)
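
A minimal sketch of what that DAG generation step could look like, assuming one node per (input deck, batch index) and a shared HTCondor submit file; the RETRY lines are what give the resubmission of failed runs:

```python
# Sketch only: emit a DAGMan file with one node per (deck, batch index).
# The submit-file name, VARS macros and retry count are illustrative.
def write_dag(input_decks, batch, dag_file="run.dag",
              submit_file="run_job.sub", retries=3):
    with open(dag_file, "w") as dag:
        for deck in input_decks:
            for i in range(batch):
                node = "%s_%03d" % (deck, i)
                dag.write("JOB %s %s\n" % (node, submit_file))
                dag.write('VARS %s deck="%s" index="%d"\n' % (node, deck, i))
                dag.write("RETRY %s %d\n" % (node, retries))

write_dag(["fng_str"], batch=10)
```
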

gonuke commented 11 years ago

Does any of this change if we have our own submit machine over which we have full control and disk access? I think that's what all the productive HTCondor users do.

makeclean commented 11 years ago

I don't know for sure, but I don't think so; it's the I/O of getting all the data to and from the compute nodes that is the issue, which is why we have to put things in squid and then wget them. If we had our own dedicated submit node, it would make things easier in the sense that a lot of the processing could be done there.

However, at some point we have to deal with the issue of these large files. Take, for example, one of Tim's ITER FW models: he has 2 or 3 advanced tallies in there as well, the model itself takes 10 minutes to read in cross sections, and then it's another 40 minutes to build the kd-tree. This preprocessing is done in serial on another machine, which means almost an hour just to build the runtpe file for one calculation that we may consider splitting into 1000 sub-calculations. We can of course parallelise this.

If instead we brought the xs with the calculation, in effect abandoning the idea of a continue run and storing the xs on squid, then this would mean several hours of transferring xs data before the run begins, which is not much use either, since we would have several hours of dead time before any useful work is done.

An alternative is to pull out the xs data that is needed for the calculation and build a custom xsdir and ace file for each calculation.
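
A rough sketch of the xsdir side of that idea, assuming the usual layout where the header runs up to the `directory` keyword and each entry line starts with a ZAID; extracting the corresponding ace tables would be a further step:

```python
# Sketch only: copy the xsdir header through unchanged, then keep only the
# directory entries for the ZAIDs a given input deck actually uses.
def filter_xsdir(xsdir_in, xsdir_out, needed_zaids):
    needed = set(needed_zaids)
    with open(xsdir_in) as src, open(xsdir_out, "w") as dst:
        in_directory = False
        for line in src:
            if not in_directory:
                dst.write(line)
                if line.strip().lower() == "directory":
                    in_directory = True
            else:
                zaid = line.split()[0] if line.split() else ""
                if zaid in needed:
                    dst.write(line)

filter_xsdir("xsdir", "xsdir_fng", ["1001.70c", "8016.70c", "26056.70c"])
```
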

Another issue we have is one of storage: even getting a big ITER calculation onto Condor will take several tens or even hundreds of GB. From our perspective that isn't really a problem, but for the Condor folks, who winced when I asked for 30 GB, it is.

gonuke commented 11 years ago

I think having our own submit node will solve at least 2 problems:

  1. We'll have a little bit more control over our environment for building our tools (although it will still have to be compatible with the execute machines).
  2. We can put a big hard drive there as a launching/landing pad for the data as it comes and goes.

I think, if we're clever, the initial costs of reading data and building the MOAB search trees (do we need both an OBB tree for DAGMC and a KD-tree for mesh tallies?) are worth it if we can reuse the runtpe for each of the separate jobs.

We should perhaps try to do a 2- to 4-way replication by hand and see what the moving parts actually look like.

makeclean commented 11 years ago

The reuse of the runtpes is the key, and unfortunately I don't currently see how we can reuse them. In normal MCNP use we can; however, for advanced tallies we have to ensure the output mesh name is unique, and it cannot be reset after the runtpe has been written, hence the need for multiple runtpes. Unless we shift the meshtal setup routine?

gonuke commented 11 years ago

What about different subdirectories?

makeclean commented 11 years ago

Yeah, that would probably work; it just seemed a bit messy, but that's preferable to slow.
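
Something like the following would be one way to do it, assuming the shared runtpe can simply be copied into each job's directory (all names illustrative):

```python
# Sketch of the per-subdirectory workaround: each job runs in its own
# directory with its own copy of the shared runtpe, so the mesh tally
# output name can stay the same in every job and results are kept apart
# by path rather than by filename.
import os
import shutil

def make_job_dirs(runtpe, n_jobs, root="runs"):
    for i in range(n_jobs):
        job_dir = os.path.join(root, "job_%04d" % i)
        if not os.path.isdir(job_dir):
            os.makedirs(job_dir)
        # copy rather than link: MCNP updates the runtpe during the run
        shutil.copy(runtpe, os.path.join(job_dir, os.path.basename(runtpe)))

make_job_dirs("runtpe", 10)
```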