jdherman / ptreeopt

Policy tree optimization: Heuristic policy search for control of dynamic systems. Uses genetic programming to develop binary trees relating observed indicator variables to actions, either real-valued or discrete.
http://www.sciencedirect.com/science/article/pii/S1364815217306540
MIT License

Ideas for improvement - Winter 2019 #4

Closed jdherman closed 5 years ago

jdherman commented 5 years ago

Some thoughts on improving algorithm performance.

  1. Currently there is some evidence of the search stalling. My usual approach is to run more random seeds, use a larger population size, use a larger number of parent solutions (lower selection pressure), or lower the crossover probability (because with probability 1 - cx_prob a solution is replaced with a randomly generated tree).

Idea from @quaquel: bring some of the tricks from e-NSGAII to policy trees to counteract stalled search, namely auto-adaptive population sizing and restarts (a rough sketch of a restart trigger is below).

  2. The current parallel implementation only supports MPI, not multiprocessing. One option is to switch to the concurrent.futures API for executors, which would make it easier to switch between mpi4py on a cluster and multiprocessing on a single machine. (again from @quaquel )

No immediate timeline for doing these things, just keeping track.
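
For reference, a minimal sketch of what a stall-triggered restart could look like. This is purely illustrative (none of these names exist in ptreeopt) and assumes a minimization problem and a random_tree() callable like the one used for initialization:

    # Purely illustrative; not part of ptreeopt. Assumes minimization and a
    # random_tree() callable like the one used to initialize the population.
    import numpy as np

    def maybe_restart(population, objectives, best_history, random_tree,
                      stall_gens=20, restart_fraction=0.5):
        # Not stalled if we have too little history, or if the recent best
        # improved on the earlier best.
        if len(best_history) < stall_gens:
            return population
        recent_best = min(best_history[-stall_gens:])
        earlier_best = min(best_history[:-stall_gens], default=np.inf)
        if recent_best < earlier_best:
            return population

        # Replace the worst fraction of the population with random trees.
        n_replace = int(restart_fraction * len(population))
        for i in np.argsort(objectives)[len(population) - n_replace:]:
            population[i] = random_tree()
        return population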

quaquel commented 5 years ago

So with respect to point 2, what API do we want? Something like the following makes the most sense, I guess:


with MPIExecutor(algorithm) as executor:
    results = executor.run(nfe, logfrequency)

# or
with ProcessPoolExecutor(algorithm) as executor:
    results = executor.run(nfe, logfrequency)

Looking at concurrent.futures, we could make wrappers around the ProcessPoolExecutor and around the MPIPoolExecutor from mpi4py.

Also, which Python version do you want to support? concurrent.futures changed between 3.6 and 3.7. The main change is that from 3.7 onwards the ProcessPoolExecutor accepts an initializer function. If you want to support 3.6, we could use the Pool from multiprocessing instead while keeping the initializer functionality. A counterargument to supporting the initializer function is that the MPIPoolExecutor currently does not have it.
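
As a rough illustration of the kind of wrapper I have in mind (names and signatures here are made up, not what will end up in the fork), something like this would let the two pools share one interface:

    # Illustrative only; names and signatures are not the actual implementation.
    from concurrent.futures import ProcessPoolExecutor

    class BaseExecutor:
        # context manager so the pool is always shut down cleanly
        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc_value, traceback):
            self._pool.shutdown()
            return False

        def map(self, function, jobs):
            return list(self._pool.map(function, jobs))

    class MultiprocessingExecutor(BaseExecutor):
        def __init__(self, processes=None):
            self._pool = ProcessPoolExecutor(max_workers=processes)

    class MPIExecutor(BaseExecutor):
        def __init__(self):
            from mpi4py.futures import MPIPoolExecutor
            self._pool = MPIPoolExecutor()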

quaquel commented 5 years ago

I have committed a minimal working example with a ProcessPoolExecutor; see my fork. I will work on it some more over the coming days.

jdherman commented 5 years ago

Wow, thanks @quaquel. Much of this is over my head - I've never used futures or shared memory parallelization in general. Good opportunity to learn stuff!

Your updates look nice and are clearly separated from the algorithm code, which was my only concern. As long as the serial and MPI implementations don't become more complicated, I'm all for it.

For the 3.6/3.7 debate, do you mean that running with MPI will not be possible with 3.7? If so, I'd say we should hold off for now. But I may be misunderstanding the issue.

Thank you again, keep on hacking and let me know how it goes.

quaquel commented 5 years ago

I have added a working MPI version as well. I do, however, have a few questions.

Let me know what you think about these issues.

jdherman commented 5 years ago

Cool! Let's see ...

  1. I have been running from the command line as mpirun -n 32 python main.py. For HPC jobs it needs to be runnable from the command line, but I have no problem if it's just python main.py and MPI is somehow invoked in the script.

  2. Yes, the side effects are a new thing I added that I'm not sure about. I wanted to keep track of the number of occurrences of each action (so in the output, every action node shows the % of the time that it is triggered), which helps with pruning unused actions. I did make sure that it works in parallel and returns the modified object to the main process. (A rough sketch of the idea is below, after this list.)

  3. Sure, main.py (with the Folsom model example) probably belongs in the example folder. That would be cleaner.

  4. I don't know what the different doc standards are. Numpy sounds fine :)

  5. Sequential version - sounds good!

  6. Yes, this might make more sense. One issue is that I never set up unit tests, so we'll have to make sure nothing breaks when the object structure changes.

Thank you again!! Post-AGU I should be able to put more time in to help with the modifications.
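
Regarding point 2, here is a rough sketch of the action-occurrence counting I mean; the tree.evaluate call and its return value are assumptions for illustration, not the exact ptreeopt internals:

    # Hypothetical illustration of per-action occurrence counting;
    # tree.evaluate() and its return value are assumed for this sketch.
    from collections import defaultdict

    def action_frequencies(tree, observations):
        counts = defaultdict(int)
        for obs in observations:
            action = tree.evaluate(obs)  # assumed to return the triggered action
            counts[action] += 1

        total = sum(counts.values())
        if not total:
            return {}
        # % of timesteps each action fires, useful for pruning unused actions
        return {a: 100.0 * n / total for a, n in counts.items()}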

quaquel commented 5 years ago

  1. Excellent, my evaluator code works in the same way, so nothing changes.
  2. Topic for a later discussion, my guess is that it will be almost unavoidable and thus should be properly documented
  3. Ok, I will reorganize
  4. see https://stackoverflow.com/questions/3898572/what-is-the-standard-python-docstring-format if you want to learn more. SALib also uses numpy from the looks of it.
  5. Will do, is 5 minutes of work
  6. We should add a separate issue on getting Travis set up.

No worries, I have some time and this is relatively easy stuff.

quaquel commented 5 years ago

Ok, I have done the following:

One minor change in my own code is that I slightly changed the API:


    with MPIExecutor() as executor:
        snapshots = algorithm.run(1000, 10, executor)

So a couple of things are going on here. First, I want to have executors as context managers, which means that a pool of processes is cleaned up nicely after code execution. Second, I want to pass the executor to the run method of the algorithm rather than have a run method on the executor. The executor class is now essentially a small class with a map method, and I can now use the attributes on the algorithm directly in a cleaner way.
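
To make the division of labor concrete, roughly this pattern (a simplified sketch with placeholder methods, not the actual implementation):

    # Simplified sketch; self.f (the objective), initialize() and iterate()
    # are placeholders for the real algorithm internals.
    class Algorithm:
        def run(self, max_nfe, log_frequency=None, executor=None):
            # fall back to the builtin map when no executor is given
            evaluate = map if executor is None else executor.map
            population = self.initialize()
            nfe = 0
            while nfe < max_nfe:
                objectives = list(evaluate(self.f, population))
                nfe += len(population)
                population = self.iterate(population, objectives)
            return population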

One remaining question: what should run on the algorithm return? Currently you only return something when log_frequency is not None; why not return the final population otherwise? And should these snapshots not be separate from logging? My guess is that you want convergence information.

jdherman commented 5 years ago

Wow, this is pretty slick. I'm impressed how concise the updates are. Thanks!! This is looking much more professional by the day.

Question about the return of algorithm.run: yes, I didn't set this up very well; I've only run it with logging turned on. The snapshots include best_f, best_P, nfe, and runtime, saved at the frequency log_frequency, which helps with analyzing convergence. The best option is probably what you described:

if log_frequency: print to the CLI and save snapshots; return the snapshots.
else: do not print or save snapshots; return the final population.

(When logging is turned off, it is very dumb to not return anything, as in the current setup...)
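
In code, something along these lines (an assumed sketch using the snapshot fields named above; algorithm.step is a made-up placeholder for one generation):

    # Assumed sketch of the return behavior discussed above; not the actual code.
    import time

    def run(algorithm, max_nfe, log_frequency=None):
        start = time.time()
        snapshots = {'nfe': [], 'time': [], 'best_f': [], 'best_P': []}
        best_f, best_P, nfe = None, None, 0

        while nfe < max_nfe:
            best_f, best_P, nfe = algorithm.step(nfe)  # placeholder: one generation

            if log_frequency and nfe % log_frequency == 0:
                elapsed = time.time() - start
                print('nfe %d | %.1f s | best %s' % (nfe, elapsed, best_f))
                snapshots['nfe'].append(nfe)
                snapshots['time'].append(elapsed)
                snapshots['best_f'].append(best_f)
                snapshots['best_P'].append(best_P)

        # with logging on, return convergence snapshots; otherwise the final result
        return snapshots if log_frequency else (best_f, best_P)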

quaquel commented 5 years ago

I went a slightly different route. In my view convergence information and logging while running are two different things and should thus be handled differently.

Logging
For logging, I have replaced the print statements with the default Python logging module. See the sequential example or the multiprocessing example for how to use this. Python logging is very lightweight, so the performance hit is quite small.

I still have to run some tests under MPI and fine-tune the multiprocessing example a bit. Ideally, it should be possible to see from which specific process (main or any subprocess) a log message originates; this makes debugging parallelization code much easier. I already have this up and running for the workbench, so I have done it before.

Convergence
For convergence, I have for now added a simple boolean that makes sure convergence information is saved at each generation. It might make more sense to also have a convergence frequency attribute, so that you don't save convergence information every generation but only after a specific number of nfe.
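
For reference, this is the kind of logging configuration I mean (an assumed minimal example, not the exact code in the fork); including processName in the format string is what makes the originating process visible:

    # Minimal assumed example of process-aware logging; not the exact fork code.
    import logging

    logging.basicConfig(
        level=logging.INFO,
        format='[%(processName)s] %(asctime)s %(levelname)s %(message)s',
    )

    logging.info('nfe %d, best objective %f', 1000, 0.25)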

quaquel commented 5 years ago

Ok, multiprocessing logging now works cleanly. This is basically ready to be merged.

Getting logging to work with MPI is left for future work. It is a can of worms that I don't want to touch right now.

jdherman commented 5 years ago

Thanks Jan! This executor setup is really cool. A few questions / thoughts:

After that we can do the merge, and I'd be happy to work on cleaning up documentation/readme stuff. Thanks again!

quaquel commented 5 years ago

    archive, population = algorithm.run(1000, convergence=False)

    # with convergence
    archive, population, convergence = algorithm.run(1000, convergence=True)

jdherman commented 5 years ago

Ok great, just two things for now then: (1) the convergence frequency, and (2) the return statements. The way you outlined it here seems fine, other than that the single-objective case would return one best solution instead of an archive.

Thanks! I'll keep an eye out for a PR. And happy holidays.

quaquel commented 5 years ago

Fixed; see the documentation of run and the examples for how it works.

jdherman commented 5 years ago

All resolved by #5, thanks!