ioam / topographica

A general-purpose neural simulator focusing on topographic maps.
topographica.org
BSD 3-Clause "New" or "Revised" License
53 stars 32 forks source link

Change to allow OpenMP to be enabled from command line before running a .ty script #526

Closed acwong closed 11 years ago

acwong commented 11 years ago

To allow OpenMP to be enabled before running a .ty script from command line, code change was made such that "openmp=True" parameter is added to the command line. This is to also ensure if some models may require the original default behaviour. Change has been tested with passing parameter to the command line e.g. "topographica -p openmp=True tiny.ty" or "topographica -p openmp_threads=2 -p openmp=True tiny.ty". Note that with this change, "openmp" parameter should be the last of all openmp related parameters to be defined in the command line.

Please advise if documentation needs to be updated accordingly and the best spot to do so.

Background: OpenMP was only enabled after running the .ty script as the default behaviour.

jbednar commented 11 years ago

Thanks for submitting this! It looks like a reasonable approach to making OpenMP get set up before the .ty file is executed. Before I merge this, just so there is a record of why this was important to do, can you explain why this came up? Was it so that you could parallelize the initialization steps performed in the .ty file? If so, that's something we want to do anyway, so that's great. Or was it because you actually run some simulation time steps in your .ty file? If so, I don't recommend that, because it ties your .ty file to a specific simulation run, but I guess it's legal. :-) Or was there some other reason?

acwong commented 11 years ago

Hi, James,

Thanks for checking. The intention is not to have the run tied to or implemented in the .ty file. It appears there are at least two scenarios that would benefit from having OpenMP enabled before the .ty script is executed.

1) When preparing and initialization a relatively large network such that preparation of each CFSheet and Projection would involve more heavy numerical operations using numpy. So, yes to your question that it would help parallelizing the initialization steps.

2) When running in batch mode without starting interactive console, OpenMP is not enabled at all without this option until only after the topo has completed running simulation.

jbednar commented 11 years ago

Ok, thanks. Using OpenMP during intialization makes good sense, given that initialization often takes a significant fraction of the overall runtime (particularly when many random numbers need to be generated), but unfortunately OpenMP only works for C code that has explicitly had OpenMP pragmas added to it. None of our initialization is done using C code, and so I don't think there will be any speedup from enabling OpenMP during initialization. Still, until we enable it during intialization no one will be able to add OpenMP support, so let's go ahead and do it. For batch mode, OpenMP is most definitely needed, and I'm astonished to find that it hasn't been so far, so again let's go ahead and do it. So I'll merge your pull request now.

jbednar commented 11 years ago

Actually, I'm not sure I can merge it just yet. I took another look at it, and it looks like it requires openmp=True to be supplied explicitly, which must have been what you meant when you mention that you tried to preserve backwards compatibility. Instead, we want OpenMP to be enabled by default, even and especially for batch runs, which means that we have to look at the way we're parsing command-line arguments for OpenMP in general. We'll get back to you on that!

mjabri commented 11 years ago

Actually you raise a very interesting point. I am curious on to how Topo is OMP-aware today as I noticed good exploitation of multi core during large network simulations. It also raise the question of the c-optimized code like DoProduct_opt and Hebbian_opt. These functions do not have any OMP pragmas in them and I am wondering whether actually using the equivalent numpy-based operation would not be more exploiting to OMP. But when I had a look, they do not seem totally equivalent to the python “un-optimized” functions.

jbednar commented 11 years ago

The c-optimized code like DoProduct_opt and Hebbian_opt does have the OMP pragmas, but they are hidden in the magic "%(cfs_loop_pragma)s" expansion. Simulations based on those two functions should be very well OMP optimized, but initialization will be single-core in all current code. The optimized ones are only one-way equivalent: the unoptimized Numpy-based ones can be used in place of the optimized ones, but not vice-versa -- typically the unoptimized Numpy-based versions will be more general but lower performance. I'd be surprised if any of the unoptimized ones exploited OMP better or had better performance in any way, but it's easy enough to check it for your application and see!

acwong commented 11 years ago

Thanks and yes, it is what I meant for the backward compatibility consideration. I think it should make sense to have OpenMP enabled from the beginning by default provided that there is no issue of existing implementation done by people. I will look forward to your update.

mjabri commented 11 years ago

That’s great to know. I have looked at inline.py and it is much clearer now. I was wondering which C interface it is using and now I know what Weave is! Going back to the initialization phase, it would be good to speed that up.

From: James A. Bednar [mailto:notifications@github.com] Sent: Tuesday, July 23, 2013 8:57 PM To: ioam/topographica Cc: M. Jabri Subject: Re: [topographica] Change to allow OpenMP to be enabled from command line before running a .ty script (#526)

The c-optimized code like DoProduct_opt and Hebbian_opt does have the OMP pragmas, but they are hidden in the magic "%(cfs_loop_pragma)s" expansion. Simulations based on those two functions should be very well OMP optimized, but initialization will be single-core in all current code.

The optimized ones are only one-way equivalent: the unoptimized Numpy-based ones can be used in place of the optimized ones, but not vice-versa -- typically the unoptimized Numpy-based versions will be more general but lower performance. I'd be surprised if any of the unoptimized ones exploited OMP better or had better performance in any way, but it's easy enough to check it for your application and see!

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

— Reply to this email directly or view it on GitHub https://github.com/ioam/topographica/pull/526#issuecomment-21462321 . https://github.com/notifications/beacon/ZtPUYVGEUarwe5h3C9X6pd8AHUSMcDdzUmWBPG1PF9DeBhsVGhfCBoB3CfxJFXA5.gif

jbednar commented 11 years ago

Speeding up initialization is very important but also very tricky, and I don't think any of us are currently working on that. The issues are that (1) we want to maintain the same random values regardless of how many processors are used, which adds a lot of constraints (but is ultimately doable), and (2) the initialization code (in the ImaGen module) is all Python/numpy, and thus there are no C loops to add OpenMP directives to. These are solvable issues, but because we've been working on Cython-based replacements for the C code, we are all reluctant to put effort into creating new C-based code. But we'd be happy to consult with anyone who wants to volunteer to work on this topic, especially in Cython, as improving startup times would be a big improvement in Topographica's usability.

jbednar commented 11 years ago

Sorry for leaving this unattended so long; our three main developers (including myself) have all been travelling extensively all summer, and are only now getting back on top of Topographica issues. And thanks again for raising this important issue.

After several discussions and brainstorming sessions, I think the situation is finally clear. First, you are correct that there is a problem in the current code -- only once command-line processing is completed does Topographica process the various OpenMP parameters and print the usage message saying that OpenMP is enabled. For interactive sessions this is usually ok, but for batch mode the message will come only after any training specified in commandline arguments.

On the other hand, it turns out that there is only a problem with the parameters and the message -- OpenMP is still enabled, regardless of when we print the message! OpenMP starts out enabled and using all cores by default, and thus batch runs have been using all cores, regardless of what our message says (which comes too late to be meaningful anyway). So it doesn't matter what options the user supplies on the command line or in their topographicarc file, they will only take effect on interactive runs (which unfortunately was where we debugged this!).

In any case, the pull request as written won't solve the problem, because OpenMP is always still enabled anyway, whether or not the user supplies the new "openmp=True" parameter; the only difference is whether Topographica's options are processed properly or not.

To fix the problem, we came up with three possible solutions:

  1. Give up: Remove all Topographica-specific code trying to set the OpenMP options, and rely on users to set the OpenMP environment variables externally as they wish. This is what most OpenMP programs do, and is the simplest option for us. We'd then print the message about the number of cores in use at the very start, and users would be able to quit and change the settings if they don't like them.
  2. Try to make the current code work, by moving the parameter setting code into a function (just as you have done here), and then calling that function to update the settings just before any .ty file or .py file or -c command is executed. We think it is possible to get this to work, but the code will be complicated and error-prone, because we have to catch every possible place a CPU-intensive operation might occur, while not doing it prematurely when parameters are still being set by the user. We'd also have to include special code so that the user message about how many cores are in use is printed only once, unless the value changes later.
  3. A slightly simpler version of 2. would be to require that OpenMP settings come first on the command-line, before any .py file, .ty file, or -c command, and not allow them to change after that. That would simplify our message-printing logic and maybe some other parts of the code, because we wouldn't have to reason about so many alternatives, but it still leaves some complicated code and makes us have to require the user to work a certain way that's not necessarily any easier than option 1 for them.

In discussions with Jean-Luc (the author of the original code), we've decided that we favor option 1, because we don't actually use these parameters much in practice, and we can live with the default OpenMP behavior of using all cores by default. We've really already been doing that accidentally, and to get a cleaner, more straightforward code base, we'd prefer just to strip out all of the complicated logic and leave everything up to OpenMP.

So we'll do option 1 very soon, unless we get a strong objection from someone who thinks that the parameters we provided are really important and greatly improve usability.

jbednar commented 11 years ago

Fixed in commit e471386ba10e00e70da91eeda2b8a80f6fb6afd7; all openmp options in Topographica have been removed, in favor of using the environment variables.