ioam / topographica

A general-purpose neural simulator focusing on topographic maps.
topographica.org
BSD 3-Clause "New" or "Revised" License

optimized learning and response functions for GCAL model with modified CFProjection #678

Open dancehours opened 6 years ago

dancehours commented 6 years ago

Hi, I want to implement a modified CFProjection class for the GCAL model. I changed some of the code, e.g. the learning function, the response function, and the connection fields. The learning function uses a modified class CFPLF_Plugin(CFPLearningFn), and the response function uses a modified class CFPRF_Plugin(CFPResponseFn). The modified GCAL model works with these non-optimized components, but it runs very slowly. I checked that the GCAL model uses e.g. responsefn.optimized.CFPRF_DotProduct_opt() and learningfn.optimized.CFPLF_Hebbian_opt() to speed up the code, so I suppose I also need to modify these optimized functions for the modified connection field settings. Can you give some suggestions for the necessary changes?

jbednar commented 6 years ago

Right; CFPLF_Plugin and CFPRF_Plugin make it easy to write custom learning and response functions, but they will be much slower. Nowadays I'd probably use Numba or at least Cython to do optimized functions, but we haven't set that up in Topographica. So unless you want to go into uncharted territory, to speed things up you'll need to copy those two optimized components, rename the copies, and try to adapt them to your needs. Unfortunately, they are quite difficult to follow, as they are written in very low-level C code. But if you patiently work through them, comparing the optimized code to the non-optimized version that we provide for reference, you should be able to see how the optimized code implements the same operation at a much lower level that runs much faster. If you don't know C, you'll probably want to get some help from someone who does.

Or you can try to resurrect the Cython versions of the optimized components that are probably somewhere in the Topographica repository; those versions should be nearly as fast and vastly easier to understand and modify. If you want to go that route @ceball can probably find where the latest Cython versions were.
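For orientation when comparing an optimized component against its reference version, here is a minimal NumPy sketch of the kind of operation a Hebbian connection-field learning function performs for a single unit. It is not Topographica code: the function name `hebbian_update`, its arguments, and the divisive (L1) normalization step are assumptions made for illustration only.

```python
import numpy as np


def hebbian_update(weights, input_patch, unit_activity, learning_rate):
    """One Hebbian step for a single connection field (hypothetical helper).

    weights and input_patch are arrays of the same shape; unit_activity is
    the scalar response of the unit that owns this connection field.
    """
    # Hebbian rule: strengthen weights in proportion to pre * post activity.
    updated = weights + learning_rate * unit_activity * input_patch
    total = updated.sum()
    # Assumed divisive normalization: rescale so the weights sum to 1.
    return updated / total if total > 0 else updated


weights = np.full((2, 2), 0.25)
patch = np.array([[1.0, 0.0], [0.0, 0.0]])
new_weights = hebbian_update(weights, patch, unit_activity=1.0, learning_rate=0.5)
```

In this toy case only the top-left input is active, so that weight grows relative to the other three while the total stays at 1.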

philippjfr commented 6 years ago

> Or you can try to resurrect the Cython versions of the optimized components that are probably somewhere in the Topographica repository; those versions should be nearly as fast and vastly easier to understand and modify.

I wouldn't call it resurrecting; they are in perfect working order and are just as fast as (or faster than) the previous C extensions. When I was still working on Topographica I used them exclusively. You'll be able to find them in the optimized module.

jbednar commented 6 years ago

Oh, excellent! Glad they didn't die out. :-) Those should be much easier to work with than the C-optimized versions. Good luck!

dancehours commented 6 years ago

Hi, I don't see any big differences between CFPRF_DotProduct_cython and CFPRF_DotProduct_opt. Why do you think the Cython classes are much easier to modify? @jbednar @philippjfr

dancehours commented 6 years ago

If I need to modify e.g. the dot_product function, I should modify it in optimized.h and then compile the files using the command "python setup.py build_ext", right?

jbednar commented 6 years ago

We have a lot of versions of this crucial object:

  1. CFPRF_DotProduct will be the easiest to modify, as it's only got two lines of actual code.
  2. CFPRF_DotProduct_opt is the most difficult to modify; it's 50 lines of raw C code that is very hard to follow, but has the best performance.
  3. CFPRF_DotProduct_cyopt is a straight translation of (2) into Cython, with approximately the same performance but no better readability or maintainability. I don't think this version is currently in use.
  4. CFPRF_DotProduct_cython is a different Cython approach than (3) that basically just adds type information to (1) and gets nearly as good performance as (2) and (3) but with only 11 lines of code.

I think (4) is the one that Philipp used in practice, so it should be in good shape, and that one should be a good starting point, as it has good performance (unlike 1) and is reasonably readable and maintainable (unlike 2 and 3). Hope that helps!
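To make concrete what all four versions compute, here is a plain NumPy sketch of a connection-field dot-product response. It is only an illustration under stated assumptions, not the code of any class listed above: the function name `cf_dot_product` and the representation of each connection field as a `(weights, (r0, r1, c0, c1))` tuple are hypothetical.

```python
import numpy as np


def cf_dot_product(input_activity, cfs, strength=1.0):
    """Dot-product response for a grid of connection fields (hypothetical).

    cfs is a list of rows; each entry is (weights, (r0, r1, c0, c1)), where
    the bounds select the patch of input_activity that the unit sees.
    """
    rows, cols = len(cfs), len(cfs[0])
    activity = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            weights, (r0, r1, c0, c1) = cfs[r][c]
            patch = input_activity[r0:r1, c0:c1]
            # Each unit's response is the scaled sum of weights * inputs.
            activity[r, c] = strength * np.sum(weights * patch)
    return activity


input_activity = np.ones((4, 4))
# A 3x3 grid of units, each with a uniform 2x2 connection field.
cfs = [[(np.full((2, 2), 0.25), (r, r + 2, c, c + 2)) for c in range(3)]
       for r in range(3)]
activity = cf_dot_product(input_activity, cfs, strength=2.0)
```

The Python-level double loop is where the time goes in a version like this, which is why the optimized variants push it into compiled code.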

jbednar commented 6 years ago

I still don't see what you mean; CFPRF_DotProduct_cython and CFPRF_DotProduct_opt really aren't very similar (click on the links above to see); one of them is pages of very specific pointer-based C code, and the other is a short bit of Pythonic code with types (not C).

philippjfr commented 6 years ago

> I still don't see what you mean; CFPRF_DotProduct_cython and CFPRF_DotProduct_opt really aren't very similar (click on the links above to see); one of them is pages of very specific pointer-based C code, and the other is a short bit of Pythonic code with types (not C).

You'll see that CFPRF_DotProduct_cython calls dot_product, which is indeed just a port of the original code and no simpler. The only reason to use Cython is that weave is no longer supported by anyone.

philippjfr commented 6 years ago

If I were writing this stuff today I'd probably just replace these functions with Numba.

jbednar commented 6 years ago

Ah, I thought the dot_product there was being imported from elsewhere, and was only looking at the main function. Indeed, then, there isn't much difference.

Definitely, Numba would be the way to go now. But I wouldn't want to try mixing Numba, Cython, and Weave in the same codebase, the way the code is currently organized; instead I'd disable or delete all but the unoptimized versions, then see if Numba can optimize those directly with similar or better performance. If only we had had Numba at the time we wrote this in 1993!
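As a rough idea of what the Numba route might look like, here is a minimal sketch that compiles explicit loops with `numba.njit`. Everything except `njit` itself is an assumption for illustration: the function name `cf_dot_product_loops`, the dense `(n_units, height, width)` weight array, and the `(n_units, 4)` integer bounds array are not Topographica's actual data layout. The sketch falls back to plain Python if Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Numba is optional here: without it the same loops run as plain Python.
    def njit(func):
        return func


@njit
def cf_dot_product_loops(input_activity, weights, bounds, strength):
    """Dot-product response with explicit loops (hypothetical layout).

    weights[i] holds the connection field of unit i; bounds[i] holds
    (r0, r1, c0, c1), the input patch that unit i is connected to.
    """
    n_units = weights.shape[0]
    activity = np.zeros(n_units)
    for i in range(n_units):
        r0 = bounds[i, 0]
        r1 = bounds[i, 1]
        c0 = bounds[i, 2]
        c1 = bounds[i, 3]
        total = 0.0
        for r in range(r0, r1):
            for c in range(c0, c1):
                total += weights[i, r - r0, c - c0] * input_activity[r, c]
        activity[i] = strength * total
    return activity


rng = np.random.default_rng(0)
input_activity = rng.random((5, 5))
bounds = np.array([[0, 3, 0, 3], [0, 3, 2, 5], [2, 5, 0, 3], [2, 5, 2, 5]])
weights = rng.random((4, 3, 3))
fast = cf_dot_product_loops(input_activity, weights, bounds, 1.5)

# Cross-check against a straightforward NumPy computation of the same sums.
reference = np.array([
    1.5 * np.sum(weights[i] * input_activity[r0:r1, c0:c1])
    for i, (r0, r1, c0, c1) in enumerate(bounds)
])
```

Whether this matches the performance of the existing C or Cython components would need to be measured; the sketch only shows that the unoptimized logic can be written in a form Numba is able to compile.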