cavalab / ellyn

python-wrapped version of ellen, a linear genetic programming system for symbolic regression and classification.
http://cavalab.org/ellyn

Add custom primitive functions #2

Open arita37 opened 7 years ago

arita37 commented 7 years ago

Thanks for your work. When doing regression, there are primitive functions like cos, sin, and exp for building the regression tree. Is there a way to add custom primitive functions to this list?

It would help accelerate convergence on specific problems by renormalizing the primitives and reducing the tree depth.

lacava commented 7 years ago

unfortunately we're in the same situation here as we are with ellen since we're using a very similar code base. as I say there,

ellen could be used with self-defined functions. unfortunately it is not very simple to implement your own. first the function would need to be included in Fitness.cpp and referenced in the eval() function. then it would need to be included in the node creation functions which are somewhat spread out among the source code, generally in variation and initialization stages.

the existing operators can be defined using the 'op_list' parameter. for example, this would use constants (n), variables (v), and the mathematical operators with default weights:

op_list n v + - / sin cos exp log

the full set of operators is:

n v + - / sin cos exp log sqrt ^ = ! < > <= >= if-then if-then-else & |

edit: myfun would have to be written in c++ for use in ellen. with some development it could be included as a python function by ellyn.

of course it is possible, it just requires edits to Fitness.cpp, node.h, InitPop.h and potentially a couple other files. partially this is due to the 'switch case' design concept I chose for the nodes rather than using polymorphic classes. it makes the code less elegant, but allows the objects of a program to be stored in contiguous memory.
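To make the trade-off above concrete, here is a minimal Python sketch of a switch-style evaluator. It is not ellen's actual C++ code: the opcode table `OPS`, the function `evaluate`, and the custom primitive `sq` are all hypothetical names chosen for illustration. The program is stored as a flat array of small integers, which is the Python analogue of keeping nodes in contiguous memory.

```python
from array import array
import math

# Hypothetical opcode table; ellen's real node types are defined in its C++ sources.
OPS = {"v": 0, "+": 1, "sin": 2, "sq": 3}  # "sq" stands in for a custom primitive

def evaluate(opcodes, x):
    """Evaluate a postfix program stored as a flat array of opcodes."""
    stack = []
    for op in opcodes:
        if op == OPS["v"]:            # push the input variable
            stack.append(x)
        elif op == OPS["+"]:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == OPS["sin"]:
            stack.append(math.sin(stack.pop()))
        elif op == OPS["sq"]:         # a new primitive needs a new branch here
            a = stack.pop()
            stack.append(a * a)
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack[-1]

# Postfix encoding of x*x + x, held in one contiguous byte array.
program = array("B", [OPS["v"], OPS["sq"], OPS["v"], OPS["+"]])
result = evaluate(program, 3.0)
```

In this style, adding a primitive means touching the opcode table, the evaluator, and every place that creates random programs, which mirrors the multi-file edits described above.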

arita37 commented 7 years ago

Thanks for the answer.

Taking a step back, adding custom primitives is fundamental to reaching a larger set of problems:

1) It allows re-normalization of the problem into different projective spaces.

2) It integrates human-level information/knowledge; hierarchical structure in particular is better handled by a human expert.

3) It reduces search complexity (this is an NP-hard problem, so reducing the number of tree levels is fundamental).

It looks like this requires a re-design of the project, since hacking the initial design might be complex and lead to further issues (for reference, pagmo had to be re-designed from scratch to allow further improvements).

The key technical points are:

1) Add a static library (like boost) containing only the primitives? This library could be compiled separately from the core compute.

2) Link the core compute against this static library for primitive evaluation?

So, how to handle polymorphism on operators? With C++ template metaprogramming/templates, generated from a primitive header (.h)?

I think two modes might be needed:
    Compile-time functions
    Dynamic functions (slow)
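The two modes could be sketched roughly as follows in Python. This is only an illustration of the idea, not a proposal for ellyn's actual API: the registry `PRIMITIVES`, the helper `register`, and the `logistic` primitive are hypothetical. Built-ins are registered once when the module loads (the analogue of the compile-time set), while user functions are added at run time and looked up by name on every evaluation.

```python
import math

PRIMITIVES = {}  # hypothetical registry: name -> (arity, callable)

def register(name, arity, fn):
    """Add a primitive to the registry under the given name."""
    PRIMITIVES[name] = (arity, fn)

# Built-ins, fixed when the module is loaded (the "compile-time" set).
register("+", 2, lambda a, b: a + b)
register("sin", 1, math.sin)

def evaluate(program, x):
    """Evaluate a postfix program given as a list of names; 'v' is the input variable."""
    stack = []
    for name in program:
        if name == "v":
            stack.append(x)
            continue
        arity, fn = PRIMITIVES[name]          # dictionary lookup on every node
        args = [stack.pop() for _ in range(arity)][::-1]
        stack.append(fn(*args))
    return stack[-1]

# A user-defined primitive added at run time (the slower "dynamic" mode).
register("logistic", 1, lambda a: 1.0 / (1.0 + math.exp(-a)))
value = evaluate(["v", "logistic", "v", "+"], 0.0)  # logistic(x) + x
```

The per-node lookup and indirect call are where the dynamic mode pays its cost; a compile-time mode would resolve the same functions before the run starts.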

Pagmo has a mechanism to handle an objective function defined in Python and transfer it to C++.

Happy to discuss overall design.

On 23 Aug 2017, at 00:00, William La Cava notifications@github.com wrote:


lacava commented 7 years ago

i'm happy to entertain proposed code changes. there was a time I used polymorphic class design for the operators (see commit here). as evolution progresses, the evaluations will slow as individuals have to step out of cache to access newly created nodes in other blocks of memory. this was my main concern. however it would make user-defined nodes easier to implement, i think.
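For comparison, a polymorphic design might look like the Python sketch below. The class names (`Node`, `Var`, `Add`, `Tanh`) and the `evaluate` function are hypothetical and are not taken from the commit mentioned above. The sketch shows only the extensibility side: a user-defined node is a new subclass and the evaluator is left untouched.

```python
import math

class Node:
    """Base class: each node knows its arity and how to evaluate itself."""
    arity = 0
    def eval(self, args, x):
        raise NotImplementedError

class Var(Node):
    def eval(self, args, x):
        return x

class Add(Node):
    arity = 2
    def eval(self, args, x):
        return args[0] + args[1]

class Tanh(Node):  # hypothetical user-defined node; no evaluator changes needed
    arity = 1
    def eval(self, args, x):
        return math.tanh(args[0])

def evaluate(program, x):
    """Evaluate a postfix list of node objects via dynamic dispatch."""
    stack = []
    for node in program:
        args = [stack.pop() for _ in range(node.arity)][::-1]
        stack.append(node.eval(args, x))
    return stack[-1]

program = [Var(), Tanh(), Var(), Add()]  # tanh(x) + x
out = evaluate(program, 0.0)
```

Each node here is a separately allocated object reached through a reference, which is the layout behind the cache concern; Python cannot demonstrate the performance effect itself, only the structure.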

arita37 commented 7 years ago

Yes, you are right, cache management is "problematic", especially with added classes / polymorphism. I see two ways (maybe there are more) to add custom primitives:

1) A separate C++ build as a static library, linked statically to the ellyn core compute. Custom primitives would be added "manually" to this separate library (rebuilt from time to time).

2) Python-defined primitives linked through boost.python (or another binding). Here are some examples of such linkage:
    An arbitrary cost function can be defined in Python and linked to the C++ core run:
    https://goo.gl/ubRFFF

    http://pybindgen.readthedocs.io/en/latest/tutorial/#a-simple-example is also a wrapper over C++.

Obviously slower, but ideal for low-scale "experimenting" (if we don't mind leaving the computer running).
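As a rough illustration of the Python-defined route, the sketch below uses the standard-library `ctypes` module rather than boost.python or pybindgen, simply because it runs without a compiled extension. It wraps a Python function as a C-callable function pointer with the signature `double f(double)`. The names `UNARY` and `softplus` are hypothetical, and how a native core would accept such a pointer is an assumption, not something ellyn provides today.

```python
import ctypes
import math

# C signature of a unary primitive: double primitive(double)
UNARY = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

def softplus(a):
    """Hypothetical custom primitive written in plain Python."""
    return math.log1p(math.exp(a))

# Wrap the Python function as a C function pointer. The wrapper object must be
# kept alive for as long as native code might call it.
c_softplus = UNARY(softplus)

# Calling the wrapper routes the call through the C function pointer,
# the same entry point a native core would invoke.
y = c_softplus(0.0)
```

Every call crosses the C-to-Python boundary, which is where the slowdown mentioned above comes from.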

Function polymorphism in C might be hard to achieve (with limited resources/time).