Closed granawkins closed 2 years ago
If idea 3 is applied consistently to all cases, then I think it would be the best available resolution.
The best approach seems to be:
- Keep the helper fx for `/` and `sqrt`, add one for `log`: `sign(x) * log1p(abs(x))`
- Add an `unfit=False` attribute to `Tree`s. After predicting each tree, if the output contains `nan` or `inf`, set `unfit=True`.
- Skip `unfit` trees when scoring
- Remove `unfit` trees from `gene_pool`
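For reference, here is a rough sketch of what the signed-log helper and the `unfit` flag could look like. It uses only the standard library and scalar values; the `Tree` dataclass, `mark_unfit` and `scoreable` are hypothetical stand-ins for illustration, not the actual karoo_gp classes or functions.

```python
import math
from dataclasses import dataclass, field


def safe_log(x: float) -> float:
    """Signed log: sign(x) * log1p(abs(x)), defined for every finite x."""
    return math.copysign(math.log1p(abs(x)), x)


@dataclass
class Tree:
    # Hypothetical stand-in for karoo_gp's Tree: only what this sketch needs.
    outputs: list = field(default_factory=list)
    unfit: bool = False


def mark_unfit(tree: Tree) -> Tree:
    """Set unfit=True if any predicted value is nan or inf."""
    tree.unfit = any(not math.isfinite(v) for v in tree.outputs)
    return tree


def scoreable(trees):
    """Return only the trees that should be scored and kept in the gene pool."""
    return [t for t in trees if not t.unfit]
```

Unlike plain `log`, `safe_log` returns 0 at 0 and is odd-symmetric, so it never produces `nan` or `-inf` for finite input.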
OK, so we remove the tree altogether. But I assume we add a new tree to replace each one that is removed, so that the population remains at max.
- Every generation starts with `tree_pop_max` trees, e.g. 100. If 10 are unfit, then the remaining 90 are used to generate the next population.
- It's the same approach we use to handle `tree_depth_min`.
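A minimal sketch of that refill step, assuming trees are represented as plain dicts with an `unfit` key. The function name `next_generation` and the use of `random.choice` as a placeholder for real selection and genetic operators are assumptions for illustration only.

```python
import random


def next_generation(population, tree_pop_max, rng=random):
    """Drop unfit trees, then draw parents until the population is full again."""
    gene_pool = [t for t in population if not t["unfit"]]
    if not gene_pool:
        raise ValueError("every tree in this generation was unfit")
    # Placeholder for selection + mutation/crossover: a parent is simply copied.
    return [dict(rng.choice(gene_pool)) for _ in range(tree_pop_max)]
```

With 100 trees of which 10 are unfit, the 90 survivors form the gene pool and the next generation still has 100 trees.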
Let's discuss, as there is another method ...
This was implemented in #85
Can you explain briefly so I can get moving?
Some of the operators we support will produce unusable values (`nan` or `inf`) in the course of normal use:
- `/` *
- `**`
- `sqrt` *
- `log`
- `log1p`
- `arcsin`
- `arccos`

*We currently use helper functions for division and square root which ignore `0`s.

What to do?

Here are 3 ideas:

1. Deal with them case-by-case.
   - `/` and `sqrt` seem ok for now.
   - `log1p` is a built-in function that extends `log` by ignoring `0`s. We could add a helper which does `sign(x) * log1p(abs(x))`.
   - `arccos` and `arcsin` are maybe rare enough, we could add a check in `karoo.fit()` when using them that `-1 < X < 1`, else raise a ValueError.
   - `**`. `X > 1e3` happens frequently with small numbers too when combined with other operators, e.g. `2 ** (1 / .001)`. Replacing with 0 is the simplest option, but it's a big nonlinearity (as X increases, outputs get exponentially larger and then drop to 0).
2. Accept a kwarg with a replacement value (e.g. 0) in the case that a `nan` and/or `inf` is produced. Basically like we do in the *'s above, for everything.
3. If and when a tree produces a `nan` or `inf`, just remove it from the gene pool and don't bother scoring it. This is basically the method used by `swim`, i.e. eliminate trees with less than the minimum number of nodes.

I lean toward 3.
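To make idea 2 concrete, here is a rough standard-library sketch of a wrapper that substitutes a replacement value. The name `protected` and its `replacement` kwarg are hypothetical, and note one assumption: Python's `math` functions raise exceptions where array libraries typically return `nan` or `inf`, so the sketch treats both cases the same way.

```python
import math


def protected(fn, replacement=0.0):
    """Wrap fn so that a nan/inf result (or a math error) becomes `replacement`."""
    def wrapper(*args):
        try:
            result = fn(*args)
        except (ValueError, OverflowError, ZeroDivisionError):
            # math.* raises here; an array library would return nan or inf.
            return replacement
        return result if math.isfinite(result) else replacement
    return wrapper


safe_div = protected(lambda a, b: a / b)
safe_sqrt = protected(math.sqrt)
safe_pow = protected(math.pow)
safe_asin = protected(math.asin)
```

The nonlinearity mentioned under idea 1 shows up directly: with 64-bit floats, `safe_pow(2.0, 1023.0)` is a huge finite number, while `safe_pow(2.0, 1024.0)` overflows and drops to the replacement value.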