kstaats / karoo_gp

A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.

Resolving Issue # 23 Unary ops #25

Open rll2021 opened 3 years ago

rll2021 commented 3 years ago

Sorry, my local repo. got confused. I have now placed the modified files in the correct locations.

kstaats commented 3 years ago

Cool. Thank you Richard.

On 2/9/21 8:07 PM, Richard wrote:

Sorry, my local repo. got confused. I have now placed the modified files in the correct locations. You can view, comment on, or merge this pull request online at:

https://github.com/kstaats/karoo_gp/pull/25

-- Commit Summary --

  • Resolves Issue #23: "On unary operators"
  • Delete init.py
  • Delete data_ABS.csv
  • Delete operators_PLAY.csv
  • Delete operators_REGRESS.csv
  • Delete operators_list.txt
  • Delete base_class.py
  • Add files via upload
  • Add files via upload

-- File Changes --

 M RELEASE_NOTES.txt (4)
 M karoo-gp.py (8)
 M karoo_gp/__init__.py (2)
 M karoo_gp/base_class.py (79)
 A karoo_gp/files/templates/data_ABS.csv (12)
 M karoo_gp/files/templates/operators_list.txt (93)

-- Patch Links --

https://github.com/kstaats/karoo_gp/pull/25.patch
https://github.com/kstaats/karoo_gp/pull/25.diff

rll2021 commented 3 years ago

I also promised a look at the .pdf documentation.

  1. p. 4 "$ python karoo_gp..." everywhere --> "$ python karoo-gp..."
  2. p. 4 Furthermore, there is no file named "karoo_gp_main.py", or anything similar.
  3. p. 10 Diagram at bottom shows arity 3. This case is known to fail, although it is still in the code.
  4. p. 11 Paragraph 3 through end of page needs to be reworked, removing all discussion about binarizing unary operators.
  5. p. 14 "population_final.csv" --> "population_f.csv"

kstaats commented 3 years ago

Thank you. I have not yet had a chance to fully update the docs since the code structure was revised.

rll2021 commented 3 years ago

Since these changes touch a lot of code, I am holding further pull requests pending merge of this #25.

kstaats commented 3 years ago

Richard,

I fully understand and appreciate all you have done to improve Karoo. I look forward to diving into your changes, as soon as I am able.

In the meantime, let's discuss this a bit more, offline.

Cheers, kai

kstaats commented 3 years ago

Richard,

Per our prior communications, I truly appreciate all the work and effort you have put into Karoo GP. I cloned your branch and ran it through a number of tests to determine whether the overall behavior had changed in any way compared to the prior version. While you were successful in enabling the unary operators (cos, sin, log, etc.), some of the basic, built-in tests can no longer be solved or, at best, are solved with far more complicated evolved solutions. This is due to the introduction of potentially several new layers of parentheses, which both change the result of the mathematical expression and keep it from being simplified to something human-readable.
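To illustrate how added parentheses can change the value of an expression, here is a minimal sketch. It is not Karoo GP's code: the nested-tuple tree, the `ARITY` table, and the `render` and `evaluate` helpers are all hypothetical names introduced only for this example.

```python
import math

# Hypothetical arity table; not Karoo GP's actual operator list format.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "sin": 1, "cos": 1}


def render(node, parenthesize):
    """Render a nested-tuple tree as an infix string.

    A node is either a terminal name (str) or a tuple (operator, child, ...).
    """
    if isinstance(node, str):
        return node
    op, *children = node
    if len(children) != ARITY[op]:
        raise ValueError(f"{op} expects {ARITY[op]} argument(s)")
    parts = [render(child, parenthesize) for child in children]
    if ARITY[op] == 1:
        # Unary operators always wrap their single argument.
        return f"{op}({parts[0]})"
    text = f"{parts[0]} {op} {parts[1]}"
    return f"({text})" if parenthesize else text


def evaluate(expression, **values):
    """Evaluate a rendered string; only safe for strings built by render()."""
    names = {"sin": math.sin, "cos": math.cos, **values}
    return eval(expression, {"__builtins__": {}}, names)


tree = ("*", ("+", "a", "b"), "c")
flat = render(tree, parenthesize=False)    # 'a + b * c'
nested = render(tree, parenthesize=True)   # '((a + b) * c)'

print(flat, "=", evaluate(flat, a=1, b=2, c=3))
print(nested, "=", evaluate(nested, a=1, b=2, c=3))
```

The same tree gives two different values depending on whether each binary node is wrapped, so a change in how parentheses are emitted can alter which evolved trees score well.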

An example is to use the Matching function set to a min/max depth of 5:

ORIGINAL
The leading Trees and their associated expressions are:
1 : a + b + c
2 : a + b + c
3 : a + b + c
4 : a + b + c
5 : a + b + c
6 : a + b + c
7 : a + b + c
8 : a + b + c
9 : a + b + c
10 : a + b + c
...

for a perfect score of 10.

MODIFIED
Tree 1 yields (sym): c + c/a
Tree 2 yields (sym): a - b + c + c/a
Tree 3 yields (sym): c + c/a
Tree 4 yields (sym): c + c/a
Tree 5 yields (sym): c + c/a
Tree 6 yields (sym): b/a
Tree 7 yields (sym): (a + b)*((a + c/a)*(-a*c - a + c) + (b - c)/(a - c))/((b*c + b + c)*(a*b/(b + c*(b - c)*(a*c + a)) - a + b*c - c))
Tree 8 yields (sym): c + c/a**2
Tree 9 yields (sym): a - b + c + c/a
Tree 10 yields (sym): -a - b**2 + b + 2*c + c/a
...

where not a single tree was able to find the correct solution, over multiple runs.
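As a rough sketch of what a "perfect score of 10" means for a matching test, the snippet below counts the rows on which a candidate expression reproduces the target exactly. The ten data rows and the `match_score` helper are assumptions made for illustration; they are not Karoo GP's bundled data or its fitness kernel.

```python
# Hypothetical stand-in data: ten rows of (a, b, c) with target s = a + b + c.
ROWS = [(i, i + 1, i + 2) for i in range(1, 11)]


def match_score(expression, rows=ROWS):
    """Count the rows where the expression reproduces the target exactly."""
    score = 0
    for a, b, c in rows:
        target = a + b + c
        # eval is used only on the fixed example strings below.
        value = eval(expression, {"__builtins__": {}}, {"a": a, "b": b, "c": c})
        if value == target:
            score += 1
    return score


perfect = match_score("a + b + c")   # matches every row
partial = match_score("c + c/a")     # matches only where c/a happens to equal a + b

print("a + b + c scores", perfect, "of", len(ROWS))
print("c + c/a scores", partial, "of", len(ROWS))
```

Under this kind of exact-match scoring, an expression such as `c + c/a` can agree with the target on an occasional row while still falling well short of the full score.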

For the Iris dataset classification, there is also a known, correct solution (see the PDF included in files/Iris_Dataset).

Also, the NaN and -INF solutions, which are much appreciated, may lead to changed results (they did in at least one of the tests I ran). Let's take this conversation off-thread and see if we can't find a mutually beneficial solution.
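One general way that NaN values can change results is shown in the sketch below, using only the Python standard library. It is an assumption about the kind of effect involved, not a description of what the pull request does; `protected_log` and `best_fitness` are hypothetical names.

```python
import math


def protected_log(x):
    """Hypothetical guard: return -inf or NaN instead of raising an error."""
    if x == 0:
        return -math.inf
    if x < 0:
        return math.nan
    return math.log(x)


def best_fitness(scores):
    # min() keeps its first item unless a later one compares smaller.
    # Every comparison against NaN is False, so list order decides the outcome.
    return min(scores)


errors_a = [protected_log(-1.0), 1.0, 2.0]   # NaN comes first
errors_b = [1.0, 2.0, protected_log(-1.0)]   # NaN comes last

print(best_fitness(errors_a))   # NaN is kept because nothing compares smaller
print(best_fitness(errors_b))   # 1.0 is kept because NaN never compares smaller
```

The two lists hold the same values, yet the selected "best" differs with ordering, which is why NaN scores are often filtered out or mapped to a worst-case value before selection.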

kstaats commented 3 years ago

I have updated and uploaded the revised User Guide. Yes, it contained a number of references to old versions of the code, from before the prep for the pip install (coming soon!). Thanks Richard!