davidrmiller / biosim4

Biological evolution simulator

Request Challenge Files from YouTube Video #86

Open PascalCorpsman opened 2 years ago

PascalCorpsman commented 2 years ago

Dear Dave, I ported your biosim application to FPC ( https://github.com/PascalCorpsman/biosim4_FPC_translation ) so I can adapt your code to my own needs in my favorite programming language (and also run it on Windows and Linux). To validate my port I am trying to reproduce the results you show in your YouTube video. I stored the solutions I have found so far here: https://github.com/PascalCorpsman/biosim4_FPC_translation/tree/main/Challanges

Timestamp 7:57 - 14:38 "Challange_1_Right_Half.ini" was no problem and is reproducible.

Timestamp 27:16 - 33:40 "Challange_13_Left_Right_eights.ini" I was also able to reproduce (with only 5000 - 7000 generations).

Timestamp 35:54 - 40:52 Brain sizes

This is where I ran into trouble. With 32 or more inner neurons, the individuals start as expected, but after a few hundred generations they stop going to the outside walls and stay a bit closer to the middle; the survivor rate then drops below 50% and stays there (even the 8-inner-neuron version does this, though the effect is not as pronounced).

So can you please share your .ini files so that I can try to reproduce your results with my simulator?

Regards Uwe Schächterle

davidrmiller commented 2 years ago

Hi @PascalCorpsman , that was quite a porting effort! I'm impressed you got it working so well.

Unfortunately, I no longer have a copy of the exact code or the parameters I used for the scenarios seen in the video. At that time, the program was in constant development. I recall that I often had results like you described, where certain neural net topologies resulted in poor survival rates. Sometimes a small change in one of the parameters or in the action-evaluation code would make a dramatic change in the survival rate.

At this point in time, we have a nice python-based test suite contributed by @venzen that works with the current repository code. The documentation is in the tests/ subdirectory. I often use that test harness when experimenting with changes. If the floating point library implementations are sufficiently similar, then I would think that the test named "deterministic0" should pass when run against the Pascal port. That test is single-threaded and is configured to use a deterministic "random" number sequence for testing purposes. The other tests in that suite might also pass, but they are multithreaded, and I don't know how that will affect the tests on the Pascal port.

Thanks for letting us know about this port.

PascalCorpsman commented 2 years ago

OK, if I understand this right, the "test" modifies the default biosim.ini (provided by your project) according to the test specification, then runs the simulation and compares against the defined results.

In the case of [deterministic0] the results should be:

```
result-generations   = 1000
result-survivors-min = 251
result-survivors-max = 251
result-diversity-min = 0.080
result-diversity-max = 0.081
result-genomesize    = 8
result-kills         = 0
```

My simulator's results are:

```
result-generations = 1000
result-survivors   = 227
result-diversity   = 0.0154
result-genomesize  = 8
result-kills       = 0
```

So I would say the test fails. Looking at the resulting .avi file, it also seems that the "individuals" again form strange patterns.

Until now I was not able to compile or run your code; I did all of this by reading your code in a simple text editor. But I think I should be able to run, and maybe debug, your code in order to compare it in detail and see where the difference is. This gives me the new task of setting up a machine capable of doing that.

I attached the last image of the last generation for comparison :)

gen1000_simstep_300

davidrmiller commented 2 years ago

We're getting similar results (I'm impressed how much you got working). A different random number sequence could cause slight differences, but it appears that you implemented the same random number algorithm. In "deterministic" mode, it should produce the same pseudo-random sequence. These results appear to be due to something else.

When I was debugging the C++ version, I found that the test functions in unitTestBasicTypes.cpp were extremely valuable. If any of those test cases failed, then the rest of the code execution would be unpredictable.

PascalCorpsman commented 2 years ago

It took a bit, but finally I got your Dockerfile to work. I had to disable the AVI encoding, as the docker image cannot generate H.265 videos; PNG is also not supported, so I changed this to .bmp. Then I was able to compile your code and verify that it passes the deterministic0 test case.

I ran the following in both code bases, yours and mine:

```
unitTestConnectNeuralNetWiringFromGenome();
unitTestGridVisitNeighborhood();
unitTestBasicTypes();
```

Here are the results:

unitTestConnectNeuralNetWiringFromGenome(); -> Your code gives no results; mine printed:

```
SENSOR 0 -> NEURON 0 at 0
SENSOR 1 -> NEURON 2 at 2.199951172
SENSOR 13 -> NEURON 0 at 3.299926758
NEURON 1 -> NEURON 2 at -3.600097656
NEURON 1 -> NEURON 1 at -2.5
NEURON 2 -> NEURON 0 at -1.400024414
NEURON 0 -> NEURON 0 at -0.3000488281
NEURON 2 -> NEURON 0 at 0.7999267578
SENSOR 0 -> ACTION 1 at 1.899902344
SENSOR 2 -> ACTION 12 at 2.099975586
NEURON 0 -> ACTION 1 at 3
NEURON 1 -> ACTION 2 at -4
```

=> Your test code here seems to be outdated, as it uses float weights where ints should be, so I changed this in both your version and mine. Both now print:

```
SENSOR 0 -> NEURON 0 at 0
SENSOR 1 -> NEURON 2 at 2
SENSOR 13 -> NEURON 0 at 3
NEURON 1 -> NEURON 2 at 4
NEURON 1 -> NEURON 1 at 5
NEURON 2 -> NEURON 0 at 6
NEURON 0 -> NEURON 0 at 7
NEURON 2 -> NEURON 0 at 8
SENSOR 0 -> ACTION 1 at 9
SENSOR 2 -> ACTION 12 at 10
NEURON 0 -> ACTION 1 at 11
NEURON 1 -> ACTION 2 at 12
```

-> Pass

unitTestGridVisitNeighborhood(); -> Did not pass at first: due to rounding errors my version "overfits" the circles. After a few changes the code now passes with exactly the same coordinates as yours. -> Pass

unitTestBasicTypes(); My simulator does not use the Polar type, so I skipped those tests; all the others ran without any problems. -> Pass

But running the deterministic0 test case still fails :(

The next thing I will try: your simulation prints a genome at the end of the simulation, and my simulation can read these genomes back in. I am curious what will happen if I use those results. But to get any useful results I first need to run your simulation until the diversity has dropped to near 0, so this will take a while *g*. I will keep you informed about the results...

PascalCorpsman commented 2 years ago

OK, now I am confused. Below you see the plot of an individual from your simulator. After the "->" I added the hex value of the weight. I am ignoring the mapping of the first 16 bits for now, but I expect the weight value to be found somewhere in the genome, and it is not. Why? Even searching bit-reversed, I cannot find your values.


```
Individual ID 3
570eb37a ce33ed60 03d3177b 960c49ca 750f0f55 6bcecf8f 443a6099 39f4436f

Osc N0  17466  -> 0x443A
N1  MvE 22286  -> 0x570E
N0  LPD -12749 -> 0xCE33
Sfd MvN 979    -> 0x03D3 -> %0000 0011 1101 0011 -> %1100 1011 1100 0000 -> 0xCBC0
N2  Mrn -27124 -> 0x960C
Lx  Res 29967  -> 0x750F
LPf Res 27598  -> 0x6BCE
Osc MvY 14836  -> 0x39F4
```

Doing the same thing with my simulator leads to this result:

```
Individual ID 3
570eb37a ce33ed60 03d3177b 960c49ca 750f0f55 6bcecf8f 443a6099 39f4436f

N0  N1  -19590 -> 0xB37A
Bfd N0  -4768  -> 0xED60
Ly  N0  18890  -> 0x49CA
N0  N0  3925   -> 0x0F55
N0  MRL 6011   -> 0x177B
N1  MvR -12401 -> 0xCF8F
N0  Mrn 17263  -> 0x436F
```

The gene 443A6099 is dropped by my simulator; at the moment I don't know why, but all the other weights can be found in the original genome if you compare the results.

So the next step will be to figure out why this is the case.

davidrmiller commented 2 years ago

It's good news that the unit tests are giving us the same results. Looking forward to what you discover.

PascalCorpsman commented 2 years ago

Ok i got a new approach.

Debugging your multithreaded docker app is too difficult for me (not to say impossible), so I searched for a way to get at debug information and came up with this :).

First I took your code and stripped out everything that has to do with image processing and threads. This gave me a pure C++ project that is easy to compile with a simple Makefile (using VSCode).

I ran this with deterministic0.ini and expected it to produce the same results as your original application. -> No pass: the number of survivors matched, but the diversity was much better than yours.

By the way, giving up is not an option!

So I started thinking about what I had dropped out and what could now be the difference. As I used your code and only changed a few dozen lines, the "bug" could not be that big. The solution came from the documentation of omp_get_thread_num: I assumed it returns the number of threads, but it does not. It returns a unique number for each thread, starting at 0. And that was my difference: I had set it to 1, because I expected the number of active threads, not their index (a classic off-by-one ;) ).

I fixed that and ran my "simple port". As I am now "blind", I can only compare the epoch-log.txt files. So I asked meld to show the differences between your simulation's log file and mine. And finally they are the same (y).

Now I have a simple, single-threaded C++ application that is debuggable with VSCode. I have everything needed to start a detailed step-by-step debug run and hopefully find the bug / difference in my FPC code.

As this will probably take a while, I will keep you informed *g*.

PascalCorpsman commented 2 years ago

So I have my first finding, but not, as expected, in my code: it is in yours. To me it looks like a "bug", but I am not sure, so I'll show it to you ;)

Finding1

To reproduce, set a breakpoint where I did (in genome.cpp, on the shown line `nnet.neurons.clear();`) and step through the loop (using the deterministic0 test case). Luckily this is the first time the code runs, so it is really only a matter of setting the breakpoint and seeing what happens.

I stepped through the code (left / red is before the run through the loop). As you can see, I would expect 2 neurons to be created, but the loop iterator starts at index 0 and therefore requests a non-existing entry in the nodeMap; this makes the nodeMap create a new entry 0 (right part of the image) with all zeros. => This results in 3 neurons being created where there should only be 2. As the number of neurons affects absolutely everything the indiv does, it is clear why our code behaves differently. -> Now the fun part begins, and I will try to "replicate" this behavior in my code.

PascalCorpsman commented 2 years ago

OK, this one is much harder. Finding2: I ran the simulation step by step, and as you can see, the C++ tanh function gives a slightly different result than the FPC version. One could think this is no problem, but unfortunately it is. This tiny difference makes the indiv not move down, and therefore it will not be able to reproduce :( So the next challenge is to include a tanh variant in my code that gives exactly the same values as the C++ version in all cases.

davidrmiller commented 2 years ago

Thanks for looking at the code in Indiv::createWiringFromGenome(). It's a complicated function -- it converts a list of genes into a list of neurons (a.k.a. nodes), then removes those with no connections, then renumbers the remaining neurons, then converts them into a list of ordered connections. I don't think the code references containers out of bounds, but there could be something else amiss there.

About the tanh() differences, it looks like our floating point libraries agree to 6 or 7 decimal digits of accuracy. I would expect that to cause only occasional, very small differences in results.

PascalCorpsman commented 2 years ago

Finally, I made it. First I wrote a trace program that collected around 200k data points while running the original C++ code. Then I wrote a "tolerant" (up to 0.0015 for float values) trace comparer for my FPC version that runs the code and compares the trace log at the same control points where the trace of the C++ version was taken. This finally exposed all the differences and let me produce the final image you see below ;)

Here are some of the conclusions I want to share with you:

My bugs / things I had to fix to get it working (in case someone ever tries to do the same as I did and reads this post with more or less the same errors):

And here is the main bug I made: in spawnNewGeneration, all parent candidates are sorted by their fitness. As FPC does not have its own sorting method, I implemented a Quicksort algorithm by hand. My mistake was that the sort result was not best-to-worst, but worst-to-best. This means that instead of letting the "best-fit" genomes become parents, the "worst-fit" ones were always chosen. => This only affects challenges where the fitness is weighted, so my first tests passed, as they are not weighted. But the really cool thing is (in my opinion) that even though I implemented the sorting the wrong way around and therefore the worst genes survived, the results from my initial posting are not that bad (y). Nature is cool.

-\ -/ Unfortunately, my sorting of the parents still does not give the same results as the C++ version. This is because when indivs 1 and 2 have the same fitness, the sort considers them equal, so the order of all indivs with the same fitness value is arbitrary (and always different from the C++ version). This always results in different parents being chosen for the next generation, which matters in deterministic mode and makes the FPC version non-comparable to the C++ version.


After all this was said and done, I also revisited the above-mentioned point about the mapping function.


As my code now runs smoothly, I was able to test with and without my fix. The results are as follows: both codes work, but the (in my opinion) "wrong" version gives slightly better results (a higher survivor rate). My conclusion: the "wrong" version creates more neurons than the other version, and as you already showed in your YouTube video, more neurons = better survivor rates. So even though I don't like this "feature", it gives the creatures a better survivor rate, and as the upper limit of created neurons is always p.maxNumberNeurons, it stays within the ruleset.


As soon as I have cleaned up all my code mess, I will push my changes to my GitHub repository with the "final" results.

Thank you very much for helping me and giving me the right hints for solving this issue.

And now here is my image300 of generation 1000 of the deterministic0 test run (y)

Final_Working

davidrmiller commented 2 years ago

Congratulations on the progress!

PascalCorpsman commented 2 years ago

Topic is done; I don't want to pollute your open issue list ;)