GMUEClab / ecj

ECJ Evolutionary Computation Toolkit
http://cs.gmu.edu/~eclab/projects/ecj/

`breedthreads = auto` causes individuals getting lost, and few exceptions #77

Closed ZvikaZ closed 3 years ago

ZvikaZ commented 3 years ago

Hi.

When I add `breedthreads = auto`, some individuals get lost and a few exceptions are thrown. (`evalthreads = auto` works fine.)

I've encountered this issue before in a larger project, but it also appears in the small sample project that I've just created: https://github.com/ZvikaZ/ECJ-sample (see https://github.com/GMUEClab/ecj/issues/76). Just run it, and it fails shortly afterwards. If you comment out `breedthreads = auto` in the params file, everything is fine.

BTW, I have encountered this on two different machines, running Linux and Windows.

eclab commented 3 years ago

There are two bugs. What you see is a race condition. Fixing that revealed an underlying architectural issue with GroupedProblemForm which I have to think about, but in the meantime I've patched it with a hack.

Grab the following files from the repository and you should be good to go.

    M src/main/java/ec/breed/BufferedBreedingPipeline.java
    M src/main/java/ec/simple/SimpleEvaluator.java

Sean
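
For readers wondering how a race condition can make items "get lost" and occasionally throw, here is a generic sketch. It is not ECJ's code and makes no claim about where the actual bug was; the class and method names are made up for illustration. Several threads append to one shared list: with a plain `ArrayList` some additions can be dropped or fail with an exception, while a synchronized list keeps all of them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from ECJ.
class LostIndividualsSketch {

    // Starts `threads` workers that each append `perThread` items to `sink`,
    // waits for all of them, and returns how many items the list ended up with.
    static int fill(List<Integer> sink, int threads, int perThread) {
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    try {
                        sink.add(i);
                    } catch (RuntimeException e) {
                        // A shared, unsynchronized ArrayList may throw here
                        // (for example ArrayIndexOutOfBoundsException)
                        // instead of silently dropping the item.
                    }
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return sink.size();
    }

    // Unsafe: the result may be smaller than threads * perThread.
    static int unsafeCount(int threads, int perThread) {
        return fill(new ArrayList<>(), threads, perThread);
    }

    // Safe: every add() is synchronized, so no item is lost.
    static int safeCount(int threads, int perThread) {
        return fill(Collections.synchronizedList(new ArrayList<>()), threads, perThread);
    }

    public static void main(String[] args) {
        System.out.println("expected: " + 4 * 100_000);
        System.out.println("unsafe:   " + unsafeCount(4, 100_000)); // varies per run
        System.out.println("safe:     " + safeCount(4, 100_000));
    }
}
```

The unsafe count differs from run to run and from machine to machine, which is why this kind of bug can look intermittent.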

ZvikaZ commented 3 years ago

Indeed, it seems to solve the problem. Thanks for the quick fix. I got a lot of "Yo mama" prints, but as far as I understand the code that issues them, they should be harmless. Right?

BTW, it'd be nice to add this scenario to the tests, to avoid a similar bug in the future.

SigmaX commented 3 years ago

> BTW, it'd be nice to add this scenario to the tests, to avoid a similar bug in the future.

Agreed! Your sample project should give us enough detail to create a test. Is it as simple as setting breedthreads = auto and watching the population size for a few generations?

Thanks, Siggy

Eric "Siggy" Scott | MITRE

Senior Artificial Intelligence Engineer, MITRE Labs AI & Autonomous Systems Dept.

Doctoral Candidate, George Mason University http://mason.gmu.edu/~escott8/

ZvikaZ commented 3 years ago

>> BTW, it'd be nice to add this scenario to the tests, to avoid a similar bug in the future.
>
> Agreed! Your sample project should give us enough detail to create a test. Is it as simple as setting breedthreads = auto and watching the population size for a few generations? Thanks, Siggy

Yeah. Note that it already contains a checker for this: https://github.com/ZvikaZ/ECJ-sample/blob/124008a98dcfca0c6d3378364f261879409a7409/src/main/java/SampleProblem.java#L22

    var p = state.population.subpops.get(0);
    if (p.initialSize != p.individuals.size()) {
        state.output.fatal("someone got lost!!! (you might want to comment `breedthreads = auto` in the params file)");
    }

Also note that my sample project no longer fails, since it now uses a .jar that I compiled with the fixes suggested in this thread.
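
As a rough idea of what the test discussed above might boil down to, here is a minimal sketch that compares the subpopulation size recorded at each generation against the initial size. It assumes the sizes have already been collected (in a real ECJ run they would come from something like `state.population.subpops.get(0).individuals.size()`, as in the checker quoted above); the class name, method name, and sample numbers are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of ECJ or of the sample project.
class PopulationSizeCheck {

    // Returns the indices of the generations whose observed size differs
    // from the initial size. An empty list means nobody got lost.
    static List<Integer> generationsWithLostIndividuals(int initialSize, int[] observedSizes) {
        List<Integer> bad = new ArrayList<>();
        for (int gen = 0; gen < observedSizes.length; gen++) {
            if (observedSizes[gen] != initialSize) {
                bad.add(gen);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        int[] healthy = {100, 100, 100, 100}; // made-up sizes from a good run
        int[] lossy = {100, 98, 100, 97};     // made-up sizes from a faulty run
        System.out.println(generationsWithLostIndividuals(100, healthy)); // prints []
        System.out.println(generationsWithLostIndividuals(100, lossy));   // prints [1, 3]
    }
}
```

Because the underlying bug is a race, a single passing run proves little; a test along these lines would probably need to run for several generations with `breedthreads = auto`, and possibly be repeated.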

ZvikaZ commented 3 years ago

> There are two bugs. What you see is a race condition. Fixing that revealed an underlying architectural issue with GroupedProblemForm which I have to think about, but in the meantime I've patched it with a hack.
>
> Grab the following files from the repository and you should be good to go.
>
>     M src/main/java/ec/breed/BufferedBreedingPipeline.java
>     M src/main/java/ec/simple/SimpleEvaluator.java
>
> Sean

Hi. Any news on the long-term solution? I think I've found an issue with the hack, but I'm not sure it's worth reporting if the hack will be replaced soon...

Thanks

eclab commented 3 years ago

This summer has presented time challenges to fixing bugs in ECJ; I'm sorry for the extremely slow turnaround. I hope to get to it soon but may need more reminders. :-) In the meantime, go ahead and report issues with the hack.

Sean

ZvikaZ commented 3 years ago

Great. I haven't gotten an answer on my 2 PRs since May (and I posted a reminder in https://github.com/GMUEClab/ecj/pull/79 a few days ago), so I was afraid that the project had gone to sleep :-)

We're all busy, it's OK. I didn't want to be a nudge, so I hesitated with this.

It's a little bit related, since I'm working with a local version that has those 2 PRs integrated. So if you think they will be reviewed soon, I'd prefer to report the details of the issue after that, so that we're on the same version. Otherwise, I will run with the official master, to rule out any influence from those changes.

eclab commented 3 years ago

I've decided that the hack is in fact probably good enough for the long term. I modified it slightly (hope I didn't break anything) and have committed it with a bit of documentation as well. Reopen the issue if things aren't working right.