idaholab / raven

RAVEN is a flexible and multi-purpose probabilistic risk analysis, validation and uncertainty quantification, parameter optimization, model reduction and data knowledge-discovering framework.
https://raven.inl.gov/
Apache License 2.0
218 stars, 133 forks

[DEFECT] Genetic Algorithm and Simulated Annealing Explicit Constraints initial points #2049

Open aalfonsi opened 1 year ago

aalfonsi commented 1 year ago

Thank you for the defect report

Defect Description

Genetic Algorithm and Simulated Annealing do not check Explicit Constraints for the initial points (e.g., an initial population coming from a Monte Carlo sampler).

Steps to Reproduce

Use either algorithm with an explicit constraint that always returns false.

Expected Behavior

If I use, for example, a Monte Carlo sampler to initialize the population, the coordinates should be checked against the explicit constraints (e.g., using rejection in the MC sampler or directly in the GA).

Screenshots and Input Files

No response

OS

Linux

OS Version

No response

Dependency Manager

PIP

For Change Control Board: Issue Review

For Change Control Board: Issue Closure

aalfonsi commented 1 year ago

@mandd @Jimmy-INL

mandd commented 1 year ago

Possible idea: generate the initial population, post-process it to filter out the elements that do not satisfy the criteria, and then pass the DataObject to the optimizer.

mandd commented 1 year ago

Another idea: generate the initial population through an adaptive sampling step

aalfonsi commented 1 year ago

Possible idea: generate the initial population, post-process it to filter out the elements that do not satisfy the criteria, and then pass the DataObject to the optimizer.

I used this approach, but it requires generating a large population from, for example, Monte Carlo sampling. Then you either write a Python script that filters out the invalid points and creates a CSV to set the initial population, or use a PostProcessor that labels the output and then filter for the "valid" points with clustering.
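
A minimal sketch of the "filter with a Python script, then write a CSV" workaround described above. It uses only the standard library and does not call any RAVEN API. The constraint `explicit_constraint`, the variable names `x` and `y`, and the helper functions are all hypothetical placeholders for illustration.

```python
import csv
import io
import random

def explicit_constraint(point):
    """Hypothetical explicit constraint: feasible when x + y <= 1."""
    return point["x"] + point["y"] <= 1.0

def sample_candidates(n, rng):
    """Draw n candidate points uniformly from the unit square (plain Monte Carlo)."""
    return [{"x": rng.random(), "y": rng.random()} for _ in range(n)]

def filter_population(candidates, constraint, population_size):
    """Keep the first population_size candidates that satisfy the constraint."""
    feasible = [p for p in candidates if constraint(p)]
    if len(feasible) < population_size:
        raise ValueError(
            f"only {len(feasible)} feasible points; need {population_size}")
    return feasible[:population_size]

def to_csv(points, fieldnames=("x", "y")):
    """Serialize points to CSV text that could seed an initial population."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(points)
    return buffer.getvalue()

rng = random.Random(42)
candidates = sample_candidates(200, rng)  # oversample, then filter
population = filter_population(candidates, explicit_constraint, population_size=20)
csv_text = to_csv(population)
```

The drawback mentioned in the comment is visible here: the candidate set has to be much larger than the population, because the fraction of rejected points is not known in advance.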

wangcj05 commented 1 year ago

Another idea: generate the initial population through an adaptive sampling step

Following this idea, I think we can enable Monte Carlo sampling to handle explicit constraints directly, so that simulation runs are avoided for points that violate an explicit constraint.
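
One way to read this proposal is as rejection sampling inside the sampler. The sketch below is an illustration under that assumption, not RAVEN's Monte Carlo sampler: `feasible`, `constrained_monte_carlo`, and the `max_tries` guard are hypothetical names introduced here.

```python
import random

def feasible(point):
    """Hypothetical explicit constraint: stay inside the unit disk."""
    return point["x"] ** 2 + point["y"] ** 2 <= 1.0

def constrained_monte_carlo(n_points, constraint, rng, max_tries=10_000):
    """Rejection sampling: redraw until n_points satisfy the constraint.

    Rejected draws are discarded before any model evaluation, so no
    simulation is spent on points that violate the explicit constraint.
    """
    accepted, tries = [], 0
    while len(accepted) < n_points:
        if tries >= max_tries:
            # Guards against constraints that can never be satisfied.
            raise RuntimeError("constraint rejects too many samples")
        tries += 1
        point = {"x": rng.uniform(-1.0, 1.0), "y": rng.uniform(-1.0, 1.0)}
        if constraint(point):
            accepted.append(point)
    return accepted, tries

rng = random.Random(0)
population, tries = constrained_monte_carlo(10, feasible, rng)
```

The `max_tries` guard matters for the reproduction case in this issue: with a constraint that always returns false, an unguarded loop would never terminate.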

Jimmy-INL commented 1 year ago

I am not sure if I follow for GA:

    if self._constraintFunctions or self._impConstraintFunctions:
      params = []
      for y in (self._constraintFunctions + self._impConstraintFunctions):
        params += y.parameterNames()
      for p in list(set(params) -set([self._objectiveVar]) -set(list(self.toBeSampled.keys()))):
        constraintData[p] = list(np.atleast_1d(rlz[p].data))
    # Compute constraint function g_j(x) for all constraints (j = 1 .. J)
    # and all x's (individuals) in the population
    g0 = np.zeros((np.shape(offSprings)[0],len(self._constraintFunctions)+len(self._impConstraintFunctions)))
    g = xr.DataArray(g0,
                     dims=['chromosome','Constraint'],
                     coords={'chromosome':np.arange(np.shape(offSprings)[0]),
                             'Constraint':[y.name for y in (self._constraintFunctions + self._impConstraintFunctions)]})

Doesn't this check the constraints for the first realization (the initial guess)? If you mean a check before the evaluation, we do not do that for GA, because we never reject.
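
For readers unfamiliar with the snippet, it allocates a table g with one row per individual (chromosome) and one column per constraint. The plain-Python sketch below shows the same shape without numpy or xarray. The constraint functions and the sign convention (a negative value means violated) are assumptions made for illustration, not taken from the RAVEN source.

```python
def evaluate_constraints(population, constraints):
    """Return g[i][j], the value of constraint j for individual i."""
    return [[fn(ind) for fn in constraints.values()] for ind in population]

# Hypothetical constraints; assumed convention: value >= 0 means satisfied.
constraints = {
    "sum_limit": lambda ind: 1.0 - (ind["x"] + ind["y"]),
    "x_positive": lambda ind: ind["x"],
}

population = [
    {"x": 0.2, "y": 0.3},
    {"x": 0.9, "y": 0.4},
    {"x": -0.1, "y": 0.5},
]

g = evaluate_constraints(population, constraints)
# An individual is infeasible if any of its constraint values is negative.
violated = [any(value < 0 for value in row) for row in g]
```

A table like this can only be filled once individuals exist and, for implicit constraints, once the model has produced outputs, which is why the check happens after the first batch has run.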

aalfonsi commented 1 year ago

doesn't this check the constraints for the first realization (initial guess)? If you mean a check before the evaluation, we do not do that for GA, because we never reject.

Yes, it does check the constraints, but only after the first iteration. This implementation is in def _useRealization(self, info, rlz), which runs once the first batch (the initial population) has been executed, not at the very beginning of the simulation.

aalfonsi commented 1 year ago

If you mean a check before the evaluation, we do not do that for GA, because we never reject.

This is okay, but if violating an explicit constraint makes the underlying model fail (e.g., by sampling in non-physical regions), you get failures from the underlying code. This will in turn cause a failure in the GA, since the valid initial population would be smaller than the required population size (I guess?).

Jimmy-INL commented 1 year ago

Yes, that's right. We discussed before that GA can't handle failed cases. I think we can make the sampler rerun failed cases in the initial population.
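
A rough sketch of what "rerun failed cases in the initial population" could look like: keep drawing replacement points until the population reaches the required size. This is an assumption about the intended behavior, not the RAVEN implementation. `toy_model`, `ModelFailure`, and `build_initial_population` are hypothetical names.

```python
import random

class ModelFailure(Exception):
    """Raised by the toy model when it is run outside its physical region."""

def toy_model(point):
    """Hypothetical model: fails for non-physical inputs (x < 0)."""
    if point["x"] < 0.0:
        raise ModelFailure(f"non-physical input x={point['x']:.3f}")
    return point["x"] ** 2

def build_initial_population(size, draw, model, max_attempts=1000):
    """Evaluate draws until `size` runs succeed, replacing every failed run."""
    population, attempts = [], 0
    while len(population) < size:
        if attempts >= max_attempts:
            raise RuntimeError("could not fill the initial population")
        attempts += 1
        point = draw()
        try:
            objective = model(point)
        except ModelFailure:
            # Failed case: draw a replacement instead of shrinking the population.
            continue
        population.append((point, objective))
    return population, attempts

rng = random.Random(7)
population, attempts = build_initial_population(
    size=8, draw=lambda: {"x": rng.uniform(-1.0, 1.0)}, model=toy_model)
```

Unlike a constraint check before evaluation, this approach still pays for every failed model run, so it addresses the population-size problem but not the wasted simulations.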

wangcj05 commented 1 year ago

@mandd @Jimmy-INL @JunyungKim I have assigned this issue to you. Please let me know if you need further discussion.

alfoa commented 5 months ago

@mandd @Jimmy-INL @JunyungKim is this something you are planning to work on in the near future?

Jimmy-INL commented 5 months ago

Will jump on it as soon as I am back.

