idaholab / raven

RAVEN is a flexible and multi-purpose probabilistic risk analysis, validation and uncertainty quantification, parameter optimization, model reduction and data knowledge-discovering framework.
https://raven.inl.gov/
Apache License 2.0
218 stars 133 forks

Fix defect ensemble model (with Code) and genetic algorithm #2317

Closed alfoa closed 4 months ago

alfoa commented 5 months ago

Pull Request Description

What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)

Closes #2304

What are the significant changes in functionality due to this change request?

The EnsembleModel (with code) uses a different parallelization strategy. Batching mode has now been enabled in that strategy (parallelMode == 2).

I also created issue #2318 to redesign the batching system.


For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

moosebuild commented 4 months ago

Job Test mac on 54f43a4 : invalidated by @wangcj05

RuntimeError: Found incorrect download: libaec. Aborting

alfoa commented 4 months ago

> The changes are good. @alfoa Could you document how you define the job 'prefix' and 'uniqueHandler' inside the EnsembleModel? This will help with future rework of the job submission function.

I can add the info to the issue I created?

moosebuild commented 4 months ago

Job Test qsubs sawtooth on 54f43a4 : invalidated by @wangcj05

unable to remove libraries

moosebuild commented 4 months ago

Job Test qsubs sawtooth on 54f43a4 : invalidated by @alfoa

fetch failure

wangcj05 commented 4 months ago

Pylint complains about a missing docstring:

ravenframework/Models/EnsembleModel.py:1:0: C0114: Missing module docstring (missing-module-docstring)

It seems you have removed the module docstring in EnsembleModel.py. Could you restore it? @alfoa
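For reference, C0114 is reported when a module has no docstring as its first statement. The sketch below is not the text that was committed; the docstring wording and the helper name `has_module_docstring` are made up for illustration. It uses the standard `ast` module to check roughly the same condition pylint looks at:

```python
import ast

# Hypothetical module source: the docstring wording is illustrative only,
# not the actual header of EnsembleModel.py.
SOURCE_WITH_DOCSTRING = '''"""
Module containing the EnsembleModel class (illustrative wording).
"""
import os
'''

# Same module with the docstring removed; this is the shape that triggers C0114.
SOURCE_WITHOUT_DOCSTRING = "import os\n"


def has_module_docstring(source: str) -> bool:
    """Return True if the first statement of the module source is a docstring."""
    return ast.get_docstring(ast.parse(source)) is not None
```

Calling `has_module_docstring` on the first string returns `True` and on the second returns `False`. Comment lines (for example a license header) may precede the docstring without affecting the check.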

alfoa commented 4 months ago

> The pylint complains the docstring:
>
> ravenframework/Models/EnsembleModel.py:1:0: C0114: Missing module docstring (missing-module-docstring)
>
> It seems you have removed the module docstring in the EnsembleModel, could you update it? @alfoa

@wangcj05 done

moosebuild commented 4 months ago

Job Test Ubuntu 20-2 Optional on a522493 : invalidated by @wangcj05

failed at fetch

moosebuild commented 4 months ago

Job Test qsubs sawtooth on a522493 : invalidated by @alfoa

Set python environment taking 6+ hrs?

alfoa commented 4 months ago

> Job Test qsubs sawtooth on a522493 : invalidated by @alfoa
>
> Set python environment taking 6+ hrs?

@wangcj05 There is a problem with the HPC testing (setting up the environment) that is not related to this PR. It does not seem to be spurious: I invalidated the job multiple times and it gets stuck at the same step each time.

alfoa commented 4 months ago

@wangcj05 @joshua-cogliati-inl It seems that devel/main now has a cascade of failures (https://civet.inl.gov/branch/2903/) after the merge of PR #2309 (I don't know if it is related; probably it is not).

alfoa commented 4 months ago

@wangcj05 now it fails because cvxpy is missing?

moosebuild commented 4 months ago

Job Mingw Test on 7ff7659 : invalidated by @alfoa

alfoa commented 4 months ago

@wangcj05 the Mingw tests are failing and I don't see any connection with this MR.

moosebuild commented 4 months ago

Job Mingw Test on 7ff7659 : invalidated by @alfoa

joshua-cogliati-inl commented 4 months ago

> @wangcj05 the Mingw Test are failing and I don't see a correlation with this MR.

How many tests are failing? (And if only a few, which ones?) Thanks.

wangcj05 commented 4 months ago

> > @wangcj05 the Mingw Test are failing and I don't see a correlation with this MR.
>
> How many tests are failing? (And if only a few, which ones?) Thanks.

FYI, around 33 tests failed in the previous MingW test run, many of them optimization tests such as simulated annealing. I have just checked the new run, and it seems to be OK so far. Let's see whether the MingW test passes. @joshua-cogliati-inl

alfoa commented 4 months ago

> > > @wangcj05 the Mingw Test are failing and I don't see a correlation with this MR.
> >
> > How many tests are failing? (And if only a few, which ones?) Thanks.
>
> FYI, there are around 33 tests failed in previous MingW test. Many for optimizations, such as simulated annealing. I have checked the new tests right now, it seems the runs are ok for now. Let's see if the test on Mingw can pass or not. @joshua-cogliati-inl

@wangcj05 @joshua-cogliati-inl unfortunately they are still failing, now in TSA, PostProcessors, DataMining, etc.

wangcj05 commented 4 months ago

@joshua-cogliati-inl Let's try to test it on devel first. Can you pull the changes for cvxpy and test it on your trivial white space branch?

joshua-cogliati-inl commented 4 months ago

> @joshua-cogliati-inl Let's try to test it on devel first. Can you pull the changes for cvxpy and test it on your trivial white space branch?

Hm, which set of changes for cvxpy? d3c7069be99dea3f491e0be1ffa2d2e5bd45ff66 ?

alfoa commented 4 months ago

> > @wangcj05 the Mingw Test are failing and I don't see a correlation with this MR.
>
> How many tests are failing? (And if only a few, which ones?) Thanks.

FAILED:

Timeout tests\framework\PostProcessors\SubdomainBasicStatistics\subdomainSensitivity

Timeout tests\framework\PostProcessors\SubdomainBasicStatistics\subdomainTimeDepStats

Failed tests\framework\PostProcessors\TemporalDataMiningPostProcessor\Clustering\KMeans

Failed tests\framework\PostProcessors\TemporalDataMiningPostProcessor\Clustering\MiniBatchKMeans

Failed tests\framework\PostProcessors\TemporalDataMiningPostProcessor\Clustering\DBSCAN

Failed tests\framework\PostProcessors\TemporalDataMiningPostProcessor\Clustering\MeanShift

Failed tests\framework\PostProcessors\TemporalDataMiningPostProcessor\Clustering\AffinityPropogation

Failed tests\framework\PostProcessors\TemporalDataMiningPostProcessor\Clustering\SpectralClustering

Failed tests\framework\PostProcessors\TSACharacterizer\Basic

Failed tests\framework\PostProcessors\TopologicalPostProcessor\knn

Timeout tests\framework\PostProcessors\TopologicalPostProcessor\relaxed_beta_skeleton

Failed tests\framework\PostProcessors\TopologicalPostProcessor\beta_skeleton

Failed tests\framework\PostProcessors\TSACharacterizer\RWD

Failed tests\framework\PostProcessors\Validation\test_validation_gate_probabilistic_time_dep

Failed tests\framework\PostProcessors\Validation\test_validation_gate_probabilistic

Timeout tests\framework\PostProcessors\Validation\test_validation_gate_pcm

Diff tests\framework\ROM\FeatureSelection\DMDcRFEScoringSubgroup

Failed tests\framework\ROM\FeatureSelection\DMDcRFEScoringOnlyOutput

Failed tests\framework\ROM\FeatureSelection\DMDcRFEApplyClustering

Failed tests\framework\ROM\FeatureSelection\DMDcRFESubgroupCrossCorrelation

Timeout tests\framework\ROM\FeatureSelection\StaticSklearnVarianceThreshold

Failed tests\framework\ROM\FeatureSelection\DMDcVarianceThreshold

Failed tests\framework\ROM\MSR\cosine

Timeout tests\framework\ROM\MSR\biweight

Failed tests\framework\ROM\MSR\Epanechnikov

Failed tests\framework\ROM\MSR\exponential

Timeout tests\framework\ROM\MSR\logistic

Timeout tests\framework\ROM\MSR\Gaussian

Timeout tests\framework\ROM\MSR\Silverman

Failed tests\framework\ROM\MSR\triangular

Failed tests\framework\ROM\MSR\tricube

Failed tests\framework\ROM\MSR\triweight

Timeout tests\framework\ROM\MSR\uniform

alfoa commented 4 months ago

> > @joshua-cogliati-inl Let's try to test it on devel first. Can you pull the changes for cvxpy and test it on your trivial white space branch?
>
> Hm, which set of changes for cvxpy? d3c7069 ?

https://github.com/idaholab/raven/pull/2317/commits/7ff765916d50c843a6a4357b206dc219bbda3bd0

joshua-cogliati-inl commented 4 months ago

> > > @joshua-cogliati-inl Let's try to test it on devel first. Can you pull the changes for cvxpy and test it on your trivial white space branch?
> >
> > Hm, which set of changes for cvxpy? d3c7069 ?
>
> 7ff7659

Thanks: https://github.com/idaholab/raven/pull/2113

moosebuild commented 4 months ago

Job Mingw Test on 7ff7659 : invalidated by @alfoa

wangcj05 commented 4 months ago

Changes are good, PR checklist is good. PR can be merged.