Solvers and Models - Githubissues

FriesischScott commented 5 years ago

Regarding the overall structure I think what we should do is something like this:

A model connects several inputs to one or multiple solvers?.
The solvers are executed in order. The most basic solver being a julia function that you pass in. Somewhat like the Mio we have in OpenCossan.
Everything just passes around DataFrames. A Solver would just run whatever it needs to and adds the results to the DataFrame of the inputs. The end result would then collect all inputs and output values for whatever one might want to do with them afterwards.

(Pseudo-)Example:

a = 5
b = Normal()

function add (df::DataFrame) 
    df.c = df.a .+ df.b
    return df
end

solver = Solver(add)

model = Model([a b], ["a" "b"], [solver])

result = run(model, samples=1000)

Something like this, only more complex :thinking:.

This could then be extended to include ObjectiveFunctions for reliability analysis or advanced sampling algorithms, etc.

FriesischScott commented 5 years ago

Based on the latest discussion the cantilever example could look like this:


using UncertaintyQuantification, Distributions, DataFrames

import LinearAlgebra.I

nmc = 10000

l = 1.8
b = 0.12
max_displacement = 0.01

h = Normal()
P = LogNormal()
rho = LogNormal()
E = LogNormal()

corr = Matrix{Float64}(I, 4, 4)
corr[3,2] = 0.8
corr[2,3] = 0.8

rvset = RandomVariableSet([h P rho E], ["h" "P" "rho" "E"], corr)

function inertia(df::DataFrame)
    df.I = df.b .* df.h.^3 / 12
    return df
end

function displacement(df::DataFrame)
    df.w = (df.rho .* 9.81 .* df.b .* df.h .* df.l.^4) ./ (8 .* df.E .* df.I) 
    .+ (df.P .* df.l.^3) ./ (3 .* df.E .* df.I)
    return df
end

solver = MonteCarlo()

model = Model([l b rvset inertia], displacement, solver)

pf = probability_of_failure(model, df -> df.w - max_displacement, nmc)

println("Probability of failure: ", pf)

@AnderGray Thoughts?

FriesischScott commented 5 years ago

Does it make sense to return the DataFrame from inertia and displacement? Do you think it would be cleaner if the functions actually just return their results and the model takes care of merging them into the DataFrame ?

AnderGray commented 5 years ago

Looks good. Exactly, it would be much simpler if we didn't have to do the DataFrame stuff on this end. Also it makes it difficult to see what the inputs to displacement and inertia are. Could we take care of this inside Model?

AnderGray commented 5 years ago

Also we haven't given a name to l and b. Do you know if it's possible to extract the names from the variables themselves? So that the name of x is alway "x". If this could be done automatically, we may not have to pass in the vector of names to the rvset, model or input if we have one

AnderGray commented 5 years ago

In the matlab version, I think solver is an object right? Does this make much sense?

Maybe solver should just be a function that we give model as an input. And for every different simulation type we have a different function. Maybe something like:

model = Model([l b rvset inertia], displacement, solver)

output = MonteCarlo(model, N=1000)

output2 = ImportanceSampling(model, N=1000, etc...)

Each would produce a different copy of the DataFrame

FriesischScott commented 5 years ago

In the matlab version, I think solver is an object right? Does this make much sense?

Maybe solver should just be a function that we give model as an input. And for every different simulation type we have a different function. Maybe something like:
model = Model([l b rvset inertia], displacement, solver)

output = MonteCarlo(model, N=1000)

output2 = ImportanceSampling(model, N=1000, etc...)
Each would produce a different copy of the DataFrame

I think having objects for the solver gives the most flexibility. But maybe it makes more sense to pass them to the actual analysis that is supposed to be done. Like

model = Model(displacement, inputs=[l b rvset inertia])

pf = failure_probability(model, df -> df.w - max_displacement, solver=MonteCarlo(1000))

Then we could make use of Julia's multiple dispatch to get the probability of failure using different algorithms.

FriesischScott commented 5 years ago

Also we haven't given a name to l and b. Do you know if it's possible to extract the names from the variables themselves? So that the name of x is alway "x". If this could be done automatically, we may not have to pass in the vector of names to the rvset, model or input if we have one

I don't think that will work, as the compiler probably does not preserve variable names.

AnderGray commented 5 years ago

Hmm ok. The DataFrame is the reason we need names correct?

What are the properties of solver ?

Does it make sense to have MonteCarlo as an object. It's a method and shouldn't store any information. All the need information from the simulation should be passed out as a DataFrame

AnderGray commented 5 years ago

I think having objects for the solver gives the most flexibility. But maybe it makes more sense to pass them to the actual analysis that is supposed to be done. Like
model = Model(displacement, inputs=[l b rvset inertia])

pf = failure_probability(model, df -> df.w - max_displacement, solver=MonteCarlo(1000))
Then we could make use of Julia's multiple dispatch to get the probability of failure using different algorithms.

Would failure_probability not be a post analysis step, something that comes after the uncertainty propagation? And so should be a method called after MonteCarlo, and that uses its output.

I think you're more familiar with reliability methods than me. I guess for example subset simulation constantly uses models output in a layered MC simulation? In which case it would make sense for the solver=SubSet(1000) to be passed to the object.

FriesischScott commented 5 years ago

Hmm ok. The DataFrame is the reason we need names correct?

What are the properties of solver ?

Does it make sense to have MonteCarlo as an object. It's a method and shouldn't store any information. All the need information from the simulation should be passed out as a DataFrame

I think it might be best to have objects for the solvers to group their individual settings. The advanced algorithms require more than just the number of samples.

FriesischScott commented 5 years ago

I think having objects for the solver gives the most flexibility. But maybe it makes more sense to pass them to the actual analysis that is supposed to be done. Like
model = Model(displacement, inputs=[l b rvset inertia])

pf = failure_probability(model, df -> df.w - max_displacement, solver=MonteCarlo(1000))
Then we could make use of Julia's multiple dispatch to get the probability of failure using different algorithms.
Would failure_probability not be a post analysis step, something that comes after the uncertainty propagation? And so should be a method called after MonteCarlo, and that uses its output.

I think you're more familiar with reliability methods than me. I guess for example subset simulation constantly uses models output in a layered MC simulation? In which case it would make sense for the solver=SubSet(1000) to be passed to the object.

Should the uncertainty propagation itself be a top level function that is run individually in the analysis script? Can you give an example how that would look like?

FriesischScott commented 5 years ago

Alternatively we could wrap everything in custom objects after all and give everything a name property. This would remove the need to keep passing names around as every input already has a name associated.

This would make that kind of clean: https://github.com/rafaqz/Mixers.jl

Last commit was 7 months ago though.

FriesischScott / UncertaintyQuantification.jl

Solvers and Models #3

(Pseudo-)Example: