UT-CHG / BET

Python package for data-consistent stochastic inverse and forward problems.
http://ut-chg.github.io/BET
Other
11 stars 21 forks source link

Sample-based methods #369

Closed mathematicalmichael closed 4 years ago

mathematicalmichael commented 4 years ago

See https://github.com/mathematicalmichael/BET/pull/30 Squash merge

This reverts commit 4db634773ff61b971af199e3acf623d6e4cb59de.

This reverts commit 40a7267c0f994dbf52533017f3e4968ef001216e.

This reverts commit 862f70b87310339027dd5aec97449996d9121a41.

This reverts commit 845e77f3f4ce81857e69b1cdd293993c031ccfa5.

codecov[bot] commented 4 years ago

Codecov Report

:exclamation: No coverage uploaded for pull request base (develop@2eff2b8). Click here to learn what that means. The diff coverage is 77.01%.

Impacted file tree graph

@@            Coverage Diff             @@
##             develop     #369   +/-   ##
==========================================
  Coverage           ?   79.82%           
==========================================
  Files              ?       23           
  Lines              ?     5339           
  Branches           ?        0           
==========================================
  Hits               ?     4262           
  Misses             ?     1077           
  Partials           ?        0
Impacted Files Coverage Δ
bet/util.py 71.6% <ø> (ø)
bet/postProcess/compareP.py 95.41% <100%> (ø)
bet/calculateP/simpleFunP.py 67.45% <100%> (ø)
bet/sampling/adaptiveSampling.py 83.87% <50%> (ø)
bet/sampling/basicSampling.py 80.95% <72.22%> (ø)
bet/sample.py 79.58% <77.24%> (ø)
bet/calculateP/calculateP.py 88.27% <85.71%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2eff2b8...bfa9c34. Read the comment docs.

mathematicalmichael commented 4 years ago

here is a docker image that builds my sample branch (which is currently in-line with my develop) from source on top of fenics: docker pull mathematicalmichael/python:thesis (2.03GB)

smattis commented 4 years ago

@mathematicalmichael FYI. I'm still trying to wrap my head around everything that is happening here, but will try to be done with a full initial review tomorrow.

smattis commented 4 years ago

There needs to be a pretty major restructuring. Right now all of the data consistent and iterated stuff is included directly in sample.py. That module is for defining the basic data structures of sample_set and discretization. The methods of defining distributions on these objects should be here.

However, right now all of the data-consistent and iterated algorithms are part of the discretization object here. These should be in their own modules in the calculateP directory and take discretization objects as inputs. That is what we have done with the other methods. sample.py is way too hard to deal with in this form, and it is not how it is intended to be used. This restructuring will also have a big effect on the tests as well.

mathematicalmichael commented 4 years ago

There needs to be a pretty major restructuring. Right now all of the data consistent and iterated stuff is included directly in sample.py. That module is for defining the basic data structures of sample_set and discretization. The methods of defining distributions on these objects should be here.

However, right now all of the data-consistent and iterated algorithms are part of the discretization object here. These should be in their own modules in the calculateP directory and take discretization objects as inputs. That is what we have done with the other methods. sample.py is way too hard to deal with in this form, and it is not how it is intended to be used. This restructuring will also have a big effect on the tests as well.

I do understand what you're saying, and I made the decision to include solution methods here purposefully. Especially when it came to evaluating the updated density, this architecture dramatically simplifies things. This makes iteration and defining observed densities part of the attributes that belong to the discretization object, rather than having a separate method act on them and adding attributes to the class. When I sketched out how it could possibly work through calculateP, it was far less flexible. It's been a while since I made those design decisions, I'll admit... but I definitely struggled through them before going this route. It was my second go at it, afterall. I more/less completely refactored consistentbayes in making these changes, and that repo did actually segment out the solver into a separate module much like you're proposing.

Do you really think it's that abysmal a change?

smattis commented 4 years ago

There needs to be a pretty major restructuring. Right now all of the data consistent and iterated stuff is included directly in sample.py. That module is for defining the basic data structures of sample_set and discretization. The methods of defining distributions on these objects should be here. However, right now all of the data-consistent and iterated algorithms are part of the discretization object here. These should be in their own modules in the calculateP directory and take discretization objects as inputs. That is what we have done with the other methods. sample.py is way too hard to deal with in this form, and it is not how it is intended to be used. This restructuring will also have a big effect on the tests as well.

I do understand what you're saying, and I made the decision to include solution methods here purposefully. Especially when it came to evaluating the updated density, this architecture dramatically simplifies things. This makes iteration and defining observed densities part of the attributes that belong to the discretization object, rather than having a separate method act on them and adding attributes to the class. When I sketched out how it could possibly work through calculateP, it was far less flexible. It's been a while since I made those design decisions, I'll admit... but I definitely struggled through them before going this route. It was my second go at it, afterall. I more/less completely refactored consistentbayes in making these changes, and that repo did actually segment out the solver into a separate module much like you're proposing.

Do you really think it's that abysmal a change?

I think including everything in the discretization object in sample.py kind of goes against the object-oriented framework that we are using. Why not use the "old" discretization be a base class that all of this new stuff inherits from. So, I'm not saying you have to put everything in functions that act on a discretization object, but in the very least most of this could go in another file that builds upon the discretization base class defined here. Right now, opening up the sample.py file is incredibly daunting. There's simply too much in this file.

smattis commented 4 years ago

@mathematicalmichael I am getting on an airplane soon and will be in Austin working with Troy and Clint this week. If we can arrange a call sometime then that would be good. If you look at my above comment, it might not be as much restructuring as you are assuming.

mathematicalmichael commented 4 years ago

There needs to be a pretty major restructuring. Right now all of the data consistent and iterated stuff is included directly in sample.py. That module is for defining the basic data structures of sample_set and discretization. The methods of defining distributions on these objects should be here. However, right now all of the data-consistent and iterated algorithms are part of the discretization object here. These should be in their own modules in the calculateP directory and take discretization objects as inputs. That is what we have done with the other methods. sample.py is way too hard to deal with in this form, and it is not how it is intended to be used. This restructuring will also have a big effect on the tests as well.

I do understand what you're saying, and I made the decision to include solution methods here purposefully. Especially when it came to evaluating the updated density, this architecture dramatically simplifies things. This makes iteration and defining observed densities part of the attributes that belong to the discretization object, rather than having a separate method act on them and adding attributes to the class. When I sketched out how it could possibly work through calculateP, it was far less flexible. It's been a while since I made those design decisions, I'll admit... but I definitely struggled through them before going this route. It was my second go at it, afterall. I more/less completely refactored consistentbayes in making these changes, and that repo did actually segment out the solver into a separate module much like you're proposing. Do you really think it's that abysmal a change?

I think including everything in the discretization object in sample.py kind of goes against the object-oriented framework that we are using. Why not use the "old" discretization be a base class that all of this new stuff inherits from. So, I'm not saying you have to put everything in functions that act on a discretization object, but in the very least most of this could go in another file that builds upon the discretization base class defined here. Right now, opening up the sample.py file is incredibly daunting. There's simply too much in this file.

that's a good point. Are you thinking that I can keep the distributions attribute of the sample_set class, but move most of the diff of discretization into a separate module? That seems reasonable. Is the idea that all I would have to do is swap my import statements in any new examples... like import bet.sample as samp to import bet.directsample as samp?

smattis commented 4 years ago

There needs to be a pretty major restructuring. Right now all of the data consistent and iterated stuff is included directly in sample.py. That module is for defining the basic data structures of sample_set and discretization. The methods of defining distributions on these objects should be here. However, right now all of the data-consistent and iterated algorithms are part of the discretization object here. These should be in their own modules in the calculateP directory and take discretization objects as inputs. That is what we have done with the other methods. sample.py is way too hard to deal with in this form, and it is not how it is intended to be used. This restructuring will also have a big effect on the tests as well.

I do understand what you're saying, and I made the decision to include solution methods here purposefully. Especially when it came to evaluating the updated density, this architecture dramatically simplifies things. This makes iteration and defining observed densities part of the attributes that belong to the discretization object, rather than having a separate method act on them and adding attributes to the class. When I sketched out how it could possibly work through calculateP, it was far less flexible. It's been a while since I made those design decisions, I'll admit... but I definitely struggled through them before going this route. It was my second go at it, afterall. I more/less completely refactored consistentbayes in making these changes, and that repo did actually segment out the solver into a separate module much like you're proposing. Do you really think it's that abysmal a change?

I think including everything in the discretization object in sample.py kind of goes against the object-oriented framework that we are using. Why not use the "old" discretization be a base class that all of this new stuff inherits from. So, I'm not saying you have to put everything in functions that act on a discretization object, but in the very least most of this could go in another file that builds upon the discretization base class defined here. Right now, opening up the sample.py file is incredibly daunting. There's simply too much in this file.

that's a good point. Are you thinking that I can keep the distributions attribute of the sample_set class, but move most of the diff into a separate module? That seems reasonable. Is the idea that all I would have to do is swap my import statements in any new examples... like import bet.sample as samp to import bet.directsample as samp?

Yes. Something like that sounds great.

mathematicalmichael commented 4 years ago

closing since most of this is implemented in #382 , will build on top of that when it's merged to re-add functionality.