salilab / pmi

Python Modeling Interface
https://integrativemodeling.org/nightly/doc/ref/namespaceIMP_1_1pmi.html
12 stars 11 forks source link

Functions that could go in the PMI restraint base class #156

Open cgreenberg opened 8 years ago

cgreenberg commented 8 years ago

set_weight() set_label() add_to_model() get_restraint_set() evaluate() get_output() (maybe defined as abstract)

Pellarin commented 8 years ago

get_particle_to_sample ?

cgreenberg commented 8 years ago

Yeah! And we should start using the _NuisancesBase which you made. You can do multiple inheritance in python, so the ISD-style PMI restraints can have class XXXRestraint(PMIRestraint,NuisancesBase) or whatever

sethaxen commented 8 years ago

I'd like to take a stab at writing this base class. Another function which maybe should be included is get_restraint_for_rmf

Is get_restraint preferred over get_restraint_set?

Any additional methods which would be useful to have for all restraints but don't currently exist? Also, there are often other restraint sets that are only used in custom methods for a child class and in get_output. Having a dict like output_rs to which they may be added in child classes would simplify the get_output method where it often wouldn't need to be extended/reimplemented. Thoughts?

sethaxen commented 8 years ago

Other relevant issues are #80 and #39

saltzberg commented 8 years ago

Awesome! I'd say get_restraint_class is more syntactic since it will return the RestraintSet object.

This is probably a good start for now. We can always expand it later if we find overlapping function in the child classes.

I don't know what you mean by the other restraint sets. Where is there an example of that?

benmwebb commented 8 years ago

get_restraint_class doesn't really make any sense unless you're actually returning a class object. More likely you're returning an instance.

sethaxen commented 8 years ago

I agree with @benmwebb regarding get_restraint_class. Given that an instance of RestraintSet is returned, get_restraint_set might make the most sense.

An example of other restraint sets is in the CrossLinkingMassSpectrometryRestraint.

saltzberg commented 8 years ago

Oh, my bad, meant get_restraint_set

saltzberg commented 8 years ago

And I see the deal with multiple RestraintSet objects now. I agree @sdaxen, a dictionary would be useful in that case. Maybe just restraint_sets, since it's, in general, just a dictionary of all the restraint sets, rather than something specifically for output.

sethaxen commented 8 years ago

@saltzberg, to push back, I'm thinking child classes would still define such variables as self.XXXrs = .... So they still have access to them in their child class methods. The only reason why it would be important to collect them all together into a dict would be so that a base class method would be aware of them. Given the functions discussed above, that would be specifically for output.

Also, is there any reason it doesn't make sense to use the same name for a given RestraintSet that will be output as will be added to outputs (perhaps plus self.label)? This rule doesn't currently seem to be followed. e.g., CrossLinkingMassSpectrometryRestraint uses the name 'likelihood' for the main RestraintSet but "CrossLinkingMassSpectrometryRestraint_Data_Score_" + self.label for the output. What I'm proposing is that in that case, the RestraintSet would be named "CrossLinkingMassSpectrometryRestraint_Data_Score" and the self.label would be appended in the output.

saltzberg commented 8 years ago

Oh, I wasn't suggesting that the child classes should not do that. Simply that there may be another application for a dictionary of restraint sets outside of output, like evaluation.

Adding the name of the restraint to the RestraintSet object makes sense. In your example, I'd name it "CrossLinkingMassSpectrometryRestraint_Data" and then append + "_Score_" + self.label when outputting the restraint score.

sethaxen commented 8 years ago

I see. I tend to agree with you then. Restraints should be internally stored in some data structure. Names of RestraintSet objects should be included in output. Scores from all RestraintSet objects should be output, with + "_Score_" + self.label appended to the name. In this case, self.restraint_sets can for now just be a list.

This will standardize the output names and restraint names, which is nice. In the process though, it will change some names in output fields (i.e. anything which doesn't have "_Score_" in the name), which may break some analysis scripts others have written.

benmwebb commented 8 years ago

I also recommend that you make your changes as a pull request, then we can all discuss any issues with your implementation before it goes live.

sethaxen commented 8 years ago

@benmwebb a pull request with just the base class and a corresponding test or with all PMI restraints updated to use the base class?

benmwebb commented 8 years ago

Whichever works for you. Smaller changes are easier to review than bigger ones though.

sethaxen commented 8 years ago

submitted a pull request

sethaxen commented 8 years ago

On further thought, it appears that nearly every (if not every) instance of a method being reimplemented in a child of RestraintBase would be to include the value of a nuisance in the output, change which restraint is added to the RMF, change which restraints are added to the model, or change which particles are sampled. These are all decisions that can be made upon initialization. I propose including the following datastructures in the base class:

self.sample_particles = {}
self.output_particles = {}
self.rmf_restraint = self.rs
self.model_restraints = [self.rs]

These will be used in the base class methods. Thoughts?

benmwebb commented 8 years ago

Makes sense if every subclass needs that functionality. If not, it may make sense to introduce an intermediate class between RestraintBase and the subclasses that need it.

sethaxen commented 8 years ago

Well, as it stands, all restraints will have this functionality (these methods are currently defined in the new base class). This is really just an implementation detail that prevents implementers from needing to overload functions. I think it will also help in standardising what is output.

I suppose I could add something like RestraintWithNuisancesBase which has helper methods for creating/getting/outputting nuisances (similar to AtomicCrossLinkMSRestraint).

benmwebb commented 8 years ago

I suppose I could add something like RestraintWithNuisancesBase

There's already a class that was (presumably) intended to do that, pmi::restraints::_NuisancesBase. That could probably be modified appropriately.

sethaxen commented 8 years ago

Perhaps. It seems awfully specific to crosslinks. Looks like its only being used in the deprecated IMP.pmi.restraints.crosslinking.ISDCrossLinkMS. Any idea why it wasn't used in the IMP.pmi.restraints.crosslinking.CrossLinkingMassSpectrometryRestraint class?

benmwebb commented 8 years ago

No idea. PMI likes to reinvent the wheel a lot. You may need to just do this again, and burn the _NuisancesBase class.

procyon777 commented 8 years ago

Just one thing - please do not touch anything in IMP.pmi.restraints.crosslinking.ISDCrossLinkMS. It is not deprecated at least for me. All my current projects are heavily dependent on it.

sethaxen commented 8 years ago

@procyon777 no worries, I won't modify any of the code in ISDCrossLinkMS. If anything, I'll rename the current _NuisancesBase class and update ISDCrossLinkMS to inherit it. It'll behave identically. As a general rule, I don't intend to touch any deprecated restraints unless someone wants them updated to use the new restraint base.

sethaxen commented 8 years ago

Also, I think it really only makes sense to call set_label(label) once, before any outputs or particles have been requested, so unless anyone has objections, calling set_label(label) when the label has already been set will raise a ValueError. This will make it easier to append the label to any necessary names in subclasses.

Perhaps a good point for discussion is whether as a convention PMI Restraints should be initialized with weights and labels. About half of them have a weight and label parameter passed during initialization, while the other half do not. For simplicity, I lean towards them not being initialization parameters.

saltzberg commented 8 years ago

I think both label and weight should be passed as optional variables at initialization...my reasoning being, you can initialize the restraint with a minimal number of inputs, but it's fully implemented (i.e. it's immediately able to be added to the model/scoring function). The default for weight can be set to 1.0 and label can be something unique to that restraint.

I agree that changing the label could have unintended consequences, and I don’t see a case where someone would need to change it. To underscore that the user should not do this manually we could use __set_label(label) in addition to the ValueError. ​

On Tue, Oct 11, 2016 at 12:46 AM, Seth Axen notifications@github.com wrote:

Also, I think it really only makes sense to call set_label(label) once, before any outputs or particles have been requested, so unless anyone has objections, calling set_label(label) when the label has already been set will raise a ValueError. This will make it easier to append the label to any necessary names in subclasses.

Perhaps a good point for discussion is whether as a convention PMI Restraints should be initialized with weights and labels. About half of them have a weight and label parameter passed during initialization, while the other half do not.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/salilab/pmi/issues/156#issuecomment-252837451, or mute the thread https://github.com/notifications/unsubscribe-auth/AG7mwDj2B1oEuq0DHgPvicz7QsfCFkeVks5qyz7VgaJpZM4HNmYD .

Daniel Saltzberg Post-doctoral Scholar University of California at San Francisco Lab of Andrej Sali (www.salilab.org)

T: 415.514.4258

Mailing Address: UCSF MC 2552, Mission Bay, Byers Hall 1700 4th Street, Suite 503B San Francisco, CA 94158-2330

saltzberg@salilab.org ds229@bu.edu

sethaxen commented 8 years ago

On further thought, I agree with you, but for different reasons. If we're enforcing the weight and label are passed at most once before the restraint is used, it makes sense that they be optional initialization inputs. The current default label is empty. I think that makes sense since one might not need a unique string to identify the restraint if they only use one. What sort of unique default label were you thinking of?

Also, the weight should probably be applied to all internal restraint sets, which I don't think is always the case right now. Unless I'm misunderstanding what the weight was meant to accomplish.

Actually, if we're not intending the user to run set_label ever, there's no reason to use a setter, since setters aren't really "Pythonic". We can use properties.

iecheverria commented 8 years ago

One thought about labels and weights, I agree that they should, by default, be set automatically at initialization, or optionally passed at initialization. However I do work with simulations of different systems or states where I use different labels/weights for each system/state. So I would prefer that the option to different labels and weights is still available for restraints.

On Tue, Oct 11, 2016 at 8:56 AM, Seth Axen notifications@github.com wrote:

On further thought, I agree with you, but for different reasons. If we're enforcing the weight and label are passed at most once before the restraint is used, it makes sense that they be optional initialization inputs. The current default label is empty. I think that makes sense since one might not need a unique string to identify the restraint if they only use one. What sort of unique default label were you thinking of?

Also, the weight should probably be applied to all internal restraint sets, which I don't think is always the case right now. Unless I'm misunderstanding what the weight was meant to accomplish.

Actually, if we're not intending the user to run set_label ever, there's no reason to use a setter, since setters aren't really "Pythonic". We can use properties http://www.programiz.com/python-programming/property.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/salilab/pmi/issues/156#issuecomment-252960494, or mute the thread https://github.com/notifications/unsubscribe-auth/AMOTA5wdZLgAWQ28LqL83YV0Df-26PAbks5qy7GvgaJpZM4HNmYD .


Ignacia Echeverria Postdoctoral Scholar Department of Bioengineering and Therapeutic Sciences University of California, San Francisco

http://salilab.org/~ignacia

sethaxen commented 8 years ago

@iecheverria, so to clarify, you want to be able to modify the restraint label and weight sometime after initialization but before calling any of the methods to get particles or outputs?

iecheverria commented 8 years ago

@sdaxen, yes, that is correct.

sethaxen commented 8 years ago

Commit af9a472 handles the weights/labels as discussed above.

benmwebb commented 8 years ago

See http://chris.beams.io/posts/git-commit/ for tips on how to write a good git commit message.

sethaxen commented 8 years ago

@benmwebb thanks for the reference!