Open cgreenberg opened 8 years ago
get_particle_to_sample ?
Yeah! And we should start using the _NuisancesBase which you made. You can do multiple inheritance in python, so the ISD-style PMI restraints can have class XXXRestraint(PMIRestraint,NuisancesBase)
or whatever
I'd like to take a stab at writing this base class. Another function which maybe should be included is get_restraint_for_rmf
Is get_restraint
preferred over get_restraint_set
?
Any additional methods which would be useful to have for all restraints but don't currently exist? Also, there are often other restraint sets that are only used in custom methods for a child class and in get_output
. Having a dict like output_rs
to which they may be added in child classes would simplify the get_output
method where it often wouldn't need to be extended/reimplemented. Thoughts?
Other relevant issues are #80 and #39
Awesome! I'd say get_restraint_class
is more syntactic since it will return the RestraintSet
object.
This is probably a good start for now. We can always expand it later if we find overlapping function in the child classes.
I don't know what you mean by the other restraint sets. Where is there an example of that?
get_restraint_class
doesn't really make any sense unless you're actually returning a class object. More likely you're returning an instance.
I agree with @benmwebb regarding get_restraint_class
. Given that an instance of RestraintSet
is returned, get_restraint_set
might make the most sense.
An example of other restraint sets is in the CrossLinkingMassSpectrometryRestraint
.
Oh, my bad, meant get_restraint_set
And I see the deal with multiple RestraintSet
objects now. I agree @sdaxen, a dictionary would be useful in that case. Maybe just restraint_sets
, since it's, in general, just a dictionary of all the restraint sets, rather than something specifically for output.
@saltzberg, to push back, I'm thinking child classes would still define such variables as self.XXXrs = ...
. So they still have access to them in their child class methods. The only reason why it would be important to collect them all together into a dict would be so that a base class method would be aware of them. Given the functions discussed above, that would be specifically for output.
Also, is there any reason it doesn't make sense to use the same name for a given RestraintSet
that will be output as will be added to outputs (perhaps plus self.label
)? This rule doesn't currently seem to be followed. e.g., CrossLinkingMassSpectrometryRestraint
uses the name 'likelihood'
for the main RestraintSet
but "CrossLinkingMassSpectrometryRestraint_Data_Score_" + self.label
for the output. What I'm proposing is that in that case, the RestraintSet
would be named "CrossLinkingMassSpectrometryRestraint_Data_Score"
and the self.label
would be appended in the output.
Oh, I wasn't suggesting that the child classes should not do that. Simply that there may be another application for a dictionary of restraint sets outside of output, like evaluation.
Adding the name of the restraint to the RestraintSet
object makes sense. In your example, I'd name it "CrossLinkingMassSpectrometryRestraint_Data"
and then append + "_Score_" + self.label
when outputting the restraint score.
I see. I tend to agree with you then. Restraints should be internally stored in some data structure. Names of RestraintSet
objects should be included in output. Scores from all RestraintSet
objects should be output, with + "_Score_" + self.label
appended to the name. In this case, self.restraint_sets
can for now just be a list
.
This will standardize the output names and restraint names, which is nice. In the process though, it will change some names in output fields (i.e. anything which doesn't have "_Score_"
in the name), which may break some analysis scripts others have written.
I also recommend that you make your changes as a pull request, then we can all discuss any issues with your implementation before it goes live.
@benmwebb a pull request with just the base class and a corresponding test or with all PMI restraints updated to use the base class?
Whichever works for you. Smaller changes are easier to review than bigger ones though.
submitted a pull request
On further thought, it appears that nearly every (if not every) instance of a method being reimplemented in a child of RestraintBase would be to include the value of a nuisance in the output, change which restraint is added to the RMF, change which restraints are added to the model, or change which particles are sampled. These are all decisions that can be made upon initialization. I propose including the following datastructures in the base class:
self.sample_particles = {}
self.output_particles = {}
self.rmf_restraint = self.rs
self.model_restraints = [self.rs]
These will be used in the base class methods. Thoughts?
Makes sense if every subclass needs that functionality. If not, it may make sense to introduce an intermediate class between RestraintBase and the subclasses that need it.
Well, as it stands, all restraints will have this functionality (these methods are currently defined in the new base class). This is really just an implementation detail that prevents implementers from needing to overload functions. I think it will also help in standardising what is output.
I suppose I could add something like RestraintWithNuisancesBase
which has helper methods for creating/getting/outputting nuisances (similar to AtomicCrossLinkMSRestraint
).
I suppose I could add something like
RestraintWithNuisancesBase
There's already a class that was (presumably) intended to do that, pmi::restraints::_NuisancesBase
. That could probably be modified appropriately.
Perhaps. It seems awfully specific to crosslinks. Looks like its only being used in the deprecated IMP.pmi.restraints.crosslinking.ISDCrossLinkMS
. Any idea why it wasn't used in the IMP.pmi.restraints.crosslinking.CrossLinkingMassSpectrometryRestraint
class?
No idea. PMI likes to reinvent the wheel a lot. You may need to just do this again, and burn the _NuisancesBase
class.
Just one thing - please do not touch anything in IMP.pmi.restraints.crosslinking.ISDCrossLinkMS. It is not deprecated at least for me. All my current projects are heavily dependent on it.
@procyon777 no worries, I won't modify any of the code in ISDCrossLinkMS
. If anything, I'll rename the current _NuisancesBase
class and update ISDCrossLinkMS
to inherit it. It'll behave identically.
As a general rule, I don't intend to touch any deprecated restraints unless someone wants them updated to use the new restraint base.
Also, I think it really only makes sense to call set_label(label)
once, before any outputs or particles have been requested, so unless anyone has objections, calling set_label(label)
when the label has already been set will raise a ValueError
. This will make it easier to append the label to any necessary names in subclasses.
Perhaps a good point for discussion is whether as a convention PMI Restraints should be initialized with weights and labels. About half of them have a weight
and label
parameter passed during initialization, while the other half do not. For simplicity, I lean towards them not being initialization parameters.
I think both label and weight should be passed as optional variables at initialization...my reasoning being, you can initialize the restraint with a minimal number of inputs, but it's fully implemented (i.e. it's immediately able to be added to the model/scoring function). The default for weight can be set to 1.0 and label can be something unique to that restraint.
I agree that changing the label could have unintended consequences, and I don’t see a case where someone would need to change it. To underscore that the user should not do this manually we could use __set_label(label) in addition to the ValueError.
On Tue, Oct 11, 2016 at 12:46 AM, Seth Axen notifications@github.com wrote:
Also, I think it really only makes sense to call set_label(label) once, before any outputs or particles have been requested, so unless anyone has objections, calling set_label(label) when the label has already been set will raise a ValueError. This will make it easier to append the label to any necessary names in subclasses.
Perhaps a good point for discussion is whether as a convention PMI Restraints should be initialized with weights and labels. About half of them have a weight and label parameter passed during initialization, while the other half do not.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/salilab/pmi/issues/156#issuecomment-252837451, or mute the thread https://github.com/notifications/unsubscribe-auth/AG7mwDj2B1oEuq0DHgPvicz7QsfCFkeVks5qyz7VgaJpZM4HNmYD .
Daniel Saltzberg Post-doctoral Scholar University of California at San Francisco Lab of Andrej Sali (www.salilab.org)
T: 415.514.4258
Mailing Address: UCSF MC 2552, Mission Bay, Byers Hall 1700 4th Street, Suite 503B San Francisco, CA 94158-2330
saltzberg@salilab.org ds229@bu.edu
On further thought, I agree with you, but for different reasons. If we're enforcing the weight and label are passed at most once before the restraint is used, it makes sense that they be optional initialization inputs. The current default label is empty. I think that makes sense since one might not need a unique string to identify the restraint if they only use one. What sort of unique default label were you thinking of?
Also, the weight should probably be applied to all internal restraint sets, which I don't think is always the case right now. Unless I'm misunderstanding what the weight was meant to accomplish.
Actually, if we're not intending the user to run set_label
ever, there's no reason to use a setter, since setters aren't really "Pythonic". We can use properties.
One thought about labels and weights, I agree that they should, by default, be set automatically at initialization, or optionally passed at initialization. However I do work with simulations of different systems or states where I use different labels/weights for each system/state. So I would prefer that the option to different labels and weights is still available for restraints.
On Tue, Oct 11, 2016 at 8:56 AM, Seth Axen notifications@github.com wrote:
On further thought, I agree with you, but for different reasons. If we're enforcing the weight and label are passed at most once before the restraint is used, it makes sense that they be optional initialization inputs. The current default label is empty. I think that makes sense since one might not need a unique string to identify the restraint if they only use one. What sort of unique default label were you thinking of?
Also, the weight should probably be applied to all internal restraint sets, which I don't think is always the case right now. Unless I'm misunderstanding what the weight was meant to accomplish.
Actually, if we're not intending the user to run set_label ever, there's no reason to use a setter, since setters aren't really "Pythonic". We can use properties http://www.programiz.com/python-programming/property.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/salilab/pmi/issues/156#issuecomment-252960494, or mute the thread https://github.com/notifications/unsubscribe-auth/AMOTA5wdZLgAWQ28LqL83YV0Df-26PAbks5qy7GvgaJpZM4HNmYD .
Ignacia Echeverria Postdoctoral Scholar Department of Bioengineering and Therapeutic Sciences University of California, San Francisco
@iecheverria, so to clarify, you want to be able to modify the restraint label and weight sometime after initialization but before calling any of the methods to get particles or outputs?
@sdaxen, yes, that is correct.
Commit af9a472 handles the weights/labels as discussed above.
See http://chris.beams.io/posts/git-commit/ for tips on how to write a good git commit message.
@benmwebb thanks for the reference!
set_weight() set_label() add_to_model() get_restraint_set() evaluate() get_output() (maybe defined as abstract)