Open acaruana2009 opened 2 years ago
I did think about adding it to the CLI. Right now I'm suggesting the instrument scientists here test it out with
python -m refl1d.check_oversampling --tolerance 0.5 <model name.py>
, or that they add these lines to the end of their model script:
from refl1d.check_oversampling import check_fitproblem
check_fitproblem(problem, plot=False, tolerance=0.4)
This actually modifies the problem
object and leaves it at the optimal oversampling for an automatic "green mode" for users.
That is a good way to place it in the model scripts. I think in general it would be useful to have a CLI flag at some point with a description of what it does. Certainly getting in the habit of using it is a good thing.
Quibble: Might want to rename the function to set_default_oversampling
or update_oversampling
. I read check_oversampling
and I expect it to return a boolean or throw an error.
In general I prefer to use methods to modify objects and have functions just be functions, though I'm not completely pure about it. In this case it would be a method on Experiment
which admittedly is a lot more awkward.
Is there a reason we can't use 'oversampling="auto"' or some such as the default in Experiment so the user doesn't have to do anything? Though I guess we don't want to change existing models, so we should default to None and have them laboriously type it on every model. Not sure how we can transition this smoothly (common problem: np.std(a, ddof=1)
because the first implementation got it wrong).
Before we suggest that users add this blindly to their models (or implementing an 'auto' flag), I'd like to come to some kind of agreement for how to handle models that require oversampling only in a very narrow region. For example, check_oversampling
gives me a peak value of 30-100 for some of my models but only over about 10% of the Q range. I'm concerned that users who put this in naively, or fail to turn off an 'auto' flag, will experience a significant fitting slowdown.
No question this isn't ready for default usage, @hoogerheide. It is clear it needs work and part of that is making it more granular in applying sampling. That said I think users would be better served by applying too much oversampling in the last step of their fit, than too little.
Also the function and module names need work as suggested by @pkienzle but I deferred renaming until the functions themselves become more stable and we get feedback (as you all are providing!)
Any suggestions on module/function renaming prior to upcoming release?
Perhaps name the module simply oversampling
. Instead of check_fitproblem
perhaps analyze_fitproblem
?
It is a really useful tool for setting up the correct oversampling and something we should encourage users to use during fitting. At the moment though, it is a little bit of a bolt on that is slightly tricky to use.
@bmaranville I don't know how you envisaged this being used, but my thoughts were that this initially this could be added to the CLI opts. I also could see the benefit of having it as a check in the GUI - before and after fitting as we discussed before.