rpoleski / MulensModel

Microlensing Modelling package
https://rpoleski.github.io/MulensModel/
Other
56 stars 15 forks source link

example_16: added color and negative blending prior #122

Closed mjmroz closed 3 months ago

mjmroz commented 8 months ago

For both sources

codecov-commenter commented 8 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (2e55c55) 83.61% compared to head (68a211d) 83.59%. Report is 23 commits behind head on master.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #122 +/- ## ========================================== - Coverage 83.61% 83.59% -0.03% ========================================== Files 52 52 Lines 8271 8313 +42 ========================================== + Hits 6916 6949 +33 - Misses 1355 1364 +9 ``` | [Flag](https://app.codecov.io/gh/rpoleski/MulensModel/pull/122/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Radek+Poleski) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/rpoleski/MulensModel/pull/122/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Radek+Poleski) | `83.59% <ø> (-0.03%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Radek+Poleski#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

rpoleski commented 8 months ago

There are a few things to discuss or modify:

rpoleski commented 7 months ago

@mjmroz If you want to finish this, then I suggest you start with 1st and 4th item on the list above. There is not much work needed.

rpoleski commented 6 months ago

@mjmroz Did you need negative blending flux prior for more than one dataset?

rapoliveira commented 6 months ago

My only sugggestion to this new API is to replace source_number and similar words to source_flux or data_number or data_label, so it's not confused with having more than one source star.

@mjmroz I have some minor suggestions to make the code more readable, we can chat in person if you want.

rpoleski commented 4 months ago

Hi. I see progress, but still _parse_fit_constraints() is way too long. Also _run_flux_checks_ln_prior has to be made shorter. The _sumup_inside_prior() needs a docstring.

rpoleski commented 4 months ago

Sorry, closed by mistake. Reopening.

rpoleski commented 4 months ago

One more aspect that I think should be changed is how the user specifies datasets to be used. Instead of numbers (which are easy to confuse, e.g., when somebody adds a dataset to existing yaml file), I suggest to use labels that can be accessed via MulensData.plot_properties['label'], which defaults to the base of file name.

rpoleski commented 4 months ago
  1. Where is the check that the two datasets don't have the same label?
  2. inside = self...() in _run_flux_checks_ln_prior() are wrong, should be +=.
  3. please use full words for variable names (e.g., index instead of idx).
  4. Example improvement, around line 2260:
    for i in arange(len(self._n_fluxes_per_dataset) - 1):
     inside += self._sumup_inside_prior(fluxes, key, inside, i)

@mjmroz

mjmroz commented 4 months ago

@rpoleski I implemented a check for using different datasets to calculate the color ( #L1525 ). Did you mean that a general check to ensure MulensData.plot_properties['label'] is unambiguous is missing? I can add this as well.

Have you considered adding an option to declare different identifiers for datasets in the input YAML file that are not passed to MulensData.plot_properties['label']? One might need to have more than one dataset on the plot under one label but treat them differently in modeling.

Which way of declaring color prior in the YAML file do you prefer?

#negative_blending_flux_sigma_mag: [20., "OGLE I-band", "OB03235_MOA.txt"]
# Alternative sharp constraint:
# no_negative_blending_flux: True
#color constrains, where color=flux_S_dataset_k/flux_S_dataset_m: [gauss, mu ,sigma, k, m ,]
color : ["gauss", 25., 5. , "OGLE I-band" ,"OB03235_MOA.txt" ]

or

negative_blending_flux_sigma_mag: 20.  
#one can specify for which dataset soft constrain on blending flux should be applied: sigma dataset_number(s)
soft_negative_blending_flux:
    sigma_mag: 20.
    data_sets: ["OGLE I-band","OB03235_MOA.txt" ] #optional
# Alternative sharp constraint:
# no_negative_blending_flux: True
#color constrains, where color=flux_S_dataset_k/flux_S_dataset_m
#color:
#  settings : gauss mu sigma
#  color_from : [k, m]
color :
    settings: gauss 25. 5.  
    color_from : [ "OGLE I-band", "OB03235_MOA.txt"]
rpoleski commented 4 months ago

Line 1525 says settings[3] = self._get_no_of_dataset_by_lable(settings[3]) and I don't see how it would prevent two datasets to have the same label. There should be such a test if the user specifies the color constraint.

Have you considered adding an option to declare different identifiers ...

No. It's not needed at this point.

Preferred formats:

color: gauss 2.0 0.1 "OGLE V-band" "OGLE I-band"
# or:
color: [gauss, 2.0, 0.1, OGLE V-band, "OGLE I-band"]

i.e., IDs instead of numbers and distribution clearly specified. One aspect is missing at this point: do we want flux ratio or magnitude difference? I'll be for the latter because that is what astronomers are used to.

I also suggest:

  1. lable -> label
mjmroz commented 4 months ago

I meant this line 1525 :) : https://github.com/mjmroz/MulensModel/blob/c7c558b688b12abbf5be855e5a8a9372612edb69/examples/example_16/ulens_model_fit.py#L1525 : if settings[3] == settings[4]

As I wrote, it checks if one is trying to calculate color from the same datasets, for example, color = OGLE-I/OGLE-I. In our previous live conversation, when you asked if I have a check to ensure datasets' labels are different, I misunderstood you and thought you were asking about this one, so I said yes. I'm sorry for that. In my code, there is no check if one label is used multiple times in the declaration of MulensData.plot_properties['label']. I will add it.

color: gauss 2.0 0.1 "OGLE V-band" "OGLE I-band"

or:

color: [gauss, 2.0, 0.1, OGLE V-band, "OGLE I-band"]

The first option is now implemented, but it requires importing modules like shlex to deal with quotations inside a str. I will use the latter one.

I'm using fluxes because it was the easiest way in my case. I needed a color prior when I was adding a second source to the model, so I already had results in fluxes for the first source. If you think it won't be the most common case, I will change to magnitudes or add an option for both: color: [gauss, 2.0, 0.1, OGLE V-band, "OGLE I-band", "mag" or "flux"]

rpoleski commented 4 months ago

Current summary of what is missing:

Current format of input with shlex is ok. I've realized the other one (with a list) would require some more checks and there is no need for that now. Astronomers very rarely use flux ratios for colors, so let's switch to magnitudes.

rpoleski commented 3 months ago

Thanks for all these updates and for correcting older spelling errors.

Two final corrections:

I'll merge this PR afterwards. Thanks once more.