MNGuenther / allesfitter

allesfitter is a convenient wrapper around the packages ellc (light curve and RV models), dynesty (static and dynamic nested sampling) emcee (Markov Chain Monte Carlo sampling) and celerite (Gaussian Process models).
MIT License
60 stars 36 forks source link

[Bug] Missing last companion Total Transit Duration #36

Closed martindevora closed 3 years ago

martindevora commented 3 years ago

Hi, I was able to generate the results directory for a 5-candidates system. I found out a very weird system, which might probably point to False candidates. I wanted to do some checks in the derived data when I realised that the last companion was missing the Total and Full Transit Duration from the ns_derived_table files (latex tables and csv file). I'm attaching the entire allesfitter working directory for you to be able to debug the case.

The link to the allesfitter dir (the file is too large to be attached here): https://drive.google.com/file/d/1AQO_MTCutPOQjlfOHgqfFMYrdeElv0O0/view?usp=sharing

I'm running allesfitter 1.2.2.

Regards.

martindevora commented 3 years ago

I opened the derived pickle file. The values 'SOI_5_T_tra_tot' and 'SOI_5_T_tra_full' contain a first NaN value at the index = 0. Therefore, the piece of code from deriver.derive doing this:

ind_good = []
    for i,name in enumerate(names):
        if (name in derived_samples) and isinstance(derived_samples[name], np.ndarray) and not any(np.isnan(derived_samples[name])) and not all(np.array(derived_samples[name])==0):
            ind_good.append(i)

seems to be filtering the durations for this companion. I guess this is intended, but I'm wondering what is the cause of having NaN values and why we would ignore this derived parameter when only having one NaN value out of more than 30,000 as my current scenario.

Kind regards, Martín.

MNGuenther commented 3 years ago

Hi again @martindevora,

Ha, thanks for pushing allesfitter to its limits and providing all those detailed insights!

Hm... the question is, why is the first entry a NaN? There shouldn't be any NaN in that array if things go normal. It particularly puzzles me that it is the first one, too. This smells like something with the sampler went wrong. Have you had any new insights into this system since then? I'd be especially curious about re-running it from a different random state and checking the outcome.

Generally speaking, our philosophy in allesfitter is to err on the site of caution, given so many possible use cases and things-that-may-go-wrong. Since this is such an edge case, I would put this issue on ice for now.

(Again, thanks for your patience while I was moving from MIT to ESA this summer.)

martindevora commented 3 years ago

Hi again here in some additional thread : )

I don't remember which one, but this happened again two weeks ago with some other system I was exploring too. It seemed that the last candidate (as happened in the case I uploaded to this issue) was missing the duration due to the same problem (first value from the samples is NaN). So I believe this is not an isolated problem. I must recognize that I'm trying to fit very shallow transits, but still, I'm providing the initial priors for the duration.

PS: Good luck with your new position in the ESA!

MNGuenther commented 3 years ago

Hi @martindevora,

Ok, twice is too often :) In v1.2.7 (pip-installable by this evening) I have replaced the not any() with an not all(), and am using np.nanpercentile functions to allow computing the desired medians and lower/upper limits. I tested it on your example above and it ran well.

(I am still not happy that there is a NaN in the first position of the samples, but it is hard to diagnose why that is.)

Please feel free to reopen this if you find a continuing/related issue in v1.2.7 and beyond.