stevengiacalone / triceratops

Tool for Rating Interesting Candidate Exoplanets and Reliability Analysis of Transits Originating from Proximate Stars
https://triceratops.readthedocs.io
MIT License

[Question] Use triceratops probabilities as priors #23

Closed martindevora closed 5 months ago

martindevora commented 1 year ago

Hello, I'm working on a new neural network to vet exoplanet candidates. I was reading the ExoMiner paper (https://ui.adsabs.harvard.edu/abs/2022ApJ...926..120V/abstract), where the authors report improving their results by combining the ExoMiner scores with the prior probabilities computed by Armstrong et al. 2021 using the method described by Bryson & Morton 2017. After looking at those papers, I thought I might also improve my neural network's results by combining them with the TRICERATOPS probabilities. However, I'm not sure whether that would be correct, or what the proper way to choose P(s=1 | TRICERATOPS) and P(s=0 | TRICERATOPS) would be.
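
To make the question concrete, here is a minimal sketch of one generic way to fold an external prior into a classifier score. It assumes the network output can be read as P(s=1 | data) under the class balance of the training set, and re-weights it for a different prior (a standard prior-shift correction). This is not necessarily how ExoMiner combines its scores, and the function name `reweight_score` and all numbers are purely illustrative.

```python
def reweight_score(score, train_prior, new_prior):
    """Re-weight a classifier score for a different class prior.

    score:       network output, read as P(s=1 | data) under train_prior
    train_prior: fraction of positives (planets) in the training set
    new_prior:   external prior P(s=1), e.g. derived from another tool
    All three names are hypothetical and used for illustration only.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    for name, p in (("train_prior", train_prior), ("new_prior", new_prior)):
        if not 0.0 < p < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    # Divide out the training prior, multiply in the new one, renormalize.
    pos = score * new_prior / train_prior
    neg = (1.0 - score) * (1.0 - new_prior) / (1.0 - train_prior)
    return pos / (pos + neg)


# Illustrative values only: a score of 0.8 from a balanced training set,
# combined with a pessimistic external prior of 0.2.
print(round(reweight_score(0.8, 0.5, 0.2), 3))
```

When `new_prior` equals `train_prior` the score is returned unchanged, which is a quick sanity check on the formula.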

Do you have any suggestions or want to chat more about it?

Kind regards and thanks in advance. Martín.

stevengiacalone commented 1 year ago

I think that's an interesting idea. Strictly speaking, I wouldn't consider the TRICERATOPS FPPs to be proper posterior probabilities (despite the word "probability" being in the acronym). They are more like approximate marginal likelihoods. In that way, they differ from the positional probabilities from Bryson & Morton 2017 (which are proper Bayesian posteriors, as far as I understand). I don't think that means TRICERATOPS FPPs can't be used, though. Adding the FPP as an input at some level might yield interesting results.

That said, you may want to be careful to avoid biases that using a TRICERATOPS prior might introduce. My understanding is that ML algorithms are trained using previously confirmed planets, some of which are confirmed because they were validated with TRICERATOPS in the past. So, in a sense, the FPP is implicitly contained in the training set. Introducing FPP again at some other level might influence the results in an undesirable way, although I can't say definitively what the effect would be.

It's difficult for me to make concrete suggestions, since I'm far from an expert on neural networks. But I would be happy to talk more about this if you do end up trying it out!

martindevora commented 1 year ago

Hello Steven,

When you say that the probabilities returned by TRICERATOPS are marginal likelihoods, do you mean they are Bayesian evidences and hence not posteriors? In any case, even though my understanding of Bayesian statistics is limited, I'd think that I could use the TRICERATOPS outputs as priors together with my neural network's outputs to get more informed scores.

I think that using planets validated by TRICERATOPS in the training set would only be problematic if there were too many. How many planets have been validated with it? If there are plenty of them, I could train the network without them in the training set, as I might be doing with those from ExoMiner and some other authors.
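
As a rough sketch of what I mean by training without them, something like the following could filter the training set. The `validated_with` field, the planet names, and the helper `drop_validated_by` are hypothetical; a real catalog would need its own way of recording how each planet was validated.

```python
def drop_validated_by(training_rows, excluded_methods):
    """Return the rows whose validation method is not in excluded_methods.

    Rows are plain dicts; the 'validated_with' key is an assumed field name.
    Rows without that key are kept.
    """
    excluded = {m.lower() for m in excluded_methods}
    return [
        row for row in training_rows
        if (row.get("validated_with") or "").lower() not in excluded
    ]


# Made-up catalog entries, for illustration only.
catalog = [
    {"name": "planet-a", "validated_with": "TRICERATOPS"},
    {"name": "planet-b", "validated_with": "radial velocity"},
    {"name": "planet-c", "validated_with": "ExoMiner"},
    {"name": "planet-d"},  # no validation method recorded
]

kept = drop_validated_by(catalog, ["TRICERATOPS", "ExoMiner"])
print([row["name"] for row in kept])  # ['planet-b', 'planet-d']
```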

If you don't see a critical blocker in my approach, I will try it for some of the candidates that my network points to, compare the outputs to the ones given by Bryson & Morton 2017, and come back later to share the results with you. It might also be an interesting test to compare the positional probabilities with the results returned by TRICERATOPS for a given list of targets, independently of the tests I will do for my neural network.

Kind regards. Martín.

stevengiacalone commented 1 year ago

> When you say that the probabilities returned by TRICERATOPS are marginal likelihoods, do you mean they are Bayesian evidences and hence not posteriors?

Yes, that's correct. There are no model priors used in the calculation.
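
To illustrate the distinction, here is a generic sketch (not TRICERATOPS code) of how per-scenario marginal likelihoods would turn into posteriors once model priors are supplied. The scenario names, the function `scenario_posteriors`, and every number are made up for the example. With no priors given, it simply normalizes the evidences, which amounts to assuming equal priors for all scenarios.

```python
def scenario_posteriors(evidences, priors=None):
    """Convert marginal likelihoods Z_j into posteriors P(S_j | data).

    evidences: dict mapping scenario name -> marginal likelihood Z_j
    priors:    dict mapping scenario name -> model prior; if None,
               every scenario gets the same prior (flat).
    """
    if priors is None:
        priors = {name: 1.0 for name in evidences}
    # Bayes' rule across scenarios: posterior_j is proportional to prior_j * Z_j.
    weighted = {name: evidences[name] * priors[name] for name in evidences}
    total = sum(weighted.values())
    if total <= 0.0:
        raise ValueError("at least one scenario needs positive prior * evidence")
    return {name: w / total for name, w in weighted.items()}


# Illustrative numbers only.
z = {"planet": 3.0, "eclipsing_binary": 1.0}
flat = scenario_posteriors(z)
informed = scenario_posteriors(z, {"planet": 0.1, "eclipsing_binary": 0.9})
print({k: round(v, 3) for k, v in flat.items()})
print({k: round(v, 3) for k, v in informed.items()})
```

The same evidences give quite different posteriors under the two sets of priors, which is why the choice of prior matters when combining these numbers with another score.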

> I think that using planets validated by TRICERATOPS in the training set would only be problematic if there were too many. How many planets have been validated with it?

That's probably true. I don't know the exact number of planets validated with TRICERATOPS, but I would place it somewhere between 25 and 50 right now. So it's only a small fraction of planets on the confirmed planet list.

> If you don't see a critical blocker in my approach, I will try it for some of the candidates that my network points to, compare the outputs to the ones given by Bryson & Morton 2017, and come back later to share the results with you.

Sounds good, I have no objections! Let me know how it turns out!