Closed JosephKarpinski closed 3 years ago
Here the Md star distances (in parsecs) calculated by deep learning appear to be consistently off across multiple Sloan Digital Sky Survey Apogee DR16 star fields.
Hi @JosephKarpinski , thanks for reporting the issue with all the detail.
The NN distance is very wrong for Md stars because there are not many of them in our training set: our cuts focus mostly on giants. You can check dist_error to see how certain we are about the NN distance. Moreover, we recommend cutting out all stars where the NN logg has more than 0.2 dex uncertainty (i.e. logg_err; Md stars generally have almost 0.4 dex uncertainty on logg from the NN model).
When plotting the Md stars in Orion, the stars pile up at ~400 pc in Gaia parallax distance because Gaia parallaxes are good at such short distances, while the NN distances are all over the place and their error bars are huge.
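The cut recommended above can be sketched in a few lines of NumPy. This is only a minimal illustration: the arrays are toy values, not real APOGEE rows, and the optional fractional cut on dist_error (with its 0.2 threshold) is an assumption added for the example, not something recommended in this thread. Only the 0.2 dex limit on logg_err comes from the comment above.

```python
import numpy as np

def reliable_distance_mask(logg_err, dist, dist_error,
                           max_logg_err=0.2, max_frac_dist_err=None):
    """Boolean mask of stars whose NN distance passes the cuts.

    max_logg_err=0.2 follows the recommendation in this thread.
    max_frac_dist_err is a hypothetical extra cut on dist_error / dist.
    """
    logg_err = np.asarray(logg_err, dtype=float)
    dist = np.asarray(dist, dtype=float)
    dist_error = np.asarray(dist_error, dtype=float)

    # NaN uncertainties fail the cut because the finiteness check is False.
    mask = np.isfinite(logg_err) & (logg_err <= max_logg_err)

    if max_frac_dist_err is not None:
        # Only divide where the distance is usable; everything else fails.
        frac = np.full(dist.shape, np.inf)
        valid = np.isfinite(dist) & (dist > 0) & np.isfinite(dist_error)
        frac[valid] = dist_error[valid] / dist[valid]
        mask &= frac <= max_frac_dist_err
    return mask

# Toy values only (not real data): the second star mimics an Md dwarf
# with ~0.4 dex logg uncertainty and a large distance error.
logg_err = np.array([0.05, 0.40, 0.10, np.nan])
dist = np.array([2000.0, 1800.0, 400.0, 500.0])
dist_error = np.array([140.0, 900.0, 30.0, 50.0])

mask = reliable_distance_mask(logg_err, dist, dist_error, max_frac_dist_err=0.2)
print(mask)
```

Stars that fail the mask are the ones for which the Gaia parallax distance is the better choice.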
Well, here’s the breakdown of stars within Apogee DR16. I’m expecting we will see possibly double those counts in Apogee DR17, December 2021
Here’s the thing. Are the AstroNN distance values questionable for all Apogee dwarf stars, given its focus on giants? What about the large sample of GKd stars? Not sure how this would impact any AstroNN dwarf generated metrics. Looking more into GKd impact …
Yes, astroNN distances of dwarfs are generally questionable, especially for stars with logg uncertainty > 0.2 dex. We focus on giants because our goal is to map the Milky Way out to large distances. This neural network works by predicting the luminosity of stars, and our typical uncertainty of approx. 7% in luminosity translates into approx. 7% distance uncertainty. A neural network that predicts luminosity (and thus distance, given the apparent magnitude) probably can never outperform Gaia, which uses geometric parallax, at such close distances.
Consider also the target selection of APOGEE: dwarfs won't be selected at large distances because they are too dim (so only giants are selected at great distances, and those are what you need anyway if your goal is to map the Milky Way over a large volume). I would therefore always recommend using Gaia parallax to get the distance to dwarfs in APOGEE, even if astroNN produces reasonable distances to dwarfs, since Gaia geometric parallax will always be much better for them.
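To put rough numbers on this argument, here is a small sketch comparing the two fractional distance errors. The ~7% figure is the one quoted above; the 0.03 mas parallax uncertainty is purely an assumed, illustrative value (real Gaia uncertainties depend on magnitude and data release), and the function name is hypothetical. It uses the first-order approximation that the fractional distance error equals the fractional parallax error, which only holds while that error is small.

```python
import numpy as np

NN_FRAC_ERR = 0.07               # approx. 7% distance uncertainty quoted above
ASSUMED_PARALLAX_ERR_MAS = 0.03  # assumption for illustration only

def parallax_fractional_error(distance_pc, parallax_err_mas):
    """First-order fractional distance error of a parallax measurement."""
    parallax_mas = 1000.0 / np.asarray(distance_pc, dtype=float)
    return parallax_err_mas / parallax_mas

for d_pc in (400.0, 2000.0, 5000.0):
    frac = float(parallax_fractional_error(d_pc, ASSUMED_PARALLAX_ERR_MAS))
    better = "parallax" if frac < NN_FRAC_ERR else "NN"
    print(f"{d_pc:6.0f} pc: parallax {frac:.1%} vs NN {NN_FRAC_ERR:.0%} -> {better}")

# Distance at which the two fractional errors are equal under these assumptions.
crossover_pc = 1000.0 * NN_FRAC_ERR / ASSUMED_PARALLAX_ERR_MAS
print(f"crossover at about {crossover_pc:.0f} pc")
```

Under these assumed numbers the parallax wins easily at ~400 pc (the Orion dwarfs) and loses at a few kpc; a different parallax uncertainty moves the crossover but not the overall picture.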
Thank you.
I’ll follow your suggestion when looking at closer targets. It will be interesting to see if Apogee DR17 uses Gaia EDR3 data and how that improves distance values beyond 2k parsecs.
Let’s close the issue.
Best Regards,
Joseph Karpinski
Hi,
Focused on Sloan Digital Sky Survey Apogee DR16 star fields with large numbers of GKg stars. AstroNN “dist” values look closer to Gaia DR2 “Parsec” values at 2K parsecs.
Yes, APOGEE DR17 will be using Gaia eDR3, and the eDR3 parallaxes do improve quite a lot, but the neural network distance is still much better beyond a few kpc.
If you want Gaia eDR3 parallax with APOGEE DR16, you can use my script here to generate a Gaia eDR3 data file row-matched to the APOGEE allstar file: https://github.com/henrysky/astroNN_APOGEE_VAC/blob/master/2_gaia_xmatch.py
System information
Describe the problem
astroNN Gaia DR2 parallax zero-point offset with deep learning
Gaia DR2 calculates it as −0.029 mas. Sloan Digital Sky Survey Apogee calculates it as −0.0523 mas.
Modified parallax = parallax - zero point offset
Data model: apogee_astroNN provides spectro-photometric deep learning parsec distances.
Distance in parsecs to the Orion Nebula for star classes BA, Fd, GKd and GKg pretty much agree. But astroNN appears to produce 4-5 times larger distances for Md and Mg stars.
Parsecs calculated with parallax zero point offset options:
Parsec - no offset
Dist - Apogee Deep Learning
DistApogee - use Apogee offset
DistGaia - use Gaia offset
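For reference, the three parallax-based columns can be reproduced with a short sketch like the one below. It assumes the parallax is given in mas and uses the simple inversion distance = 1000 / parallax; the function and constant names are made up for this example, and only the two offset values come from the description above. Note that plain inversion becomes biased for noisy parallaxes, so this is only reasonable for nearby stars with small fractional parallax errors.

```python
import numpy as np

# Zero-point offsets quoted above, in mas.
GAIA_DR2_ZERO_POINT_MAS = -0.029
APOGEE_ZERO_POINT_MAS = -0.0523

def parallax_to_parsec(parallax_mas, zero_point_mas=0.0):
    """Distance in pc from a parallax in mas, after subtracting a zero point.

    Implements: modified parallax = parallax - zero point offset.
    Non-positive corrected parallaxes have no simple inverse, so they
    are returned as NaN.
    """
    corrected = np.asarray(parallax_mas, dtype=float) - zero_point_mas
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(corrected > 0, 1000.0 / corrected, np.nan)

# Toy parallax of 2.5 mas, i.e. roughly the distance of the Orion Nebula.
parallax_mas = 2.5
parsec = parallax_to_parsec(parallax_mas)                                # "Parsec"
dist_gaia = parallax_to_parsec(parallax_mas, GAIA_DR2_ZERO_POINT_MAS)    # "DistGaia"
dist_apogee = parallax_to_parsec(parallax_mas, APOGEE_ZERO_POINT_MAS)    # "DistApogee"
print(float(parsec), float(dist_gaia), float(dist_apogee))
```

Because both offsets are negative, subtracting them increases the parallax and shortens the distance slightly; at ~400 pc the shift is a few parsecs, far too small to explain a factor of 4-5.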