JonasBchrt / snapshot-gnss-algorithms

Algorithms for location estimation based on short GNSS snapshots.
ISC License

bayes_snapper.npy outdated and undocumented #7

Open geissdoerfer opened 6 months ago

geissdoerfer commented 6 months ago

The repository contains a bayes_snapper.npy that apparently holds a Bayesian model for rating satellite quality. Unfortunately, the model seems to have been generated with an outdated scikit-learn version that is no longer compatible with recent Python. Would you be able to provide the data and the method used to generate the model? This would benefit reproducibility. Thanks!
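For context on why the file is version-sensitive: a .npy that stores a Python object (such as a fitted scikit-learn estimator) is serialized via pickle, so loading it requires allow_pickle=True and the exact class definitions of the library version that wrote it. A minimal sketch of the mechanism, using a plain parameter dict rather than the actual (unknown) model object:

```python
import os
import tempfile

import numpy as np

# Plain numeric parameters are version-independent data; an estimator
# object in their place would tie the file to one scikit-learn version.
params = {"mu_in": 44.0, "sigma_in": 3.0}  # hypothetical values

path = os.path.join(tempfile.mkdtemp(), "bayes_snapper_params.npy")
np.save(path, params)  # a dict is stored as a pickled object array

# Object arrays can only be restored with allow_pickle=True;
# .item() unwraps the 0-d object array back into the dict.
restored = np.load(path, allow_pickle=True).item()
```

Exporting the fitted means, standard deviations, and priors as plain numbers like this (or as CSV/JSON) would sidestep the pickle compatibility problem entirely.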

JonasBchrt commented 6 months ago

I will have a look - give me a few days.

Some preliminary notes:

This is from my thesis:

At first, it derives a prior probability P(v_i = 1 | SNR_i) for each satellite observation i ∈ {1, …, N} to be reliable, i.e., to be a so-called inlier, given the associated SNR. The distribution p(SNR_i) of the SNRs is modelled as a Gaussian mixture model with two components, p(SNR_i | v_i = 1) for the inliers and p(SNR_i | v_i = 0) for the outliers. Mean, standard deviation, and prior of each component are fitted to a training dataset. This is done separately for each GNSS, since the GPS L1 signal, the Galileo E1 signal, and the BeiDou B1C signal have different properties and, therefore, differently distributed SNRs. Using the resulting probabilistic models and Bayes' rule, the priors P(v_i = 1 | SNR_i) = p(SNR_i | v_i = 1) P(v_i = 1) / p(SNR_i) for each satellite to be an inlier and P(v_i = 0 | SNR_i) = 1 − P(v_i = 1 | SNR_i) to be an outlier are obtained.
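The Bayes'-rule step above can be sketched with plain NumPy. All numeric parameters below are hypothetical stand-ins for the values fitted to the training data:

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density N(x; mu, sigma)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def inlier_posterior(snr, mu_in, sigma_in, mu_out, sigma_out, p_in):
    """P(v = 1 | SNR) via Bayes' rule for a two-component Gaussian mixture.

    p_in is the mixture prior P(v = 1); the outlier prior is 1 - p_in.
    The denominator is the marginal p(SNR) = sum over both components.
    """
    num = gaussian_pdf(snr, mu_in, sigma_in) * p_in
    den = num + gaussian_pdf(snr, mu_out, sigma_out) * (1.0 - p_in)
    return num / den

# Hypothetical parameters for one GNSS (e.g. GPS L1); each constellation
# would get its own fitted set, as described above.
posterior = inlier_posterior(snr=45.0, mu_in=44.0, sigma_in=3.0,
                             mu_out=30.0, sigma_out=5.0, p_in=0.7)
```

A high SNR near the inlier component's mean yields a posterior close to 1, while an SNR near the outlier mean yields one close to 0; P(v = 0 | SNR) follows as the complement.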

Footnote:

Technically, SNRs are strictly positive while a Gaussian distribution’s support includes all non-positive numbers, too. However, a Gaussian distribution is chosen because the probability contained in the distribution’s tail that extends into the negative numbers is negligibly small for the considered problem in practice. In addition, efficient algorithms for inference exist for Gaussian distributions.

geissdoerfer commented 6 months ago

Thanks for the explanation, it makes sense. A table of the labelled training data (CSV?) and a script to train the model, added to the repository, would be very helpful!
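Since the training data is labelled, fitting the mixture reduces to computing per-label statistics; no EM step is needed. A minimal sketch of such a training script, assuming a CSV with columns gnss, snr, inlier (all column names are guesses about the not-yet-published data):

```python
import csv
import io

import numpy as np

def fit_snr_model(csv_text):
    """Fit per-GNSS inlier/outlier Gaussian parameters from labelled SNRs.

    Expects columns: gnss, snr, inlier (1 = inlier, 0 = outlier).
    Returns {gnss: {mu_in, sigma_in, mu_out, sigma_out, p_in}}, i.e. the
    mean, standard deviation, and prior of each mixture component.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    model = {}
    for gnss in sorted({r["gnss"] for r in rows}):
        snrs = {v: np.array([float(r["snr"]) for r in rows
                             if r["gnss"] == gnss and r["inlier"] == v])
                for v in ("1", "0")}
        model[gnss] = {
            "mu_in": snrs["1"].mean(), "sigma_in": snrs["1"].std(ddof=1),
            "mu_out": snrs["0"].mean(), "sigma_out": snrs["0"].std(ddof=1),
            "p_in": len(snrs["1"]) / (len(snrs["1"]) + len(snrs["0"])),
        }
    return model

# Tiny made-up example; the real table would hold many observations
# per constellation (GPS L1, Galileo E1, BeiDou B1C).
csv_text = """gnss,snr,inlier
GPS,44,1
GPS,46,1
GPS,30,0
GPS,28,0
"""
model = fit_snr_model(csv_text)
```

Storing the resulting plain-number parameters (e.g. via json or np.savez) instead of a pickled estimator would keep the model loadable regardless of the installed scikit-learn version.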