Closed jigpu closed 3 months ago
Hello @jigpu, thank you so much for reporting this bug ! I will get back to you before next Wednesday
Hello @jigpu,
Although I'm not 100% done, I've reached some conclusions on both topics. I will get back to this in the next few days
I've taken a quick look and I think I understand what's happening here.
I've demonstrated that the special SPLICE
comment is correctly handled and does not cause problems.
I did that by simply moving the special comment someplace else.
The problem comes from the line starting with " 4"
right before the very last Epoch in the .24O file.
Our src::observation::record::is_new_epoch
determination method might not be robust enough, because it picked it up as a new Epoch candidate and that very content caused Hifitime to overflow.
Now if we look at this file, we can see that the last line of each Epoch is always empty.
This is corrupt and does not respect the RINEX definition.
" 4"
looks like compressed RINEX (so called CRINEX) content.
My guess is this file was generated by decompressing a RINEX and the tool that did that, in this very context (number of SV, number of observation..) has generated an empty line up until that point, which we kind of dealt with fine, and did not decompress the N-1 line of the N-1 Epoch, which we cannot deal with.
Another explanation is, this weird line is directly related to the "SPLICE" operation, since it comes right before it.
Splice is probably explained in the teqc
documentation.
Now if we move on to the cool stuff.
export RUST_LOG=trace # verbosity
Note that rinex-cli
cannot use Glonass satellite vehicles to resolve positions as of today, as stated in the front page.
This means it is pointless to load the .24g Glonass NAV file. Since we're working with a V2 (old) context here, that means we're only left with GPS vehicles. Fortunately we have GPS in the SP3 file.
After removing N-1 from .24O I'm getting a hard time getting any results from the position solver. Now I understand why. I first decided to run some of these commands to become more familiar with your dataset:
./target/release/rinex-cli \
-f .24o -f .24n -f sp3 \ # load the 3 files at once
-P GPS \ # previous sidenote
-g --clk --orbit --obs
The data is very high quality ⭐⭐⭐
All signals have very good power (see the SSI plot). This is most likely the result of stringent criteria at production level.
.24O sampling is fast (5s) I personnaly have never seen that (I'm used to the 30s standard).
The timeframe is narrow, we only have 2 hours of signal observations. That means we can only resolve using the first 2 hours, and the last 20 hours of the navigation data are pointless.
SP3 sampling is 15' which is the standard but it comes very slow compared to .24O.
In this context, the clock state will be determined by interpolating that data contained in the SP3, but 15' sample rate is slow, so you'll get that warning in the logs.
A geodetic marker is present in one of the headers (see the logs), I'll simply use that as the "truth" we should converge to and not provide custom coordinates.
./target/release/rinex-cli \
-f .24o -f .24n -f sp3 \ # load the 3 files at once
-P GPS \ # previous sidenote
-p --spp
All Epochs fail to resolve because all SV were dropped (not enough candidates to resolve a position
), without further explnation.
I debugged, and it turns out this line never resolves.
This is where we compute the time of signal emission by the satellite.
As of now I have no explanation and will continue debug in the next few days.
All I can say is, this is bad luck: it's the only case where we do not generate a debug message, therefore get no reasons why the vehicles is dropped : I will fix that.
The problem comes from the line starting with
" 4"
right before the very last Epoch in the .24O file. Oursrc::observation::record::is_new_epoch
determination method might not be robust enough, because it picked it up as a new Epoch candidate and that very content caused Hifitime to overflow.Now if we look at this file, we can see that the last line of each Epoch is always empty. This is corrupt and does not respect the RINEX definition.
" 4"
looks like compressed RINEX (so called CRINEX) content. My guess is this file was generated by decompressing a RINEX and the tool that did that, in this very context (number of SV, number of observation..) has generated an empty line up until that point, which we kind of dealt with fine, and did not decompress the N-1 line of the N-1 Epoch, which we cannot deal with.Another explanation is, this weird line is directly related to the "SPLICE" operation, since it comes right before it. Splice is probably explained in the
teqc
documentation.
I poked around trying to find / understand the RINEX 2.11 spec, and I get the feeling that the spurious "4 1" line is both intentional and related to the "SPLICE" operation. In particular, see the APPENDEX section of https://files.igs.org/pub/data/format/rinex211.txt, TABLE A7. There are a few records without an epoch, but a flag of "4", just like the files we're dealing with.
TABLE A2 is not super clear, but it seems to indicate that when the epoch flag is between 2 and 5, the following field contains the "number of special records to follow". Jumping back to TABLE A7, in every case where the flag is between 2 and 5, it looks like the number of special records is just the number of additional lines that need to be read. In the case of "4 N", this appears to mean "read N header lines" --- and indeed, this seems like it can be used as a way of signaling that N comment lines follow. I imagine that any other kind of header line could be seen, but comment is certainly one type of header line.
This explanation also nicely fits the splices I've observed. The "4 1" in the downloaded file occurs immediately before a one-line comment. And the "4 11" in the second code block of my report occurs immediately before an 11-line comment.
Now if we move on to the cool stuff.
[ snipped for brevity ]
Neat! I'll need to poke around a little more with some of the poorer-quality data to see what it looks like in comparison!
.24O sampling is fast (5s) I personnaly have never seen that (I'm used to the 30s standard).
The CORS map indicates that most of my local reference stations seem to be sampling at 5 and 15 second intervals. There's even a few that it says run at 1 second, especially up in Washington.
SP3 sampling is 15' which is the standard but it comes very slow compared to .24O.
In this context, the clock state will be determined by interpolating that data contained in the SP3, but 15' sample rate is slow, so you'll get that warning in the logs.
That's an interesting tidbit to keep in mind...
A geodetic marker is present in one of the headers (see the logs), I'll simply use that as the "truth" we should converge to and not provide custom coordinates.
The RINEX files from my phone don't have this data in the header, so I've had to provide custom coordinates. I'm just wondering what the effect of providing "wrong" coordinates is. Is it like something like: your custom coordinates were off by X feet east, so the solution is going to be off by X feet east? Or more like as long as your coordinates are not "close enough" to correct, then we won't be able to resolve a solution?
All Epochs fail to resolve because all SV were dropped (
not enough candidates to resolve a position
), without further explnation.I debugged, and it turns out this line never resolves. This is where we compute the time of signal emission by the satellite. As of now I have no explanation and will continue debug in the next few days. All I can say is, this is bad luck: it's the only case where we do not generate a debug message, therefore get no reasons why the vehicles is dropped : I will fix that.
Best of luck with the debugging!
The CORS map indicates that most of my local reference stations seem to be sampling at 5 and 15 second intervals. There's even a few that it says run at 1 second, especially up in Washington.
This helps reduce the portion of a day you can analyze and still hope for good centimetric results.
I can't say for PPP since we don't have that capability yet, but it will obviously converge faster, at the expense of more computation load. You can see that our SPP++ solver consumes 1/4 of a day of high quality data sampled at 30 sec, to converge to cm/s velocity (test_resources/CRNX/V3/ESBCDNK
and associated data for that day):
BTW the quality of that data is lower than yours
That's an interesting tidbit to keep in mind...
Yeah, it's quite tedious to have to deal with all these files, part of the problem is the complexity of the task. It also takes a while to truly understand what theyrepresent. This particular problem comes from the fact that typical SP3 files are published with 15' sampling which is too slow for ultra precise clock interpolation. To "answer" that problem, they introduced so called Clock RINEX - that we support now - which provide SV clock states sampled almost at the same time as signal observation. Conclusion: people aiming for ultra precise results usually provide one .24o, one .24n, one SP3 and one CLK file to rtklib
The RINEX files from my phone don't have this data in the header
That is normal because this information comes from a geodetic survey, which is quite a complex task and is far from a "real time" thing. I'm interested in understanding how you managed to have your phone spit out RINEX though :) but that should probably come in another discussion, or even could an new entry in our Wiki ?
I'm just wondering what the effect of providing "wrong" coordinates is. Is it like something like: your custom coordinates were off by X feet east, so the solution is going to be off by X feet east? Or more like as long as your coordinates are not "close enough" to correct, then we won't be able to resolve a solution?
Currently the coordinates you can input are considered the "truth", that means the result of the station (laboratory) geodetic survey and we use that to compare where we converge. As you've probably guessed, this only applies to stationnary and high precision usecases. So currently, our tool allows providing a geodetic survey yourself, or provide a "better" truth position. Providing "wrong" coordinates here will just increase the error we estimate as we do not converge towards that position, it will not cause the solver to panic.
Eventually, our tool should allow a secondary argument to provide the "initial" (user guess) position.
This is the T0 initial position fed to the solver. Good PPP solvers like RTKlib allow that.
Our current behavior is to consume a first result as the initial guess. You can see this as the best thing to do without user input.
The consequence is slower convergence (several Epochs consumed in that process).
For people aiming at both precise and fast convergence (like almost real time PPP): this is bothering.
But that scenario requires the initial guess to be good (like a geodetic survey result).
Another solution is to pass (x0=0, y0=0, z0=0) center of the Earth as the initial guess, which is totally fine because the accumulator will converge fast from that ""very wrong"" initial guess. This is a possibility that rtklib also gives. The consequence of that, is you get very "wrong" initial phases in the results. Our current behavior, as you can see from the plots, obviously have much better initial phases.
Will keep you updated on both the SPLICE issue and being able to resolve with your context
@jigpu
If you're willing to update your branch to the latest main
content, I just pushed some improvements and fixed the issue you were experiencing. It was simply a unit/scaling problem when we interpolated time from SP3.
As expected from the quality of the signal sampling, the solutions are very good. Most certainly the best PVT solutions we have resolved with this tool so far. So I can only say thank you for sharing, and helping make this tool better. Stay tuned for future improvements (PPP and other simplifications) that will substantially improve the solver.
I have to tweak the settings a little bit, because 2 hours of data (8 SP3 data points), is incompatible with our default interpolation order of 11. I used this command line with this exact configuration:
cat rinex-cli/config/rtk/gpst_10sv_basic.json
{
"timescale": "GPST",
"interp_order": 6,
"max_sv": 10,
"solver": {
"gdop_threshold": 3.0
},
"modeling": {
"sv_clock_bias": true,
"sv_total_group_delay": true,
"relativistic_clock_bias": true,
"tropo_delay": true,
"iono_delay": true,
"earth_rotation": true
}
}
./target/release/rinex-cli \
-f /tmp/mcso0820.24o \
-f /tmp/mcso0820.24n \
-f /tmp/igr23065.sp3 \
-P GPS \ # as previously explained
-p --spp \ # PPP not available yet
-c rinex-cli/config/rtk/gpst_10sv_basic.json # force to custom settings
Such a narrow timeframe, yet the solver converges to great results. If you look at my previous post, we reached slightly poorer results by consuming 12 hours of data.
Side note: The position solver section of the wiki pages is in the process of being written, so don't worry about all of this at the moment. I'm in the process of concluding our pending topics (mainly Rinex fops) prior publishing a new release, don't worry about the couple of warnings that introduced by the latest contributions
Note to self: it's a bummer the SV elevation determination does not seem to work well; I really need to fix that. If it did, it will be a very simple yet efficient mean to get even better results.
Nice! I built the changes locally and can confirm the changes :+1:
I've also retried the 24o from my phone and was able to get a solution there as well. The solution is around 100m off with a whole lot of drift, but I don't suppose I should be too surprised given the source and quality of data :laughing: Its just 10 minutes of my phone sampling once-per-second sitting on an electrical box with half the sky under some degree of canopy. (EDIT: RTKLIB does quite a bit better with its solution of the same files in single-point mode and restricted to just GPS, clustering most of its points within ~10m of the expected spot. Still neat to finally see rinex-cli coming up with its own solution though and know that there are improvements yet to come! :smile: )
BTW, I'm working on a patch to better deal with the "4 1" issue, hopefully will have something to review in the coming days...
I'm interested in understanding how you managed to have your phone spit out RINEX though :) but that should probably come in another discussion, or even could an new entry in our Wiki ?
I'll give you some breadcrumbs to start with :slightly_smiling_face: Google introduced a GnssMeasurement API in Android 7, making support for the API optional (depending on hardware) through Android 9. Beginning with Android 10 hardware support for the API was supposed to be mandatory. While I'm not aware of any apps that actually use the API for anything "normal", Google has released a GnssLogger App which can be used to dump the data in either a Google-specific CSV format or RINEX. (The GPSTest app is also capable of dumping measurements as CSV for conversion to RINEX, with an open issue to add native RINEX support).
Google has more info at https://developer.android.com/develop/sensors-and-location/sensors/gnss for anyone interested.
This is so cool!
BTW, I'm working on a patch to better deal with the "4 1" issue, hopefully will have something to review in the coming days...
great to hear ! I'd be glad to review it and help you out.
EDIT: RTKLIB does quite a bit better with its solution of the same files in single-point mode and restricted to just GPS, clustering most of its points within ~10m of the expected spot
very interesting to hear. I'm not surprized. I presume RTKLib solely works on the so called "PPP" signal combination which is supposed to give a huge improvement almost "right away". I still have not introduced it yet, but in the end, I think pushing our SPP solver to the limit, with now the LSQ filter which is very close to a Kalman filter, was the best strategy. As you can see, we're still figuring a couple of scenarios that did not work, it helped stabilize the whole thing. Before introducing the combination, I should really make the effort to have the elevation and SNR calculations corrected. The elevation we estimate are sometimes totally wrong, it prevents taking this criteria into account (at least correctly).
Google has more info at https://developer.android.com/develop/sensors-and-location/sensors/gnss for anyone interested.
Thank you I'll look into this. Maybe in the end we could have an entry about this topic in our wiki pages
Hello @jigpu,
a quick update on what's new and rerunning tests on your data. apriori position is no longer required: we can perform surveys from scratch. apriori knowledge of the final solution simply enables the comparison plots. Refer to the RINEX Wiki for more information
Side note: SP3 and/or CLK RINEX can now be omitted in the solving process. We call this "BRDC" for Radio based navigation. Is is most likely incompatible with centimetric positioning, but we'll later see how we perform. It allows surveying with BeiDou and QZSS constellations from now on.
Rerunning on the data attached to this post with latest rinex-cli
:
./target/release/rinex-cli \
-f WORKSPACE/mcso0820.24o \
-f WORKSPACE/mcso0820.24n \
-f WORKSPACE/igr23065.sp3 \
-P GPS -p -c /rinex-cli/config/rtk/gpst_cpp_kf.json
Solution | x(err) | y(err) | z(err) |
---|---|---|---|
Final (15' average) | 6m | 6m | 69mm |
We're limited to GPS NAV. CPP strategy (best strategy as of today) can deploy since C1 (L1 civilian) and P2 (L2 precise) signals were sampled. With interpolation order 17 I have a little less than 15' of averaging, and achieve good results. Most likely due to the L2 precise signal, I don't think I have ever run a strategy that involved a precise signal yet. This setup is on par with Galileo performances. It's too bad we only have 2 hours of data, I can only imagine what it be with more observations.
For comparison, surveying of ESBJERG station with Galileo and 24h of observation:
./target/release/rinex-cli \
-f test_resources/CRNX/V3/ESBC00DNK_R_20201770000_01D_30S_MO.crx.gz \
-f test_resources/NAV/V3/ESBC00DNK_R_20201770000_01D_MN.rnx.gz \
-f test_resources/CLK/V3/GRG0MGXFIN_20201770000_01D_30S_CLK.CLK.gz \
-f test_resources/SP3/GRG0MGXFIN_20201770000_01D_15M_ORB.SP3.gz \
-P GAL -P C1C,C5Q -p -c-c rinex-cli/config/rtk/gpst_cpp_kf.json
Solution | x(err) | y(err) | z(err) |
---|---|---|---|
Final (20h average) | -260mm | 455mm | 163mm |
I've been experimenting with the position solving function of rinex-cli and have bumped into a problem when processing some Rinex files. In particular, I get the following panic from Rust:
Reproduction Steps
./target/debug/rinex-cli -f tmp/mcso0820.24o -f tmp/mcso0820.24n -f tmp/igr23065.sp3 positioning --spp
Notes
I've noticed that the issue seems to be with the very end of the 24o file. If I trim a few records from the end then no crash occurs. I'm particularly suspicious of the line immediately before the splice. I don't know anything about the Rinex format, but that line looks different than anything else in the vicinity. If I remove that single line then no crash occurs. Here's the section in question:
And an except from a different 24o file with the same problem: