sparky8512 / starlink-grpc-tools

Random scripts and other bits for interacting with the SpaceX Starlink user terminal hardware
The Unlicense

InfluxDB optional parameter to transform true->1 and false->0 ? #26

Closed StephenShamakian closed 3 years ago

StephenShamakian commented 3 years ago

Hello @sparky8512,

I was curious: InfluxDB/Grafana is friendlier to null/numeric values when it comes to graphing and value mappings in Grafana. Would it be possible to add an optional parameter to convert true to 1 and false to 0 on load into InfluxDB?

I can create value mappings in Grafana based on int values but not string values. See this for an example of the issue: https://stackoverflow.com/questions/60669691/boolean-to-text-mapping-in-grafana

Thanks!

sparky8512 commented 3 years ago

That should be easy enough to do. I have a couple other output tweaks I've been wanting to make, too, so I'll try to find some time this weekend to throw those together.

StephenShamakian commented 3 years ago

@sparky8512 That would be outstanding! Thank you! Then I can at least graph the 1s and 0s, and if I need strings I can always use a Grafana value mapping from the int value to a string.

Btw the reason I ask is I'm trying to build a graph that mimics what the actual starlink graph shows for obstruction/no satellites/beta downtime (aka 'other' in the Starlink code). I found this logic in Starlink's JS code (I re-wrote it into pseudo style code to document it, as un-minified JS isn't the easiest thing to read): image

This is the Dashboard I have currently (it's based on your json Grafana export in this GitHub repo). I'm willing to send it via a PR if you want to include it with your existing dashboard json so others can use it also. It just requires two additional Grafana community based panel plugins (radar graph, and Ajax Panel): image

Thanks Again!

sparky8512 commented 3 years ago

Btw the reason I ask is I'm trying to build a graph that mimics what the actual starlink graph shows for obstruction/no satellites/beta downtime (aka 'other' in the Starlink code).

I had a feeling the booleans might be a problem for that, but never dug into it myself.

I found this logic in Starlink's JS code ...

You'll want ping drop >= 1, as it should never actually be > 1. I think the JS code uses >= 0.99 or something like that, but in practice, I've never seen it over 0.99 but not actually 1, so it may just be trying to account for floating point rounding error. Other than that, the logic looks correct to my understanding of what the Starlink app is doing for those stats. I had meant to write up a wiki article that would include this info, but I just haven't been in the right frame of mind for writing lately.
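For reference, here is a minimal Python sketch of the kind of per-sample classification being discussed. Only the `>= 1` ping drop threshold comes from this thread; the function name, the argument names, the category labels, and the order in which the flags are checked are assumptions for illustration, not the Starlink app's actual code.

```python
def classify_sample(ping_drop_rate, scheduled, obstructed):
    """Classify one history sample (hypothetical helper, not from the repo)."""
    # Only samples with total ping loss are treated as an outage.
    if ping_drop_rate < 1.0:
        return "ok"
    # Assumed priority: unscheduled first, then obstructed, then everything else.
    if not scheduled:
        return "no_satellites"
    if obstructed:
        return "obstructed"
    return "other"
```

If the `-N`/`--numeric` style of output is used, the `scheduled` and `obstructed` arguments would arrive as 1/0 instead of true/false, and the same truthiness checks would still work.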

This is the Dashboard I have currently (it's based on your json Grafana export in this GitHub repo).

Actually, the one in the repo is from @neurocis. And there's another one posted on the wiki page. I've never messed around with Grafana myself.

StephenShamakian commented 3 years ago

@sparky8512 Yup, you are 100% correct on the pingDropRate. I just rushed to type something up at the time as a quick reference; it wasn't actually meant to work. But I do need to update it to >= 0.99, as that's exactly what is in the JS code.

Regarding the dashboard, oh cool! Thanks for the info. I'm happy to share mine as well once I'm done tweaking it.

neurocis commented 3 years ago

Once merged I'll bump the docker hub image. Cheers.

sparky8512 commented 3 years ago

Change d603272d90d666b2d3051a4ea3ad0c0f3c0c0e92 adds the option for this.

The new command line option is -N or --numeric, and will affect all boolean values in the output, whether they are single values or sequences, for all output scripts (not just the InfluxDB one). However, there is a caveat, which I noted in the change description:

WARNING: Use or non-use of this option with the database output scripts will change the schema of the data. sqlite doesn't care about that, because it stores booleans as integers, anyway, but InfluxDB will trip an error if you try to record data points with this option to a database that has data points recorded without it, or vice versa.
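To illustrate what a boolean-to-numeric conversion over single values and sequences looks like, here is a small Python sketch. It is not the code from the change above; the `to_numeric` helper and the field names in the sample dictionary are made up for the example.

```python
def to_numeric(value):
    """Convert booleans to 0/1, recursing into lists and tuples (illustrative only)."""
    # bool must be checked explicitly; other types pass through unchanged.
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, (list, tuple)):
        return [to_numeric(item) for item in value]
    return value


# Hypothetical field names, for demonstration only.
fields = {"scheduled": True, "obstructed": [False, True, False], "snr": 9.0}
numeric_fields = {key: to_numeric(val) for key, val in fields.items()}
print(numeric_fields)  # {'scheduled': 1, 'obstructed': [0, 1, 0], 'snr': 9.0}
```

The schema caveat follows directly from this: a field that was first written as a boolean and is later written as an integer has a different type, which is what InfluxDB objects to.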

StephenShamakian commented 3 years ago

Thanks @sparky8512 this will be a huge help in graphing this data! I'm not concerned with the previous dataset after making this change. I can just wipe the Influx DB out and start fresh again.

One thing I noticed, the isScheduled is only logged in the history status group. Do you see any issues logging that status group every couple of seconds into influxDb with the amount of data in it?

@neurocis Once the docker container has been updated I will give the new code a test spin. :)

sparky8512 commented 3 years ago

One thing I noticed, the isScheduled is only logged in the history status group. Do you see any issues logging that status group every couple of seconds into influxDb with the amount of data in it?

You mean the bulk_history group? That's the one with the scheduled field.

It's not really necessary to log that often, because the way the bulk history mode works (by default) is that it keeps track of what the last logged counter value was and will only add new data points, so as long as you log more frequently than every 12 hours, it will be the same amount of data in total. I tested this on an InfluxDB 2.x server and it was about 4MB per day (see my ramblings on issue #5).
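The counter-tracking idea can be sketched roughly as follows in Python. This is a simplified illustration, not the script's actual code: the function name, the argument names, and the handling of a counter that goes backwards are all assumptions.

```python
def new_samples(buffer, current_counter, last_counter):
    """Return only the samples recorded since last_counter.

    buffer holds the most recent samples, oldest first, and
    current_counter is the total number of samples recorded so far.
    """
    if last_counter is None or last_counter > current_counter:
        # First run, or the counter went backwards: take the whole buffer.
        return list(buffer)
    # Never ask for more samples than the buffer still holds.
    count = min(current_counter - last_counter, len(buffer))
    return buffer[len(buffer) - count:]
```

As long as the gap between polls is shorter than the buffer length, `count` stays below `len(buffer)` and no samples are skipped, which is why the total amount of logged data does not depend on the polling interval.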

That being said, you don't want to go that long between polling the history buffer, because it gets lost when the dish reboots. The dish seems to be OK with polling it every few seconds, as that's what the Starlink app does when you sit on the stats page, but it will add some to the network load on your LAN, since the entire 12 hours of history data has to be pulled each time, regardless of how many data points are new. There have also been some performance issues with the script polling very frequently (see issue #22), but I think the major one has been addressed. I personally poll history stuff once per minute.

If you mean the ping_stats group instead, which has the same data but is probably harder to process further because it's already been partially processed, that would be less data the less often it was polled, but the dish and network load would be the same, as it works off the same raw history data as bulk_data.

StephenShamakian commented 3 years ago

@sparky8512

Thanks!

So what I'm hearing is the bulk_history group should only be polled roughly every 1 minute or so? As the other groups I query every 3 seconds (same update speed as the Starlink web app). As I want the same resolution & speed in statistics that the Starlink web app has. So I may have to setup a separate container instance for the history load to run every 1 minute vs. every 3 seconds?

The only place I found the (unprocessed/live) scheduled flag was in the bulk_history group. There was a total count of unscheduled in ping_stats. But that doesn't help for graphing over a time period of when the scheduled/unscheduled periods occurred?

sparky8512 commented 3 years ago

So what I'm hearing is the bulk_history group should only be polled roughly every 1 minute or so?

Ehhhh... You might as well give it a try at every few seconds and just keep an eye out for performance or networking problems for a bit. I tend to be overly cautious about stuff like this.

So I may have to setup a separate container instance for the history load to run every 1 minute vs. every 3 seconds?

FYI: This doesn't necessarily need separate containers, just separate script instances running. But to do that within a single container, you'd need to either run a different start script or start a new script instance in a running container via the docker container exec command. But really, you should just try it with everything running every few seconds first if your goal is to achieve results similar to the Starlink app's.

The only place I found the (unprocessed/live) scheduled flag was in the bulk_history group. There was a total count of unscheduled in ping_stats. But that doesn't help for graphing over a time period of when the scheduled/unscheduled periods occurred?

That's right. The ping_drop group is largely about the "Last X hours" summary stats from the Starlink app's stats page, but that can also be computed from the bulk data if you're recording that. The graph parts of the app's stats page could only be duplicated from the data in the bulk_history group.
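As a rough illustration of computing summary-style stats from recorded bulk samples, here is a short Python sketch. The tuple layout and the output key names are assumptions made for the example and are not guaranteed to match the field names the scripts actually emit.

```python
def summarize(samples):
    """Summarize (ping_drop_rate, scheduled, obstructed) tuples (hypothetical layout)."""
    return {
        "samples": len(samples),
        # Sum of the per-sample drop rates over the window.
        "total_ping_drop": sum(s[0] for s in samples),
        # Number of samples flagged as not scheduled.
        "count_unscheduled": sum(1 for s in samples if not s[1]),
        # Number of samples flagged as obstructed.
        "count_obstructed": sum(1 for s in samples if s[2]),
    }
```

Running this over a sliding time window of the bulk data would give the "Last X hours" style totals, while the per-sample rows themselves remain available for the time-series graphs.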

neurocis commented 3 years ago

@sparky8512 dockerhub image refreshed to latest.