sparky8512 / starlink-grpc-tools

Random scripts and other bits for interacting with the SpaceX Starlink user terminal hardware
The Unlicense

How does the bandwidth usage work? #84

Closed cluz closed 6 months ago

cluz commented 1 year ago

Hello,

I'm collecting statistics from several Starlink units with the dish_grpc_text.py script, which polls my units every 30 seconds.

When I sum the daily usage reported by the script metrics, I get around 10% less than what the Starlink website or my router has measured.

Could you give more detail on how the download_usage and upload_usage metrics work? Do I need to increase my sample period?

sparky8512 commented 1 year ago

Assuming you're using one of the history-based data modes, which includes usage, sample period should not matter as long as it is less than the history buffer size, which is 15 minutes.

We don't get any technical detail on what these metrics mean beyond the names in the gRPC protocol, and how the Starlink mobile app describes them. In the case of download_usage and upload_usage, those are computed by totaling the underlying per-second history data (downlink_throughput_bps and uplink_throughput_bps) and converting from bits to bytes. downlink_throughput_bps and uplink_throughput_bps are what populate the Usage portion of the Statistics page in the app.
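The totaling described above can be sketched as follows. This is a minimal illustration, not the actual code in starlink_grpc.py: it assumes each history sample is an average rate in bits per second over exactly one second, so the sum of the samples is a bit count. The function name and the sample values are hypothetical.

```python
def usage_bytes(throughput_bps_samples):
    """Total per-second throughput samples (bits/s) and convert to bytes.

    Assumes each sample covers exactly one second, so summing the
    samples yields a number of bits.
    """
    total_bits = sum(throughput_bps_samples)
    return int(total_bits / 8)


# Hypothetical downlink_throughput_bps samples covering three seconds.
samples = [8_000_000.0, 4_000_000.0, 0.0]
download_usage = usage_bytes(samples)
print(download_usage)
```

Under that one-sample-per-second assumption, any second missing from the history data would simply go uncounted, which is why the polling interval has to stay shorter than the history buffer.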

It's been a while since I looked at those in detail, but from what I recall, they seemed pretty accurate, but there was a peculiarity: One direction seemed to include the periodic ping traffic, but the other did not. Instead, the other direction seemed to include the gRPC traffic from LAN to the dish, but the first direction did not. I don't know if that's still the case, though, or if I was even interpreting it correctly back then. That being said, I would not expect that to account for a 10% discrepancy, unless you normally use very little data, and even then, I would expect the dish stats to overreport, not underreport vs the daily usage data on the Starlink website.

It could account for a discrepancy with your router, though, as your router would likely include both directions of the LAN <-> dish traffic, and that would include the entire history buffer every time the script polls. That's... still not that much, though. For a 30 second polling interval, that would come out to about 45MB per day.
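As a rough check on the scale of that estimate, the arithmetic can be worked backward to a per-poll size. The per-poll figure below is only what the 45MB estimate implies at a 30 second interval (taking MB as decimal megabytes); it is not a measured response size.

```python
SECONDS_PER_DAY = 24 * 60 * 60
POLL_INTERVAL_S = 30           # polling interval from the original question
DAILY_OVERHEAD_BYTES = 45e6    # "about 45MB per day", assumed decimal MB

# Number of times the script polls the dish in one day.
polls_per_day = SECONDS_PER_DAY // POLL_INTERVAL_S

# Transfer size per poll implied by the daily estimate.
bytes_per_poll = DAILY_OVERHEAD_BYTES / polls_per_day

print(polls_per_day)
print(round(bytes_per_poll))
```

That works out to 2880 polls per day and roughly 15.6 kB per poll.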

cluz commented 1 year ago

Hello,

Thanks for your reply, here are for example the differences I observe for 24 hours:

At the same time, our router measures 11.9GB versus 11.8GB indicated by Starlink, but this behavior has been confirmed by other colleagues: Starlink indicates slightly less than what FortiGate or Cisco devices measure.

sparky8512 commented 1 year ago

Hmmm... seems pretty suspicious that the router's reporting would be so close, but the dish-reported number is so off. My best guess is that the router (and website) is including some overhead that the dish is not, but I wouldn't think that would amount to that much of a discrepancy.

I'm going to run some tests with my dish to see how it matches up with what my router reports.

sparky8512 commented 1 year ago

I need to collect a few more days of data to confirm, but so far, it seems my router is underreporting data vs either what the dish is reporting via gRPC or what the Starlink website is reporting. However, the dish is reporting pretty close to the website numbers, just a little below them.

Assuming the website is reporting usage from midnight local time to midnight local time the next day, that it combines download and upload usage, and that it is reporting in binary GB (2^30 bytes), not decimal GB (10^9 bytes), here are the reported numbers from my dish and router:

So my router is underreporting by 8-10%. I'm guessing it is just not accounting correctly or I'm misinterpreting the stats. I could probably track down what's going on with that, but it doesn't seem as important as the dish vs website numbers.

The dish reporting is within 1-2% of the website numbers, which I could easily see being some overhead the dish is not counting or does not know about. For example, the website might be counting download retransmits for packets the dish did not get due to obstructions or other signal issues.

Something else I noticed is that for the weeks I was out of town and the only thing running on my LAN was the router, the website was counting about 0.07GB usage per day. Since the router should have only been doing DHCP and NTP traffic on the WAN during that time, I'm guessing that's mostly traffic from the pings the dish uses for stats collection. I have not confirmed whether or not that shows up in the dish reporting.

Some caveats:

cluz commented 1 year ago

Hello,

Thanks for your reply. Good to hear that you have accurate measurements, so it means that the issue is on my side. Can you confirm which command you use so I can start by using the same setup as you?

sparky8512 commented 1 year ago

The exact command I'm using for this test run is:

python dish_grpc_text.py -t 60 -O usage_stats.csv usage

Then importing the data into a spreadsheet, converting the time from UTC to local time, totaling download_usage and upload_usage for each day, adding those together, then dividing by 1073741824.
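The spreadsheet steps above could also be scripted. The following is a sketch under several assumptions, none of which are confirmed here: the CSV has no header row, the columns are datetime (UTC), samples processed, end counter, download usage, and upload usage in that order, the timestamps parse with `datetime.fromisoformat`, and the usage values are integer byte counts. The function name, sample rows, and fixed UTC-7 offset are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta, timezone

BYTES_PER_BINARY_GB = 1073741824  # 2**30


def daily_usage_gb(rows, local_tz):
    """Total download + upload bytes per local calendar day, in binary GB."""
    totals = defaultdict(int)
    for row in rows:
        # Assumed column order: datetime (UTC), samples, end counter,
        # download usage, upload usage.
        stamp = datetime.fromisoformat(row[0])
        if stamp.tzinfo is None:
            # Timestamps without an offset are treated as UTC.
            stamp = stamp.replace(tzinfo=timezone.utc)
        local_day = stamp.astimezone(local_tz).date()
        totals[local_day] += int(row[3]) + int(row[4])
    return {day: total / BYTES_PER_BINARY_GB
            for day, total in sorted(totals.items())}


# Hypothetical rows in the assumed layout; a real run would read usage_stats.csv.
sample_csv = (
    "2023-10-08 06:59:00,60,1000,536870912,0\n"
    "2023-10-08 07:00:00,60,1060,1073741824,1073741824\n"
)
local_tz = timezone(timedelta(hours=-7))  # hypothetical fixed offset
usage = daily_usage_gb(csv.reader(io.StringIO(sample_csv)), local_tz)
for day, gb in usage.items():
    print(day, round(gb, 2))
```

A fixed offset keeps the sketch self-contained; a real time zone with daylight saving changes would need a proper zone definition instead.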

cluz commented 1 year ago

Thanks, I will do tests this week.

sparky8512 commented 1 year ago

With a few more days of data, my dish still appears to be pretty close to the website reported usage stats. Here are the updated numbers (including days listed in the prior comment):

Of note is that on Oct 9, the dish actually overreported vs website, and by the exact amount (within precision of the website reporting) that it underreported vs website on Oct 8. That may just be a coincidence, but it may also indicate a misalignment between when I was starting my day vs when the website did. Since my numbers should all be within 1 minute of midnight of an NTP-synced clock, and I generally don't have much traffic at that time, it could be that the website is pulling stats at a slightly irregular interval.

Anyway, that concludes my data gathering for now.

cluz commented 1 year ago

Thanks, script is running on my side, I will analyze it by Monday.

cluz commented 1 year ago

Hello,

Sorry for the delay, so here is what I get with your exact script:

Oct 13: website 15.42GB, dish 17.45GB, router 16.85GB
Oct 14: website 15.47GB, dish 16.31GB, router 16.08GB
Oct 15: website 13.89GB, dish 14.60GB, router 13.64GB
Oct 16: website 17.09GB, dish 18.46GB, router 17.85GB
Oct 17: website 19.03GB, dish 20.29GB, router 19.62GB

My CSV has the following fields: date, latency, aaa, bbb, ccc ... ... I have assumed that upload and download are bbb and ccc. Is that right?

sparky8512 commented 1 year ago

I think it's datetime (in UTC), number of samples processed, end sample counter value, download usage, upload usage.

You can output a CSV header using the -H command line option. Detailed explanation of each field can be found in the doc comment at the top of starlink_grpc.py.

sparky8512 commented 6 months ago

I don't think there's anything left to do on this, so I'm going to close it out. Feel free to re-open if you have any further questions on this topic.