Tomy2e / livebox-exporter

A Prometheus exporter for Livebox
MIT License

Livebox 2gb/s #3

Closed · mabed-fr closed this issue 1 year ago

mabed-fr commented 1 year ago

Hello,

I just received my Livebox 5 with the 2 Gb/s plan (Livebox Up, 2 Gbit/s, with multiple 1 Gb/s RJ45 ports).

When I run a speed test with two machines plugged into two different RJ45 ports to confirm that my 2 Gb/s is there, I see this:

[screenshot]

That is 220 MB/s, which roughly corresponds to 2 Gb/s.

But on my graphs I hardly exceed 1 Gb/s.

[screenshot]

Is the data not available?

Is there an error?

Mabed

mabed-fr commented 1 year ago

I just did it again with two different machines and two different download servers.

Same conclusion.

I think the Livebox statistics are software-limited; I'm taking a closer look.

Tomy2e commented 1 year ago

Hi,

Your graphs are not easy to read, but it's normal that they don't reach 2 Gb/s. The exporter only exposes the current bandwidth per interface (remember, each RJ45 port of the Livebox is a separate interface: eth0, eth1, etc.).

If you have two interfaces each running at 1 Gb/s, you will not see 2 Gb/s with the Prometheus query you currently use.

To get the total bandwidth, you need to sum the bandwidth usage of each interface (eth1 + eth2 + eth3 + eth4, etc.). Here is the Prometheus query I use to do this:
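
```
sum(interface_rx_mbits{interface=~"5GHz-Private_SSID|2.4GHz-Private_SSID|eth1|eth2|eth3|eth4"})
```

For upload you can do the same on the transmit side (a sketch: it assumes a matching interface_tx_mbits metric exists, and the interface labels need to be adjusted to your own setup):

```
sum(interface_tx_mbits{interface=~"5GHz-Private_SSID|2.4GHz-Private_SSID|eth1|eth2|eth3|eth4"})
```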

Here is an example of a dashboard I built with Grafana that uses this: https://github.com/Tomy2e/livebox-exporter/wiki

Also, your speed test needs to last at least 30 seconds, as download/upload speeds are calculated using 30-second samples.

mabed-fr commented 1 year ago

Hi,

I partially agree. But what about the GPON WAN interface, which seems to be the suitable one?

Summing the eth1-4 interfaces seems wrong to me for measuring the speed to the Internet, because I could just as well be transferring a file from my local NAS.

Another thing: I agree with you that on the eth1-4 interfaces I only see 1 Gb/s each, but not on the WAN interface, which should display the total speed to the Internet.

According to my screenshots, the test lasts more than 30 seconds.

I will repeat the test with double the file size.

Regards

mabed-fr commented 1 year ago

I stand by my position: the WAN_GPON interface (fiber port) should show more than 1 Gb/s.

Tomy2e commented 1 year ago

I've also noticed that the GPON WAN interface has data that seems "random"; that's why I personally discard it.

It's exposed by the Livebox API endpoint that I use and it's processed like the other interfaces so I'm not sure why this happens.

I agree with you, it's unfortunate that we can't see the current bandwidth on the WAN interface; summing the LAN interfaces is just a "hack", and it won't be accurate if you have a lot of local traffic.

I will work on improving this and let you know when this is available.

mabed-fr commented 1 year ago

Thank you for the analysis and for qualifying this as a bug. I can help you on the system/network/Unix side, but unfortunately not at your level of development.

Looking forward to your update.

Tomy2e commented 1 year ago

The "hardest" part is reverse engineering the API of the Livebox but thankfully there is already a lot of resources on the subject:

I implemented two new ways to fetch the current bandwidth usage in this branch and removed the previous method, whose data I didn't know how to process.

The first method I implemented uses the HomeLan.Interface.%s.Stats service. The data is only updated by the Livebox every 30 seconds; this method provides the livebox_interface_* metrics.

The second method uses NeMo.Intf.%s:getNetDevStats. The data is constantly updated but needs to be pulled at least once every 3 to 10 seconds, as the counters are reset once they reach a certain threshold; this method provides the livebox_interface_netdev_* metrics.

In both versions, the WAN interface will be veip0 but this can change depending on the Livebox.

Ultimately, livebox_interface_netdev_* and livebox_interface_* give the same information, just calculated differently. Do not hesitate to try them and give me some feedback on which one you think is better/more accurate, as I would like to keep only one of them in the final version. I find that graphs using livebox_interface_netdev_* can have a lot of spikes (especially on the WAN interface), so they can be a little harder to read compared to livebox_interface_*.
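
One way to make those graphs easier to read without changing the exporter is to smooth the series at query time (a sketch; the metric name livebox_interface_netdev_rx_mbits is written here for illustration based on the prefix above, and veip0 is the WAN interface mentioned above):

```
# Average the last minute of samples to damp short spikes.
avg_over_time(livebox_interface_netdev_rx_mbits{interface="veip0"}[1m])
```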

Also, if we keep the first method, I could just remove the *_mbits metrics, as it should be possible to get the same result by applying Prometheus's rate() function to the *_total metrics.
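
For example, something like this should turn a raw byte counter into Mbit/s (a sketch; substitute the actual *_total metric name, written here as livebox_interface_rx_bytes_total for illustration):

```
# rate() averages the per-second increase over the window and
# handles counter resets automatically; * 8 / 1000000 converts
# bytes per second to Mbit/s.
rate(livebox_interface_rx_bytes_total{interface="veip0"}[1m]) * 8 / 1000000
```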

To try this new version, you can build the Docker image yourself or use this one: ghcr.io/tomy2e/livebox-exporter:dev.

mabed-fr commented 1 year ago

Thank you for the details.

Results of my test after moving from latest to dev and clearing the Prometheus data.

Same test:

- 1× Intel NUC, 1 Gb/s, on eth1
- 1× Dell laptop, 1 Gb/s, on eth2
- aria2 command, run on both machines at the same time: aria2c -x 16 -s 16 https://bouygues.testdebit.info/10G.iso
- exporter interval: 1 s; Grafana polling Prometheus: 1 s

Result:

NUC: b47eb1 | OK | 109 MiB/s | /root/10G.iso

109 MiB/s --> 114 MB/s --> ~912 Mbit/s

LAPTOP: 6c600d | OK | 92 MiB/s | /root/10G.iso

92 MiB/s --> 96 MB/s --> ~770 Mbit/s

Total: ~1.7 Gbit/s

[screenshots]

The best for me is netdev.

What do you think?

Thanks again for the details.

mabed-fr commented 1 year ago

There are still anomalies:

[screenshot]

Tomy2e commented 1 year ago

The netdev metrics are updated more frequently and average the bandwidth over a shorter period of time, so it's normal that they react better and faster. I'm still surprised at how bad the other metrics were. Is it better if you run your tests for 5 minutes? Are the other interfaces accurate, or is it only the veip0 interface?

I also observed the spikes on the veip0 interface; however, I'm not sure it's a bug in the exporter, but rather that the Livebox is sending some garbage data. The data is processed exactly the same way for the other interfaces, yet I don't observe these spikes there. There are other methods I can try to get the current usage on the WAN interface, to see if they work better. Otherwise I will try to update the exporter to ignore these spikes, but I'm not sure it's possible to get very accurate data from the Livebox.
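
In the meantime, a possible query-side workaround for these garbage samples is a median filter, which ignores isolated outliers better than a plain average (again a sketch with an illustrative metric name):

```
# Take the median of the last minute of samples: a single
# garbage spike inside the window no longer shows up.
quantile_over_time(0.5, livebox_interface_netdev_rx_mbits{interface="veip0"}[1m])
```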

mabed-fr commented 1 year ago

Hi,

After many tests, the method available in the dev image does not really reflect the bandwidth.

So I'm going back to latest.

While analyzing, I may have just found something:

WAN_GPON-Multiservices & WAN_GPON give the best values.

I launch one test (1 Gb) and I get this (11:00 am to 11:05 am):

[screenshot]

Purple: WAN_GPON-Multiservices. Indigo: WAN_GPON.

Result of test 1: 925 Mbit/s.

I launch two tests (2 Gb) and I get this (11:10 am to 11:15 am):

[screenshot]

Result: test 1: 925 Mbit/s, test 2: 926 Mbit/s.

Purple: WAN_GPON-Multiservices. Indigo: WAN_GPON.

I'll go back to the dev image and run the same test.

mabed-fr commented 1 year ago

The result is not smooth:

[screenshot]

I think the data exposed by the Livebox is simply not the right data.

I'm going to keep one exporter running with the dev image and another with the latest image, and watch whether firmware updates change anything as they roll out.

I remain available if you want to try something else.

mabed-fr commented 1 year ago

[screenshot]

Tomy2e commented 1 year ago

I didn't have much time to look at this but I just wanted to share this with you. I was not at home for a few days and left the exporter running (dev image). It seems there was a power outage last night, and something surprising happened when power came back:

[screenshot: stats]

After an almost 3-hour power outage, the Livebox restarted itself, and there are no more spikes on the netdev metrics; everything is running smoothly.

mabed-fr commented 1 year ago

I'll try a reboot soon.

mabed-fr commented 1 year ago

Hello,

I shut it down for 30 minutes.

It's great (dev image).

I'll create a cron job to reboot the Livebox soon.

Can you merge this into latest?

Thank you

mabed-fr commented 1 year ago

I set up an automatic restart cron job (4 am) on March 13; I'll get back to you on whether it helps keep the graphs consistent.

Tomy2e commented 1 year ago

Hello, I updated the dev version of the exporter: docker pull ghcr.io/tomy2e/livebox-exporter:dev2

This new version includes a new set of metrics (livebox_wan_*) that we can try; it uses another Livebox endpoint to get WAN metrics. However, I didn't find a solution for getting accurate WAN metrics without rebooting the Livebox.

Also, all of the new metrics that I added are now considered experimental and need to be manually enabled (see the README.md).

You can use the flag `-experimental=livebox_interface_homelan,livebox_interface_netdev,livebox_wan` if you want to enable all experimental metrics.

I'll merge this next week and release a new version.

Thanks for your feedback!

Tomy2e commented 1 year ago

Release v0.2.0 is now available: docker pull ghcr.io/tomy2e/livebox-exporter:v0.2.0

I'm closing the issue as it seems we can't get more accurate data from the Livebox for WAN traffic.