vortex-5 / ddwrt-bwmon

An Individual Bandwidth Monitor For DD-WRT

Feature: Historical Bandwidth/Throughput Data #28

Closed: boxintense closed this issue 7 years ago

boxintense commented 8 years ago

It would be great if we could see historical bandwidth data for each device in a graph (something like an rrdtool graph).

vortex-5 commented 8 years ago

Historical data is complicated, mostly because of how much more data we'd have to log in the usage stats. We've deliberately limited what we log so that lower-end routers can handle it.

If we were logging bandwidth stats historically we'd basically have to keep a much larger database.

This need is better met by something like YAMon, which is better at bookkeeping and is available among the DD-WRT contributions.

It is more or less beyond the scope of this project, although you're welcome to fork it and add this if you wish. The feature was considered, but it isn't planned for the foreseeable future.

The emphasis has always been on day-to-day use with minimal performance impact rather than deep analysis, since other tools already seem to satisfy that need.

boxintense commented 8 years ago

I see, that's perfectly understandable. After rethinking it, this solution might not be a good idea after all if it overloads lower-end routers in the process.

YAMon is a good tool, but it has an additional setup and configuration workflow. I want a simplified (configuration-free) setup and an interface that just works, and ddwrt-bwmon gives me exactly that.

Just in case you decide to implement this feature: in our datacenter, rrdtool saves the data to a database or JSON file every 5 minutes (the average usage over each 5-minute window), so for a single device there are 288 data points per day, or 8,640 per month. I'm not sure how big the repercussions of a system like this would be on the router's load, but if it's configured to do the calculation client-side (in JS), I think it might be possible without overloading the router.

vortex-5 commented 8 years ago

Well, the main issue is that the bandwidth speeds are calculated in JS, but the actual data-usage tracking is done on the router, since the browser is not connected most of the time. In fact, if you are using lighttpd, the load goes up slightly when you are actively viewing the page and drops when you are not.

When running under lighttpd, the router writes one log entry every 60 seconds; for non-lighttpd use it is once every 10 seconds.

Currently, for non-lighttpd users, two samples are kept in memory: the current usage stats and the usage stats from 10 seconds ago (these are used to calculate the bandwidth rates, since we just look at the difference between the two time points).
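
Roughly speaking, that difference-of-two-samples calculation looks like the sketch below (the names are illustrative, not the actual identifiers used in usage_stats.js):

```javascript
// Illustrative sketch of the two-sample rate calculation.
const SAMPLE_INTERVAL_SEC = 10;

// Each sample maps a MAC address to the cumulative bytes counted so far.
const previousSample = { "AA:BB:CC:DD:EE:FF": 1200000 };
const currentSample  = { "AA:BB:CC:DD:EE:FF": 1450000 };

function ratesBetween(prev, curr, intervalSec) {
  const rates = {};
  for (const mac of Object.keys(curr)) {
    const delta = curr[mac] - (prev[mac] || 0);
    rates[mac] = delta / intervalSec; // bytes per second
  }
  return rates;
}

console.log(ratesBetween(previousSample, currentSample, SAMPLE_INTERVAL_SEC));
// { "AA:BB:CC:DD:EE:FF": 25000 }  -> roughly 25 KB/s
```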

So if you have 40 users and the numbers are very high, that can get to be about 40 KB × 2 in the worst-case scenario.

There are reasons for doing this in RAM: certain users choose to go against my recommendation and install this to the router's internal storage, and the internal flash chips have limited write cycles. To accommodate that, we only save to persistent storage once every 15 minutes. This means that across a reboot you lose at most 15 minutes of bandwidth data, but the alternative is bricking your router early (most router flash chips don't have wear leveling built in, so you end up prematurely killing them with frequent overwrites).

That said, if you wanted to keep 8,640 samples at around 40 KB per sample, that's 345.6 MB for a month of logged data. The concern in this case isn't necessarily that the file size is large; you can reduce the maximum logging window to, say, a few hours. It's more that the router has to process the samples: you're now asking the router to maintain a database of usage, and it has to string-process the log line by line to add new samples, which gets slower the longer it runs. And then you probably want to timestamp things so you can figure out what time of day each sample was logged and see trends over time.
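
As a rough sketch of that storage math (assuming a single ~40 KB snapshot per 5-minute point; keeping two copies per point would roughly double these figures):

```javascript
// Back-of-the-envelope storage estimate: ~40 KB per snapshot,
// one snapshot every 5 minutes. Both figures are approximations.
const SNAPSHOT_KB = 40;
const POINTS_PER_DAY = (24 * 60) / 5; // 288

for (const hours of [1, 4, 24 * 30]) {
  const points = (hours / 24) * POINTS_PER_DAY;
  const mb = (points * SNAPSHOT_KB) / 1000;
  console.log(`${hours} h -> ${points} points, ~${mb.toFixed(1)} MB`);
}
// 1 h   ->   12 points, ~0.5 MB
// 4 h   ->   48 points, ~1.9 MB
// 720 h -> 8640 points, ~345.6 MB
```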

Finally, if you save individual files to the drive, that's easier on the router, but now you have a folder with hundreds to thousands of files. If you want to show them in the front-end UI, the browser has to make 100 requests to fetch all those individual files, which is a performance penalty on the router, since it's serving 100 file requests for every 10-second update while the browser is open.

For long-term database logging, the only feasible solution would be to store individual files/entries so the router isn't spending all of its time appending to the end of a very long string, which gets costly as it grows. The main implication is that fetching that data becomes expensive, since separate requests have to be made. On lighttpd you can in theory concatenate all the files together with a PHP script, which would allow a single request to the router, but the downside is that the script would probably do all of this concatenation in memory before it can send it to you, and in that situation your router would probably run out of RAM.

So yeah, it's a complicated problem. Potentially the easiest solution, which is beyond the scope of this project, is for the script that calculates the usage stats to "push" the data to another server on your network, or save it to a database running on your LAN rather than on the router itself. That way you offload the storage and RAM requirements onto the more powerful machine instead of the router. Web browsers can't do this on their own, so it would have to be a file server, an SQL server, or some other server on your network that can take the data and hold it for you. Later on, that machine can handle the complex queries and analytics and render the graphs.
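
Purely as an illustration (this is not part of the project, and the endpoint path and payload format are made up), a LAN-side "holding server" could be as small as a Node.js script like this:

```javascript
// Hypothetical LAN-side "holding server": accepts pushed usage snapshots
// from the router and appends them to a local log file.
const http = require("http");
const fs = require("fs");

http.createServer((req, res) => {
  if (req.method === "POST" && req.url === "/bwmon/push") {
    let body = "";
    req.on("data", chunk => { body += chunk; });
    req.on("end", () => {
      // One JSON line per snapshot, timestamped on arrival.
      const line = JSON.stringify({ ts: Date.now(), data: body }) + "\n";
      fs.appendFile("usage_history.log", line, () => {});
      res.writeHead(204);
      res.end();
    });
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(8080);
```

The router-side script would then only need to push its existing usage snapshot to that machine on each logging pass.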

Routers these days have about 32 MB of RAM at the minimum, 256 MB is available on the high end, and there are outliers like the Synology systems with 4 GB of RAM. On a 32 MB router you may have only 17-24 MB free, so you do need to operate as though memory is tight.

vortex-5 commented 8 years ago

rrdtool sounds like it may already work in a similar fashion to what would be required. Potentially, someone more experienced with that sort of thing could create a fork where the script periodically sends the stats out to a "holding server", so you can perform the analysis you've described.

vortex-5 commented 8 years ago

Sorry about the giant wall of text; this is more of a thought process / planning doc at this point, should this feature eventually be undertaken.

boxintense commented 8 years ago

Instead of saving it in multiple files, I'm thinking of a simple database solution like SQLite, where all data is kept in a single database file that is then accessed via PHP (lighttpd). With this, only one file needs to be read by the router, and there is no per-request overhead from fetching many files.

But it's true: if something like 40 KB per sample × 8,640 samples ≈ 345.6 MB worth of statistics data has to be handled, then the router itself would most likely be overloaded during the SQLite access (even if the router has swap available), even without doing any server-side (router-side) calculation, since that data would most likely end up in the router's RAM during the read phase.

One thing that is probably doable is limiting the data to 12 or 24 hours (144/288 data points), which would result in somewhere around 11.5/23 MB worth of statistics data. I personally don't think much more data than that is required for common router usage. Sometimes we only need to see the past few minutes, or data we just missed because the 10-second refresh interval didn't give us enough time to look at it. Scrap that: after some thought, I think even 1 hour worth of historical data is more than enough :)

vortex-5 commented 8 years ago

In the current version you can change the 10 seconds to whatever interval you want if you'd like an extended interval.

If you are running lighttpd right now, the system takes 3 samples and averages the data usage out, since usage can be spiky, but you can easily change that array size to be as big as you want so you can see sustained transfers, i.e. the "historical data" you described, and then it shouldn't be too hard to find a graphing library to show it. The lighttpd version takes a sample every 2 seconds, so you can keep as much as you want, or as much as your browser can support, since it will all be in memory on the client device displaying it.
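
For example, extending that in-memory averaging into a longer rolling history might look something like this (hypothetical names and sizes, not the actual array in the code):

```javascript
// Hypothetical rolling history kept in the browser instead of just
// the 3 averaged samples.
const HISTORY_SIZE = 1800;   // e.g. one hour of 2-second samples
const history = [];          // rate samples, oldest first

function addSample(bytesPerSec) {
  history.push(bytesPerSec);
  if (history.length > HISTORY_SIZE) history.shift(); // drop the oldest
}

// Average over the most recent N samples (the current behaviour uses N = 3).
function smoothedRate(windowSize = 3) {
  const recent = history.slice(-windowSize);
  if (recent.length === 0) return 0;
  return recent.reduce((sum, v) => sum + v, 0) / recent.length;
}
```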

I don't think the non-lighttpd version will ever get this feature, however; I don't see a way of doing it without an increase in CPU impact, since any of the methods will increase CPU usage.

SQLite support is probably not built into DD-WRT, so even getting that to run is another question entirely.

Again, this is probably not going to be a planned feature in this version, and if it is added I can only see it going into the lighttpd version. The non-lighttpd version won't be getting this feature; it's more complexity than I'm willing to tackle and maintain.

boxintense commented 8 years ago

I'll try installing lighttpd via Optware now and report back. Thanks again for hearing me out!

vortex-5 commented 8 years ago

I just confirmed that SQLite support is not native; it's available in Optware packages, but many users, myself included, find that a hassle to get going, so it's basically equivalent to not supported. I don't want to write a script that tries to bootstrap your Optware, since that can create a lot of problems for your DD-WRT install, not to mention that every router seems to be different in getting it going.

vortex-5 commented 8 years ago

lighttpd is built into DD-WRT if you are running a build later than 28xxx.

vortex-5 commented 8 years ago

go to services --> webserver

vortex-5 commented 8 years ago

Optware is not required as long as you have one of the more full-featured DD-WRT builds.

boxintense commented 8 years ago

Thanks for the hint.

I couldn't seem to find any "webserver" option; I'm on r29409.

(screenshot: Services page, DD-WRT build 29409)

vortex-5 commented 8 years ago

Then you would be on a more limited DD-WRT release.

boxintense commented 8 years ago

Ouch, that sucks! I think it's only for Broadcom-based routers? Mine is Atheros-based (and it doesn't support multiple VLANs, double ouch!)

vortex-5 commented 8 years ago

Mine is Atheros as well, unfortunately. As for which features are enabled on which routers, I'm not sure how you can figure that out; that's a question better left to the DD-WRT guys in their forum.

My cheaper Atheros router doesn't have the web server option, but then again it's on a tiny firmware build and is much more limited in general.

boxintense commented 8 years ago

Got an official answer: routers with 16 MB of flash or less (including mine) don't provide enough space for lighttpd.

vortex-5 commented 8 years ago

Ah, thanks. Yeah, I knew it was something like that.

euphoria360 commented 8 years ago

Since my TP-Link WR842ND only has 8 MB of NAND flash, I'm not a fan of heavy files (I currently have only 2 MB free). But I think even 24 hourly data points (or maybe 48) would be great. Since OpenWrt and DD-WRT already have similar graphs (like this one), implementing it shouldn't need any extra package either.

I would love to see some of the users' download behaviour while I'm asleep! :wink:

vortex-5 commented 8 years ago

The main issue comes in when it's per user. Aggregate bandwidth tracking isn't a big deal, since it's a single series, but per-user behaviour might be more problematic, since effectively it's a graph per user.

The initial simple thought was that you just copy the user stats with a timestamp (or something similar) at certain time points; that way you can refer back to them and work out how many MB/s they were consuming over a longer period. But eventually you'll have a directory full of snapshots that you'll need to assemble to get a graph.

In-memory would be easier to implement, but from the sound of the requirement, most users want something that logs over a long period even when they are not looking at the router.

I think it's possible to achieve: in the data folder you'd save copies of usage_stats.js over time. The main issues are coming up with a naming convention that makes sense and working out how the script can find all these files.

Alternatively, you could run an SQL server, but 2 MB is probably not enough space to set that up.

The main problem is: once you have a folder full of dated usage_stats.js files, how is the front-end script going to look them up and present them on a timeline?

boxintense commented 8 years ago

"a naming convention that makes sense"

I think naming per MAC address would make the most sense here. It is highly unlikely that common users would spoof their MAC addresses.

"once you have a folder full of dated usage_stats.js files, how is the front-end script going to look them up and present them on a timeline?"

I think the easiest way would be to use jQuery.get('http://.........., but of course for this method to be applicable, the usage_stats.js files have to be accessible from the client browser. After that, graphing can be done via a JavaScript graphing library like Chart.js, so there is no increase in the processing load on the router side.
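
Something along these lines, for example (purely illustrative: the index file, file names, and chart element are assumptions, and it presumes the snapshots are saved as JSON and Chart.js is loaded on the page):

```javascript
// Illustrative front-end sketch: assumes the router exposes an index file
// listing timestamped JSON snapshots of the usage stats. None of these
// files exist in ddwrt-bwmon today.
async function plotHistory(mac) {
  const index = await (await fetch("data/history/index.json")).json();
  // index: [{ ts: 1460448000, file: "usage_1460448000.json" }, ...]

  const points = [];
  for (const entry of index) {
    const snapshot = await (await fetch("data/history/" + entry.file)).json();
    points.push({ ts: entry.ts, bytes: snapshot[mac] || 0 });
  }

  new Chart(document.getElementById("historyChart"), {
    type: "line",
    data: {
      labels: points.map(p => new Date(p.ts * 1000).toLocaleTimeString()),
      datasets: [{ label: mac, data: points.map(p => p.bytes) }]
    }
  });
}
```

This also shows where the per-request cost mentioned above comes in: one fetch per snapshot file.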

valkala commented 8 years ago

Hi. Just wanted to add some thoughts to this requested feature. Historical data can be extremely useful for tracking purposes, etc.

First of all, thank you vortex-5 for this iteration of bandwidth monitoring. It's so much more reliable than anything else I've used.

Since historical data can take up quite a lot of space, would it be possible to make it an opt-in option at installation or within the app's settings, with requirements such as storage capacity and retention intervals (week, month, etc.)?

I think having an option to turn such a feature on would be great.

PS: I've used YAMon; unfortunately there are way too many issues with it, and it stops running at random.

vortex-5 commented 8 years ago

I've always emphasized reliability and performance over features, which is why I haven't really started on historical bandwidth yet. I don't feel I have the skills at the moment to pull off something like that on all the various devices and still have it be as reliable and trouble-free as it is now.

When I have all the potential problem areas worked out in my head I might attempt it.

I kept this discussion going because, if anyone wants to fork the project and attempt to add this, the methods suggested in this thread are useful to know; I think my proposed approaches would work. I just feel the result would be more fragile than I'm willing to accept for a tool that's meant to run 24/7 without reboots for long stretches, the way I have it set up now.

valkala commented 8 years ago

Fair enough, either way, it's the most reliable bandwidth tool so far and is much appreciated!