LibreQoE / LibreQoS

A Quality of Experience and Smart Queue Management system for ISPs. Leverage CAKE to improve network responsiveness, enforce bandwidth plans, and reduce bufferbloat.
https://libreqos.io/
GNU General Public License v2.0

Release Checklist #229

Open dtaht opened 1 year ago

dtaht commented 1 year ago

Checklist items on code passes

Testing

Writing

Outreach plan

thebracket commented 1 year ago

Rust dependencies all updated to latest, tweaks made to code where necessary. The GitHub CI now checks for CVEs and obsolete packages as part of the continuous integration run.

interduo commented 1 year ago

In my opinion the alpha version is ready to release (we run the current state of the master branch in production and it works really well now). This issue should be renamed to "Beta release checklist" after the alpha comes out.

It's better to release the alpha early, and follow-up versions often, than to aim for really big milestones. What do you think?

dtaht commented 1 year ago

Mentally I have targeted an alpha release for the end of the month, and I would like the vast majority of this checklist to have been worked through by then. I also have an increasing desire to move things out of ispConfig.py and into the TOML, where the bridge configuration would be shared with the Rust code and the setup would be simpler for new users. Lastly, I would like to find and onboard at least two new users before declaring alpha, to uncover the things that those with experience of the product aren't finding. In the latter two cases it is not the existing users I care so much about, but the cost of supporting and onboarding the next 100, or 1000. Everything we can do to improve usability before the alpha or final release, with the resources available, will pay off down the line.

I am glad that the code is considered stable enough to be in production.

I have mostly been focusing on the math (which has some problems), a decent sim of real RTTs, and the netlink/sampling problems, none of which are barriers to the alpha. Some of the items on the checklist look easy, like coping with licensing issues and verifying that the Python is up to date: @rchac?

As always, I seek consensus on all we do or plan. We have a meeting this Thursday at 1PM PST to discuss the remaining 31 open issues here: https://github.com/LibreQoE/LibreQoS/issues?q=is%3Aopen+is%3Aissue+milestone%3Av1.4 which we can punt, modify, or fix.

thebracket commented 1 year ago

I did a global cargo fmt run, so the Rust side is formatted consistently (yes, it recurses).

I've moved a couple of trivial locks to atomics, for a not-really-measurable performance change (an uncontested mutex lock takes approximately 13 nanoseconds in userspace on an 8-core AMD FX at 3.6 GHz; that's VERY hard to beat).
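For anyone following along, here is a minimal sketch of what a lock-to-atomic change of this kind can look like. The counter and function names are hypothetical and are not taken from the LibreQoS code; the sketch only shows that both forms give the same result for a simple shared counter.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

// "Before" shape: a shared counter guarded by a mutex (hypothetical example).
fn count_with_mutex(threads: u64, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

// "After" shape: the same counter as an atomic, with no lock to take.
fn count_with_atomic(threads: u64, per_thread: u64) -> u64 {
    let counter = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Relaxed is enough for a plain counter; joining the
                    // threads below makes the final value visible.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    println!("mutex:  {}", count_with_mutex(4, 10_000));
    println!("atomic: {}", count_with_atomic(4, 10_000));
}
```

This only works for values that fit in a single atomic; anything that must update several fields together still needs a lock.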

I've abandoned the effort to use lock-free structures because they are consistently outperformed by locks in the benchmarking I performed. An RwLock-wrapped update of the TC queue statistics structure was consistently faster than a similar update in a lock-free structure (tested with DashMap and Crossbeam's SkipMap; the latter has horrible usage semantics). I saw a similar lack of improvement for unlocking per-host throughput data.
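As an illustration of the RwLock-wrapped pattern described above, here is a small sketch. `QueueStats`, `StatsStore`, and their fields are made-up stand-ins (the real TC queue statistics structure is not shown in this thread), and this is not a benchmark.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical per-queue counters, for illustration only.
#[derive(Debug, Default, Clone, PartialEq)]
struct QueueStats {
    drops: u64,
    marks: u64,
}

struct StatsStore {
    queues: RwLock<HashMap<u32, QueueStats>>,
}

impl StatsStore {
    fn new() -> Self {
        Self { queues: RwLock::new(HashMap::new()) }
    }

    // Take the write lock once and apply the whole batch under it,
    // rather than locking per entry.
    fn update(&self, samples: &[(u32, u64, u64)]) {
        let mut queues = self.queues.write().unwrap();
        for &(id, drops, marks) in samples {
            let entry = queues.entry(id).or_default();
            entry.drops += drops;
            entry.marks += marks;
        }
    }

    // Readers share the lock and copy out what they need.
    fn get(&self, id: u32) -> Option<QueueStats> {
        self.queues.read().unwrap().get(&id).cloned()
    }
}

fn main() {
    let store = StatsStore::new();
    store.update(&[(1, 2, 0), (2, 0, 5), (1, 1, 1)]);
    println!("{:?}", store.get(1));
}
```

One plausible reason a design like this can hold up well is that the lock is taken once per batch, so the per-entry cost is just a hash map update.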

A few minutes ago an advisory hit about Rocket; I'll update when the fix exists. It's pretty trivial and doesn't seem to make us vulnerable to anything. The audit system alerted me to it.

thebracket commented 1 year ago

I've put up a PR that checks for GPL3 on the Rust side of things. There isn't any. I haven't looked at the Python side. (PR https://github.com/LibreQoE/LibreQoS/pull/292 )

thebracket commented 1 year ago

For the other items:

So I've done the parts of the "Checklist items on code passes" that make sense. The rest read more like an abstract guide on code creation, minus the parts about not optimizing things that don't matter.

dtaht commented 1 year ago

I appreciate all your pithy comments. Try to remember that we will one day onboard devs far less experienced than us, and rather than hanging over every line of every commit, I would like basic checklist items such as these to be deeply embedded. Someday, perhaps, there will be more.

Pieces of feedback on your feedback:

0) Asking that the checklist be checked off is something that has to happen on every major release. It is an assurance, to those not deep in the code, that the dev has looked and is sure those things have been dealt with.

1) "Does it make sense to use doubles or floats" is about the loss of precision that might happen if those are used. Essentially I was asking: did anything need to be a double, based on the data we were aggregating? We presently get data with a dynamic range of billions (nsec to sec), which is beyond the precision of a normal 32-bit float (about 7 significant decimal digits).

2) Running strace on the whole application(s) lets us see which system calls are used, and make doubly, extra, super sure that all possible error returns are successfully coped with.

2.1) EAGAIN, EBUSY, etc. can and will bite you in high-speed interactions with the kernel. I am merely going to wait until it bites you as hard as it has bitten me before stressing this point any more.

3) We have not actually tested a real workload any greater than what is deployed in the field. This is hung up on me constructing a decent enough sim.

4) In C, dealing with many heap allocs and frees is always begging for trouble. It is always faster to parse a string down to a real value.

I agree that, presently, we are collecting and keeping around too much useless data. Also, ideally we move away from parsing system tool output and towards programming the kernel more directly.

5) My principal use for log_once is inside large loops that might throw a ton of errors, which will perturb the concurrency of other operations. More than once in this process, spamming the log has caused other problems.

6) SIMD is merely a look-at item: structuring data and code so that it could be parallelized if necessary.
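To make point 1 concrete, here is a small sketch of what happens when a nanosecond count spanning a full second is pushed through a 32-bit float versus a 64-bit double. The helper names are made up for this example.

```rust
// Round-trip a nanosecond count through f32 and back.
fn roundtrip_f32(ns: u64) -> u64 {
    ns as f32 as u64
}

// Round-trip a nanosecond count through f64 and back.
fn roundtrip_f64(ns: u64) -> u64 {
    ns as f64 as u64
}

// Add one nanosecond to an f32 accumulator that already holds 2^24.
fn f32_add_one_at_2_pow_24() -> f32 {
    let acc: f32 = 16_777_216.0; // 2^24, the end of f32's exact integer range
    acc + 1.0
}

fn main() {
    let ns: u64 = 1_000_000_001; // one second plus one nanosecond
    println!("original: {}", ns);
    println!("via f32:  {}", roundtrip_f32(ns));
    println!("via f64:  {}", roundtrip_f64(ns));
    println!("2^24 + 1 in f32: {}", f32_add_one_at_2_pow_24());
}
```

An f32 has a 24-bit significand, so near one billion its representable values are 64 apart and the extra nanosecond is lost; an f64 has 53 bits and keeps integer nanosecond counts exact up to 2^53. Where the values are counts to begin with, keeping them as u64 avoids the question entirely.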

Aside from that, we can work on coding guidelines and other items to try and

Very happy to see you take the time to review this checklist, and express your feelings about it!
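On point 5, a minimal sketch of a log_once-style helper in Rust, assuming the goal is one report per call site however many times a loop hits it. The `log_once!` macro, `emit`, and `process` are all hypothetical names written for this example; they are not an existing LibreQoS or logging-crate API.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Counts emitted lines so the behaviour can be checked.
static EMITTED: AtomicUsize = AtomicUsize::new(0);

// Stand-in for a real logger.
fn emit(msg: &str) {
    EMITTED.fetch_add(1, Ordering::Relaxed);
    eprintln!("{}", msg);
}

// Each expansion of the macro gets its own static flag, so every call
// site reports at most once for the lifetime of the process.
macro_rules! log_once {
    ($($arg:tt)*) => {{
        static FIRED: AtomicBool = AtomicBool::new(false);
        if !FIRED.swap(true, Ordering::Relaxed) {
            emit(&format!($($arg)*));
        }
    }};
}

// A loop that could otherwise log once per bad sample.
fn process(samples: &[i64]) -> usize {
    let mut bad = 0;
    for &s in samples {
        if s < 0 {
            bad += 1;
            log_once!("negative sample {} seen; further reports suppressed", s);
        }
    }
    bad
}

fn main() {
    let bad = process(&[3, -1, -2, 7, -3]);
    println!("bad samples: {}", bad);
}
```

The count of bad samples is still returned, so the information is not lost; only the per-iteration log line is suppressed.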

dtaht commented 1 year ago

Moving to two items I did not check off:

It is nice to have a grip on structure sizes. Traditionally I tried to construct something that took every structure we had and showed the size of it. Even more so, it is nice to have a grip on possible memory leaks, the why of their growth patterns, and where they may be coming from.
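On the Rust side, a size report along these lines could be sketched with `std::mem::size_of`. The two structs and the `report_sizes!` macro below are invented for illustration; `#[repr(C)]` is used so field order is fixed, since Rust's default layout is free to reorder fields.

```rust
use std::mem::{align_of, size_of};

// Hypothetical structures; the u64 sits between two small fields here,
// which forces padding under the C layout rules.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    flag: u8,
    bytes: u64,
    proto: u8,
}

// Same fields with the largest first, which needs less padding.
#[allow(dead_code)]
#[repr(C)]
struct Reordered {
    bytes: u64,
    flag: u8,
    proto: u8,
}

// Builds a list of (type name, size in bytes, alignment in bytes).
macro_rules! report_sizes {
    ($($t:ty),+ $(,)?) => {
        vec![$((stringify!($t), size_of::<$t>(), align_of::<$t>())),+]
    };
}

fn main() {
    for (name, size, align) in report_sizes!(Padded, Reordered) {
        println!("{:<10} size={:>3} align={}", name, size, align);
    }
}
```

Running something like this in CI, or asserting on a few key sizes in a test, is one way to notice when a structure grows unexpectedly.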

As for tracking heap allocations, well, you just ran into that problem in the chrome bug you have been coping with. When you have a program that needs to run without leaks for months at a time, even the smallest leak will bite you.

Both of these are just nice-to-haves at this point.
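One low-tech way to watch heap growth in a Rust program is a counting wrapper around the system allocator. This is a standard-library-only sketch, not a substitute for a real allocator profiler; `CountingAlloc`, `LIVE_BYTES`, and `live_bytes` are names made up for the example.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Bytes currently allocated and not yet freed.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

// Forwards to the system allocator and keeps a running total.
struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}

fn main() {
    let before = live_bytes();
    let buf: Vec<u8> = Vec::with_capacity(1 << 20);
    std::hint::black_box(&buf); // keep the allocation from being optimized away
    let during = live_bytes();
    drop(buf);
    let after = live_bytes();
    println!("held while buffer alive: {} bytes", during.saturating_sub(before));
    println!("still held after drop:   {} bytes", after.saturating_sub(before));
}
```

Sampling `live_bytes()` periodically in a long-running daemon and logging the trend is a cheap first check for slow leaks before reaching for heavier tooling.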

thebracket commented 1 year ago

One of the reasons I adopted jemalloc is that it has a lot of tooling built in. I'll see about grabbing some output from it, e.g. https://gist.github.com/ordian/928dc2bd45022cddd547528f64db9174

Strings are mostly a problem at the C boundaries in Rust. Once in a Rust string, it's a smart pointer that stores its length (null termination is awful, Wirth solved it in the 70s...). It deallocates as soon as it goes out of scope (the same mechanism as a C++ destructor), and dangling pointers and use-after-free literally won't compile. So strings are a performance concern, but not a safety issue anymore. (Memory leaks are not part of Rust's safety guarantee, but you really have to work to make them by accident.)
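A short sketch of what the C boundary looks like in practice, using `CString` and `CStr` from the standard library. `c_strlen_like` is a stand-in written in Rust for this example; real code would be calling an actual C function at that point.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Stand-in for a C function that walks to the NUL terminator.
unsafe extern "C" fn c_strlen_like(s: *const c_char) -> usize {
    unsafe { CStr::from_ptr(s) }.to_bytes().len()
}

// Rust -> C: CString::new copies the bytes and appends the NUL, and
// refuses input that already contains an interior NUL.
fn len_via_c(name: &str) -> Option<usize> {
    let c_name = CString::new(name).ok()?;
    let len = unsafe { c_strlen_like(c_name.as_ptr()) };
    Some(len)
    // c_name is freed here, after the call that borrowed its pointer.
}

// C -> Rust: copy into an owned String so nothing keeps pointing at
// memory that the C side may free later.
unsafe fn string_from_c(ptr: *const c_char) -> String {
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}

fn main() {
    println!("{:?}", len_via_c("eth0"));
    println!("{:?}", len_via_c("bad\0name"));

    let c_side = CString::new("cake").unwrap();
    let owned = unsafe { string_from_c(c_side.as_ptr()) };
    println!("{} ({} bytes, length stored with the string)", owned, owned.len());
}
```

The `unsafe` blocks mark the only places where lifetime mistakes are still possible; everything on the Rust side of them is checked by the compiler.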

interduo commented 7 months ago

What is the release checklist for v1.5, and what is "the date"? :)

thebracket commented 7 months ago

When it's done! We don't have a formal list at this time. I hope to include (not complete):

interduo commented 3 months ago

What is the blocker for releasing v1.5rc1?

thebracket commented 3 months ago

Several merges, some testing and a handful of minor bugs in the configuration system. That, and we have day jobs!

interduo commented 3 months ago

Are Deb package builds for the develop branch available (daily/on-demand/per-commit)?