reMarkable / linux

Linux kernel for reMarkable 1 & 2. zero-sugar is rM2 and zero-gravitas is rM1.

Kernel config suggestions #8

Open muke101 opened 3 years ago

muke101 commented 3 years ago

Hi, I compiled a custom kernel for my reMarkable 1 and was talking about the changes to the configs I made in the discord. It was suggested I share them here, as they may be useful to include in the default kernel to improve performance, battery life or both.

The following changes I see no reason not to have on all kernels:

I've made some more changes that I'm slightly uncertain about for enabling by default, but will share them anyway:

I'm an amateur coming at this with little experience, so forgive me if everything here has already been considered and accounted for, but I figured if there was a chance even one of these things hadn't been thought of in the existing config it could be useful to flag up! I will say, though I haven't benchmarked regular usage performance or screen latency, I have recorded an almost exactly three second decrease in boot time with these settings, which I think is fairly significant.

cweagans commented 3 years ago

@larsim are there any plans to implement some of these changes in the "official" kernel for the reMarkable? Would you be more likely to incorporate these changes if a PR were opened? It would be great to not have to compile and install a custom kernel to get the performance and battery life improvements listed above!

larsim commented 3 years ago

@muke101 Thank you for this! @cweagans Yes, we plan to test some of these changes and possibly incorporate them, and we are always looking to improve our kernel. I'm also interested if you have other improvements that you'd suggest I look into while I'm at it.

LinusCDE commented 3 years ago

One really cool addition would also be to have uinput in the kernel by default. I don't think that the additional size would be noticeable to most while solving a plethora of challenges for certain custom software.

Examples:

The fuse module would probably also be nice for some people, but it is of lesser importance for me personally and I don't see as many use cases for it.

As I said, this is just a suggestion. I fully understand if this is not something that is deemed useful for the device and therefore not added.

Edit: Fixed grammar

loicpoulain commented 3 years ago

The following changes I see no reason not to have on all kernels:

* Disable high memory support - the default config enables support for memory addresses greater than 4GB. This seems like an oversight, and disabling it yields a faster kernel.

I agree.

* Trim unused kernel symbols - this makes the kernel smaller and gives greater compiler optimization opportunity on the kernel source. Some external kernel modules require the ones that are trimmed, but I've trimmed them in my kernel and my rm1 works fine. I obviously can't be certain myself but it's fair to assume this is at least worth looking at. It should be mentioned I haven't found time to build a kernel that includes the proprietary wifi driver yet though, so for all I know perhaps this relies on something from here.

* Disable SLUB debugging - this adds to the kernel binary size and serves no purpose.

Ideally, all debug options should be disabled.
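
A minimal sketch of how one might check a kernel `.config` against the three suggestions above. It assumes they correspond to the Kconfig symbols `CONFIG_HIGHMEM`, `CONFIG_TRIM_UNUSED_KSYMS` and `CONFIG_SLUB_DEBUG`; the helper names and the sample config text are hypothetical and not taken from the reMarkable tree:

```python
import re

# Assumed mapping from the suggestions above to Kconfig symbols.
# True means "should be enabled", False means "should be disabled".
SUGGESTED = {
    "CONFIG_HIGHMEM": False,           # disable high memory support
    "CONFIG_TRIM_UNUSED_KSYMS": True,  # trim unused exported symbols
    "CONFIG_SLUB_DEBUG": False,        # disable SLUB debugging
}


def parse_config(text):
    """Return {symbol: value} for .config text; unset symbols map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"# (CONFIG_\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = None
            continue
        assigned = re.match(r"(CONFIG_\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def mismatches(options, suggested=SUGGESTED):
    """List symbols whose enabled/disabled state differs from the suggestion."""
    result = []
    for symbol, want_enabled in suggested.items():
        is_enabled = options.get(symbol) == "y"
        if is_enabled != want_enabled:
            result.append(symbol)
    return result


# Hypothetical sample, not the actual reMarkable default config.
sample = """\
CONFIG_HIGHMEM=y
# CONFIG_TRIM_UNUSED_KSYMS is not set
CONFIG_SLUB_DEBUG=y
CONFIG_HZ=100
"""
print(mismatches(parse_config(sample)))
```

On a real build the text would come from the `.config` file in the kernel build directory rather than an inline string.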

I've made some more changes that I'm slightly uncertain about for enabling by default, but will share them anyway:

* Disable watchdog timer - this takes up memory and CPU cycles in the background, which drains battery. Supposedly it might be useful for breaking out of bootloops, but I'm not sure how much of an issue this is on the rm1. If it's not, this should definitely be considered for removal.

The watchdog is a security mechanism that prevents the system from hanging indefinitely; it's a must-have feature on these devices. Moreover, the overhead is insignificant: something like a few CPU cycles every ~30s when the device is running, and nothing is executed when the device is sleeping.

* Optimize very likely/unlikely branches - this should make the kernel run faster most of the time and very occasionally run slower on some operations. I decided to make the trade off, I'm not sure which might be preferred for default usage though.

Can you elaborate here? What do you want to change? Add more branch-prediction helper calls?

* 1 kHz over 100 Hz timer frequency - this is a trade-off: lower latency in responding to hardware interrupts, at the cost of battery life and the amount of cache space the kernel takes up. I'm less sure if this is really worth it, but I think it's safe to assume drawing on the screen involves a hardware interrupt, and so this might even reduce latency? I'm unable to benchmark it, but I thought it would be worth mentioning here anyway. Obviously it needs to be weighed against whatever reduction in battery life it causes, even if slight. There are also middle grounds between these two extremes.

You should not worry too much about CONFIG_HZ; interrupts are not really impacted by it. Hard interrupt handlers are executed 'synchronously' and so do not depend on that value, and 'threaded' interrupt handlers are scheduled with SCHED_FIFO, so they run soon after the interrupt occurs. CONFIG_HZ mainly impacts timer wheel accuracy, but most of the time that's not so important. The other impact could be the task timeslice being too large, but that's partly addressed by CONFIG_SCHED_HRTICK, which makes the kernel rely on a high-resolution timer for fair scheduling.
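
To put numbers on the timer accuracy point, here is a small arithmetic sketch of the tick period for a few CONFIG_HZ values and of how a tick-based timeout gets rounded up to a whole number of ticks. The function names are hypothetical and the rounding is a simplification; real kernel timers can add further slack:

```python
def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given CONFIG_HZ."""
    return 1000.0 / hz


def rounded_timeout_ms(timeout_ms, hz):
    """Round an integer timeout in ms up to a whole number of ticks."""
    ticks = (timeout_ms * hz + 999) // 1000  # integer ceiling of timeout_ms * hz / 1000
    return ticks * 1000.0 / hz


for hz in (100, 250, 1000):
    print(
        f"CONFIG_HZ={hz}: tick every {tick_period_ms(hz):g} ms, "
        f"a 3 ms timeout becomes {rounded_timeout_ms(3, hz):g} ms"
    )
```

The difference only shows up for code that depends on tick-based timers; as noted above, interrupt handling itself does not depend on this value.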

* Patched for real time preemption - the default kernel is set at involuntary preemption, or 'low latency desktop'. This is a trade off between responsiveness and max CPU throughput. I figured in a similar fashion to the timer frequency, enabling even more preemption should reduce latency, and in this case not even specifically to hardware interrupts but to all user space software, especially when the CPU is under load. Obviously though this is a separate patch that needs to be applied, and may not even be worth it at all, but once again thought it would be worth mentioning.

I would say such a device does not host real-time applications, so I'm not really sure what it could improve for the user. Sure, it will bring 'bounded latency', but it will also degrade overall performance. Another problem is that RT-patched Linux is far less tested than mainline Linux, and I'm not sure the downstream drivers integrated for this project have been well tested with the RT patches. The RT patches make substantial modifications (e.g. converting spinlocks to mutexes) that may not have been taken into account by all these drivers, and so can bring a whole new range of issues.

* Switch from LZO compression to LZ4 - This makes the kernel slightly bigger but decreases boot times. As it's already at LZO I figured it's assumed the additional speed isn't worth the additional space, but I decided to make this trade off myself anyway.

Yes, that is probably worth testing.

I'm an amateur coming at this with little experience, so forgive me if everything here has already been considered and accounted for, but I figured if there was a chance even one of these things hadn't been thought of in the existing config it could be useful to flag up! I will say, though I haven't benchmarked regular usage performance or screen latency, I have recorded an almost exactly three second decrease in boot time with these settings, which I think is fairly significant.

Thanks for all your ideas, we are looking at them, and some changes have already been applied internally.

Eeems commented 3 years ago

Thanks for all your ideas, we are looking at them, and some changes have already been applied to Linux 5.4: https://github.com/reMarkable/linux-internal/pull/148

It looks like that's a private repository.

muke101 commented 3 years ago

* Optimize very likely/unlikely branches - this should make the kernel run faster most of the time and very occasionally run slower on some operations. I decided to make the trade off, I'm not sure which might be preferred for default usage though.

Can you elaborate here? What do you want to change? Add more branch-prediction helper calls?

'Optimize very likely/unlikely branches' is a kernel config option. It just provides hints to gcc, making things slightly faster when its profiling is right and slightly slower when it's not, which should be the minority case. Maybe you can argue negativity bias might make disabling it preferable, but I think it's worth considering.

Thanks for all your ideas, we are looking at them, and some changes have already been applied internally.

This is really cool to hear, glad I could help!

Etn40ff commented 2 years ago

One of the suggested changes, i.e. trim unused symbols, has been implemented with commit 6df0622bc8544ae31b8f97615c47707cea83ff9f

Unfortunately, because of this change it is now impossible to load custom modules. Could this change be reverted?

loicpoulain commented 2 years ago

Unfortunately, because of this change it is now impossible to load custom modules. Could this change be reverted?

Can't you rebuild your own kernel with required config changes to load your module(s)?

Eeems commented 2 years ago

Unfortunately, because of this change it is now impossible to load custom modules. Could this change be reverted?

Can't you rebuild your own kernel with required config changes to load your module(s)?

Yes, but that can be risky, and it makes it harder to provide custom software that requires those modules, as it's a lot harder to ask people to replace the kernel on their device.

cgevans commented 2 years ago

I'd similarly ask if you'd consider reverting the TRIM_UNUSED_KSYMS change. It has significantly degraded my experience with my rM2. It breaks almost all ability to add extra modules: something as simple as needing the device to connect to a VPN now requires completely rebuilding a new kernel, with all the risks that this entails (particularly on the rM2), rather than simply compiling a module and copying over a few files, e.g. for Wireguard or tun/tap interfaces.

The kernel config documentation generally recommends against enabling the option, and doesn't even display it in configuration unless expert mode is enabled, noting that it's for 'specialized environments that can tolerate a "non-standard" kernel'. It really does cripple the device for some users, for seemingly minimal benefit. Having it disabled is not a debugging option: keeping ksyms is the standard.
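
A rough sketch of why trimming breaks out-of-tree modules: a module can only load if every symbol it needs is still exported by the running kernel. The check below assumes a `Module.symvers`-style listing with tab-separated fields and the symbol name in the second field. The sample lines, the list of needed symbols, and the function names are all made up for illustration:

```python
def exported_symbols(symvers_text):
    """Collect symbol names from Module.symvers-style text.

    Assumes tab-separated lines whose second field is the symbol name.
    """
    names = set()
    for line in symvers_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1]:
            names.add(fields[1])
    return names


def missing_symbols(needed, exported):
    """Return the needed symbols that the kernel no longer exports, sorted."""
    return sorted(set(needed) - exported)


# Hypothetical export list of a trimmed kernel (CRC values are placeholders).
sample_symvers = (
    "0x00000000\tprintk\tvmlinux\tEXPORT_SYMBOL\n"
    "0x00000000\tkmalloc_caches\tvmlinux\tEXPORT_SYMBOL\n"
)

# Hypothetical set of symbols an out-of-tree network module might need.
needed_by_module = ["printk", "register_netdev", "alloc_netdev_mqs"]

print(missing_symbols(needed_by_module, exported_symbols(sample_symvers)))
```

Any non-empty result means the module would fail to load on that kernel with unresolved symbols, which is the failure described above.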

Etn40ff commented 2 years ago

Thank you https://github.com/reMarkable/linux/commit/80c68123b89fbd95deedfc6fddaa57de58b5ce3a

loicpoulain commented 1 year ago

I would suggest closing this issue now.