Rahix / avr-device

Register access crate for AVR microcontrollers
Apache License 2.0
170 stars 66 forks source link

Investigate crate-size optimizations #59

Open Rahix opened 3 years ago

Rahix commented 3 years ago

Right now we're calling form and rustfmt to split out the auto-generated modules into many files. I don't think this provides much value as nobody is going to look at those file anyway.

I have a suspicion that we could reduce the crate (download) size a bit by not calling form. For rustfmt on the other hand I am completely unsure whether it helps or not. svd2rust does generate a lot of unnecessary whitespace but I do not know whether rustfmt generates more or less (due to indentation). A rust minifier would of course be the best solution here ;)

Rahix commented 3 years ago

Looks like rustfmt makes files larger. Though if anyone wants to look at the source, a non-formatted file makes it completely impossible while a formatted but large single file (from not calling form) at least leaves some possibility to look at the code.

tones111 commented 3 years ago

Another consideration in support of running the generated code through rustfmt are the documentation src links. I think it's more likely users would click through the source links to investigate the code then opening the file directly. Keeping this well formatted seems worth the space tradeoff. For similar reasons I'd be less excited about a minifier as it makes it harder to inspect.

Another potentially interesting approach I tried implementing a while back was moving the code generation to build-time (ref: https://github.com/stm32-rs/stm32-rs/pull/5). Assuming the majority of the crate size is related to carrying the generated files for all supported processors, this approach only requires the input to the code generation. The tradeoff is that the user would need to generate for their target processor (once) during build.rs. One nice aspect was that it used svd2rust as a library and in specifying it as a build dependency could explicitly pin it to a specific version. This is a win as users could more reliably reproduce a build (since the svd2rust version you have installed and build with isn't specified anywhere).

edit: I found some additional discussion on build-time generation here: https://github.com/stm32-rs/stm32-rs/issues/8

Rahix commented 3 years ago

Another consideration in support of running the generated code through rustfmt are the documentation src links. I think it's more likely users would click through the source links to investigate the code then opening the file directly. Keeping this well formatted seems worth the space tradeoff. For similar reasons I'd be less excited about a minifier as it makes it harder to inspect.

I was thinking people probably won't be looking at the generated sources much but maybe I am wrong about that.

Another potentially interesting approach I tried implementing a while back was moving the code generation to build-time.

That is actually what we were doing in the past, before avr-device was published to crates.io. The problem is that we have some python components in the build process which led to dependency hell for getting things to work. Especially for the AVR stuff I think the setup should be as simple as possible because this target attracts a lot of people with little experience in the embedded systems space (coming from Arduino C/C++).

If all the generation tools were rust though, I think the story would be different.

Anyway, I think for now let's keep it like it is until a complaint about crate size/anything else comes in.

Outurnate commented 2 years ago

Why not try generating more code at build time? atdf2svd, svd2rust, form and svdtools are all rust packages. With the exception of atdf2svd all crates have a lib.rs published, so calling out to them from build.rs should be straightforward. You could remove the Makefile and python dependency

Rahix commented 4 months ago

Reviving an old thread because I've just noticed how bad we have gotten in terms of size. While the crate archive is sitting at 2.6MB, the uncompressed sources have accumulated to a whopping 46MB:

46M     src/devices/
3.5M    src/devices/atmega4809
3.5M    src/devices/atmega4808
3.1M    src/devices/avr64du32
3.0M    src/devices/avr64du28
2.4M    src/devices/attiny1614
2.2M    src/devices/attiny816
1.9M    src/devices/atmega128rfa1
1.8M    src/devices/attiny404
1.7M    src/devices/attiny402
1.4M    src/devices/attiny202
1.4M    src/devices/atmega2560
1.4M    src/devices/atmega1280
1.1M    src/devices/atmega32u4
980K    src/devices/at90usb1286
952K    src/devices/atmega328pb
888K    src/devices/atmega128a
876K    src/devices/atmega64
868K    src/devices/atmega1284p
800K    src/devices/atmega324pa
788K    src/devices/atmega164pa
756K    src/devices/attiny841
748K    src/devices/atmega644
712K    src/devices/atmega8u2
680K    src/devices/atmega88p
680K    src/devices/atmega328p
680K    src/devices/atmega168
668K    src/devices/atmega48p
656K    src/devices/attiny828
644K    src/devices/attiny167
628K    src/devices/atmega16
592K    src/devices/atmega32a
572K    src/devices/attiny88
540K    src/devices/atmega8
520K    src/devices/attiny861
480K    src/devices/attiny84
472K    src/devices/attiny2313a
460K    src/devices/attiny85
448K    src/devices/attiny2313
444K    src/devices/attiny84a
440K    src/devices/attiny44a
296K    src/devices/attiny13a
Rahix commented 4 months ago

Why not try generating more code at build time? atdf2svd, svd2rust, form and svdtools are all rust packages. With the exception of atdf2svd all crates have a lib.rs published, so calling out to them from build.rs should be straightforward. You could remove the Makefile and python dependency

@Outurnate this is an interesting idea. Right now we are still using the old python version of svdtools so I don't know how difficult the switch to the Rust version would be. Then there is also the hacky manual patches we apply on top. And finally, with the current design, we also have codegen in the macro crate. We'd have to get rid of that entirely to make build-time generation work (wouldn't be a bad thing at all, though).

Regarding build-time generation of a peripheral-access-crate, do you know if there is any prior art of another crate doing this? So far I have only seen other crates that also generate before distribution.

Finally, we have to consider the implication of shipping the ATDF sources. I think there shouldn't be a problem with the license, etc. but we have to check.

LuigiPiucco commented 3 months ago

I've been taking a look at this crate since I found a few changes that could help with https://github.com/Rahix/avr-hal/pull/457, and applying the suggestion to generate at build-time seems like it will help and also provide an opportunity to enhance the build-system as a whole. I can send a PR here to do the following:

I also have some changes to the ADC patches specifically, but I'll hold off on those; they rename the fields in a way that makes the split mux5 very obvious.

Finally, we have to consider the implication of shipping the ATDF sources. I think there shouldn't be a problem with the license, etc. but we have to check.

If the LICENSE file in the vendor directory is correct, they are Apache 2.0, which should give us no problems. We'd be distributing unmodified, since we only patch the SVDs, and those are our own derivative work.

And finally, with the current design, we also have codegen in the macro crate.

I think the cortex-m-rt crate changed after interrupts were implemented here, because https://docs.rs/cortex-m-rt/latest/cortex_m_rt/ indicates the macro implementation is not tied to knowing the device's interrupts; that knowledge is provided by the PAC, not the macro crate. If needed, I can update this too, though I assume we can generate the vector list via build.rs as well.

Edit: partial implementation in https://github.com/LuigiPiucco/avr-device/tree/size-opt; it doesn't yet have the build script part. I'll continue, probably tomorrow.

Rahix commented 3 months ago

I think we may also get away with using the standard avr-unknown-gnu-atmega328 target for everything, using a feature-dependent piece in the build script to pass -mmcu to the linker based on the features; This would be particularly helpful over in the HAL, making the numerous JSON specs unneeded, and allowing us to not need a libcore per MCU. I'll try to include a preview of this.

We can't get rid of the custom targets. The AVR microcontrollers each have a slightly different instruction set and the compiler needs to know what instructions it may generate.

LuigiPiucco commented 3 months ago

We can't get rid of the custom targets. The AVR microcontrollers each have a slightly different instruction set and the compiler needs to know what instructions it may generate.

You are correct about the ISA differences, the cpu key in the spec controls this. I thought I had found a way to circumvent that via passing -C target-cpu=<mcu>, but that seems to have slightly different behavior than the spec key, because, despite the Rust code being generated with the proper AVR revision, at the end it tries to link to a symbols.o file that the linker considers incompatible, failing. It works for passing -mmcu, but if we have to use a custom target anyway, that becomes pointless. So I'll drop it, but the other points about the build script still seem valid, I've (force-)pushed their implementation to the same branch as above.

In terms of size, I've run a comparison as follows:

# Package the crate, in this case it needs exactly one MCU feature enabled,
# since it performs a compilation check, but it shouldn't include that into
# the package.
$ cargo publish --dry-run --allow-dirty --features atmega328p,rt

# Compressed package size, I believe this is what the user has to download.
$ du -sh target/package/avr-device-0.5.4.crate
552K    target/package/avr-device-0.5.4.crate

# Uncompressed sources size, without the generated module and binaries.
$ du -sh target/package/avr-device-0.5.4 --exclude target/package/avr-device-0.5.4/target
5.0M    target/package/avr-device-0.5.4

# The generated module for atmega328p.
$ du -ah target/package/avr-device-0.5.4/target/avr-unknown-gnu-atmega328p/debug/build/avr-device-<some-hash>/out/pac
20K     target/package/avr-device-0.5.4/target/avr-unknown-gnu-atmega328p/debug/build/avr-device-<some-hash>/out/pac/generic.rs
528K    target/package/avr-device-0.5.4/target/avr-unknown-gnu-atmega328p/debug/build/avr-device-<some-hash>/out/pac/atmega328p.rs
552K    target/package/avr-device-0.5.4/target/avr-unknown-gnu-atmega328p/debug/build/avr-device-<some-hash>/out/pac/

Although there are probably many things to review since this is a pretty big change, I think it's worth investing some time into it. ~I'll open a PR later.~ It's in #157.