Expose target cpu to conditional compilation

dylanmckay commented 7 years ago

I've recently created a crate that provides IO/microcontroller specific constants to Rust programs written for the AVR architecture.

avrd on GitHub avrd on Docs.rs

I've made it so that there is always a current module in the crate root that reexports all items from a microcontroller-specific crate, depending on what CPU is being targeted.

In order to achieve this, I needed to add a target_cpu conditional compilation flag to my fork of the Rust compiler

https://github.com/avr-rust/rust/commit/1f5f6bd8f85e905ffb5da6389e879277f7f6b708

I would like to upstream this (with cleanups like more documentation), as it seems generally useful, but it also would make it easier to merge the fork into mainline if that were to ever happen in the future. On top of this, it would be quite a useful feature to other people who fork Rust for more esoteric architectures.

The in-tree targets don't seem to specify CPUs most of the time - looking in the librustc_back module, it looks like some targets set CPU to cortex-a8, some to a CPU name used by Sparc, but almost everything else is left blank, generic, or "x86_64", etc.

Before I file a PR, are there any glaring objections to this?

hanna-kruppe commented 7 years ago

As far as I am aware, for most targets the "CPU" setting is just a shorthand for a rough baseline of features (e.g., whether certain ISA extensions are present) and as guidance for certain heuristics in the compiler (e.g., scheduling). Anything sane that user code might do with that information can more reliably be achieved with cfg(target_feature=...) or run-time CPUID poking. I don't know anything about AVR, so maybe the situation is very different there, but I'd be very suspicious of any code that does cfg(target_cpu="skylake") or similar.

hanna-kruppe commented 7 years ago

cc @japaric

japaric commented 7 years ago

@rkruppe Thanks for the ping.

@dylanmckay I have written my thoughts (more or less) on the topic of device specific APIs, which seems to be what you are after with this change, here. You may want to check that out. The TL;DR is that we haven't picked up a solution but that comment lists some of the possible solutions.

With respect to the issue at hand, I don't think target_cpu is the right field to use here. I don't know much about AVR microcontrollers but on ARM Cortex-M microcontrollers there's pretty much only a few CPUs, with slightly different instruction sets: cortex-m0(+), cortex-m1, cortex-m3, cortex-m4, cortex-m7 -- here, the instruction set of each CPU has more instructions than the preceding one. A Cortex-M microcontroller, or a Cortex-M device, is then just one of these CPU core with varying amounts of memory and peripherals.

It seems to me that what you want here is actually a target_device field, where the value of the field could, for example, be "stm32f100" or "stm32f101" -- here both devices have the same CPU: cortex-m3 and thus the exact same instruction set and codegen options. I, personally, don't know if we should add a target_device field to target specifications as none of the tier 1 and 2 platforms seem like they would benefit from it, but I believe that we shouldn't encourage the use of target_cpu to specify the target device as it has meaning to LLVM (it affects codegen, etc.).

To me, it also feels overkill to have one target (specification) per device / microcontroller -- I'm assuming this is what you are after: one target per device where the specification is mostly the same except for target_device. Under the Xargo (or std-aware Cargo) model this would create one sysroot per target, which is wasteful: if you are working with two devices that have the same CPU then you'll end compiling and caching two sysroots that are exactly the same because the codegen options of each device are the same (and core doesn't use target_device or target_cpu so the same source code is compiled in each case).

Finally, what you want to achieve here can be done without modifying the compiler, and thus you can experiment today, and without having one target per device: you can append the --cfg target_device=.. flag to rustc using .cargo/config (or by setting the RUSTFLAGS env variable):

[target.thumbv7m-none-eabi]
rustflags = ["--cfg", "target_device=stm32f100"]

[build]
target = "thumbv7m-none-eabi"

This setting will affect the conditional compilation (#[cfg(target_device = "..")]) of the whole dependency graph.

Messing with the .cargo/config may seem onerous but, to me, it doesn't seem that different from downloading a target specification file and passing --target $device to each Cargo/Xargo invocation. Also, you can always hide this modification using a Cargo subcommand, for example cargo set-target stm32f100, or a Cargo wrapper like bobbin.

Just my two cents :-).

dylanmckay commented 7 years ago

I believe that AVR is special compared to ARM in that is is not enough to know what peripherals are supported - we also need to know where they are exposed in memory via IO registers.

For example, PORTB is an IO register that can be written to that will set one or more of the 8 pins to high or low. Not all microcontrollers have PORTB, and there is no particular pattern on the address of the register in the register file across devices.

Here is a sample of the generated microcontroller files that contain PORTB

attiny84.rs:pub const PORTB: *mut u8 = 0x38 as *mut u8;
attiny840.rs:pub const PORTB: *mut u8 = 0x25 as *mut u8;
attiny841.rs:pub const PORTB: *mut u8 = 0x38 as *mut u8;
attiny84a.rs:pub const PORTB: *mut u8 = 0x38 as *mut u8;
attiny85.rs:pub const PORTB: *mut u8 = 0x38 as *mut u8;
attiny861.rs:pub const PORTB: *mut u8 = 0x38 as *mut u8;
attiny861a.rs:pub const PORTB: *mut u8 = 0x38 as *mut u8;
attiny87.rs:pub const PORTB: *mut u8 = 0x25 as *mut u8;
attiny88.rs:pub const PORTB: *mut u8 = 0x25 as *mut u8;
attiny9.rs:pub const PORTB: *mut u8 = 0x2 as *mut u8;

Note that these devices are in the same family (Tiny), and yet even then we have IO registers in different locations.

Because of this, we have to use conditional compilation per microcontroller.

The "pack" files (analguous to SVD AFAIK) provided by Atmel (see here) do no attempt at grouping into families. All registers, their locations, the supported peripherals, the supported interrupts, are all specified in whole for every individual device.

AVR-GCC also uses the same methodology, having header files per device rather than family (for example, iotn48.h)

I don't think target_cpu is the right field to use here

Thinking about it your way, I completely agree with this here. The different microcontrollers themselves may have the same CPU, but the difference is the locations of the IO registers that the CPU itself uses. In this regard, using target_cpu in this regard is abusing what it actually means.

AVR-GCC actually does something similar to what you suggest with target_device, but it uses the name mcu instead, which is separate to GCC's concept of a CPU (documentation).

When I think about this more, I'm fairly certain that the AVR-LLVM backend (which I maintain) does use the CPU to distinguish different microcontrollers. This is because LLVM provides the a consistent interface for the creation of targets (AVRMCTargetDesc.cpp:51, this function must have this signature to compile).

In order to separate the concept of CPU and device in LLVM, it would require changing LLVM itself, and even then I don't think that Atmel themselves make a distinction between CPUs as I've never seen it in a datasheet (ATmega328 wiki). I believe that because we don't know what CPU is actually in the package, the only thing we can do is use the device instead.

Because of this, I think that our only choice is to treat different microcontrollers as different CPUs.

if you are working with two devices that have the same CPU then you'll end compiling and caching two sysroots

I had not considered this. I'm not too worried about this personally because AVR at the moment only supports libcore, which isn't too onerous to compile once per device.

Finally, what you want to achieve here can be done without modifying the compiler

That's true, we can definitely achieve the same thing without modifying the compiler, but because the information is already there in the target specification, I don't want my downstream users to have to worry passing the MCU again in a separate argument it as they don't really need to.

dylanmckay commented 7 years ago

I've looked at AVR-GCC a little bit more, and I believe that it doesn't actually make a distinction between CPU and MCU.

It looks like GCC does not provide any target-independent arguments to set the CPU - if a target wants to support something like this, it needs to create a target-specific argument.

Note that X86 provides its own -mtune argument, whereas AVR provides -mmcu.

AVR does not support specifying a CPU, only an MCU. In that regard, AVR gcc does the same thing my patch currently does in that you only ever specify one.

dylanmckay commented 7 years ago

If we come to a consensus and find that for AVR, we do need to support conditional compilation based on CPU, then it might be a good idea to make it a flag that is only turned on for backends that opt-in.

On top of this, a lint could be written to warn about usage of target-specific attributes that aren't guarded by a #[cfg_attr(arch = "avr", cfg(target_cpu = "atmega328"))].

If we decided that though, there'd be no point merging it until we actually have an in-tree backend that opts in to the feature, so I'd be tempted to hold off until the day the fork converges into mainline.

hanna-kruppe commented 7 years ago

I find it a bit surprising that the LLVM backend apparently needs detailed information about the device. Does each device have a subtly different ISA? Does the backend need to know about peripherals? Or why do you want the LLVM backend to know when it's targeting, for example, "atmega328"?

dylanmckay commented 7 years ago

Does each device have a subtly different ISA?

tl;dr Sometimes, yes

In general, all devices can be more or less grouped into families.

For example, avr1, avr2, avr25, ... avr5, avr51, are all family names. These families each define an subset of the full AVR ISA.

You can find the AVR backend's definition of these families in AVRDevices.td. You can see that in general, the families build up on top of each other in terms of supported instructions/features.

The problem is that not every realised ISA can be cleanly separated into a family.

For example, the ATtiny26 is a part of the avr2 family, but it also happens to support the "LPMX" set of instructions

Another example is the ATMega8515, which almost implements all of avr4, but not quite all, and so the linked definition bases the device off avr2 and adds the extra features explicitly.

Does the backend need to know about peripherals?

No

Or why do you want the LLVM backend to know when it's targeting, for example, "atmega328"?

Solely for deciding what subset of the ISA is supported.All device-specific information required (or used) by the backend can be found inside AVRDevices.td.

hanna-kruppe commented 7 years ago

Thanks for the explanation. I can't shake the feeling that it might be cleaner for the backend to just know about the various ISA subsets/features, and target specs for different devices setting the right features (as opposed to the backend knowing about all devices out there and which features they imply). But since I know nothing of the topic other than what you've written down here, and this implementation has been accepted upstream, I'll chalk it up to my personal ignorance or taste.

dylanmckay commented 7 years ago

When you say "target specs", do you mean on the Rust side? I worry that would cause every single AVR LLVM frontend to have to know about every single AVR, and when new ones come up we would need to add the MCUs to N frontends.

At the moment, the device families (avr1, avr2, ..) and the specific devices (atmega328p, ..) under the hood are actually the same thing. That means that you can use "cpu": "avr5". On top of this, any frontend can set attributes when compiling, and all possible features (lpmx, sram, mul) features can be enabled that way.

That means that at the moment, it is possible for a frontend to specify families and support features if they desire.

gnzlbg commented 5 years ago

Is there a PR or an RFC for this ? What remains to be done ?

I have a proc macro that should have different logic if #[cfg(target_cpu = "native")] and could benefit from this.

Having to scan RUSTFLAGS, .cargo/config, ... to figure out whether -C target-cpu=native is somehow passed to the compiler is a pain.

shepmaster commented 5 years ago

I don't believe there's anything beyond this issue (and the living implementation of it over in avr-rust). I don't want to speak too much for @dylanmckay, but I believe they've been pretty busy with other things recently and might not have the bandwidth to drive an RFC through.

gnzlbg commented 5 years ago

We could build a library that can be used in build.rs to detect the target-cpu, if any, and appropriately defines cfg flags for it, that we can do conditional compilation on. This library would need to query whether target-cpu is set anywhere relevant (RUSTFLAGS, cargo config, etc.). This would be annoying, but maybe doable, and it'll save us the hassle of having to go through the RFC process "just for this". If that turns out to be too hard to do correctly, we can revisit this. @dylanmckay @japaric what do you think about that ?

novacrazy commented 5 years ago

Is there any progress on this feature? Seems odd this is still missing after two years.

I'm running into some situations where specialized instructions on AMD are slower than their naive implementations. e.g. _mm_dp_ps is a bit slower than a naive multiplication and horizontal add reduction.

Ideally, I could just exclude target_cpu="znver1" using cfg_if! or similar, falling back to the naive implementation.

novacrazy commented 4 years ago

Coming up on the 3rd anniversary of this. Has there been any progress on this issue?

rust-lang / rust

Expose target cpu to conditional compilation #44036