ardaku / whoami

Rust crate to get the current user and environment.
Apache License 2.0

feat: arch() #50

Closed SteveLauC closed 1 year ago

SteveLauC commented 1 year ago

What this PR does

  1. Add enum Arch to represent the CPU architecture (constructed by arch() function)
  2. Add enum Width to represent the word width
  3. Add function Arch::arch_width(&self) -> Width to get the width of a specific architecture
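A minimal sketch of how these three pieces could fit together. The variant list is heavily reduced and purely illustrative, not the PR's actual definition. The description above names `arch_width` returning a plain `Width`, while later comments in this thread call `arch.width()` and show an error for an unknown arch; the sketch assumes the later form.

```rust
use std::io;

/// Word width of an architecture (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Width {
    Bits32,
    Bits64,
}

/// CPU architecture (hypothetical, heavily reduced variant list).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Arch {
    Arm,
    I686,
    M68k,
    Wasm32,
    Aarch64,
    X86_64,
    /// Anything that was not recognized, kept verbatim.
    Unknown(String),
}

impl Arch {
    /// Width of this architecture, or an error when the arch is unknown.
    pub fn width(&self) -> io::Result<Width> {
        match self {
            Arch::Arm | Arch::I686 | Arch::M68k | Arch::Wasm32 => Ok(Width::Bits32),
            Arch::Aarch64 | Arch::X86_64 => Ok(Width::Bits64),
            Arch::Unknown(name) => Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("Calling width() on an unknown arch ({}) is invalid", name),
            )),
        }
    }
}
```

Returning a `Result` rather than a bare `Width` keeps the `Unknown` case explicit instead of forcing a guess.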

Architectures supported by each OS and Rust toolchain:

The lists below give the first field (the architecture) of the Rust target triples for Linux, macOS, FreeBSD, DragonFly BSD, NetBSD, OpenBSD, Windows, and Wasm:

Linux

  1. aarch64

  2. arm

  3. armv4t

  4. armv5te

  5. armv7

  6. thumbv7neon

  7. armeb

  8. i586

  9. i686

  10. x86_64

  11. mips

  12. mipsel

  13. mips64

  14. mips64el

  15. mipsisa32r6

  16. mipsisa32r6el

  17. powerpc

  18. powerpc64

  19. powerpc64le

  20. sparc

  21. sparc64

  22. hexagon

  23. m68k

  24. riscv32gc

  25. riscv64gc

  26. s390x

macOS

  1. aarch64

  2. i686

  3. x86_64

FreeBSD

  1. aarch64

  2. armv6

  3. armv7

  4. i686

  5. x86_64

  6. powerpc

  7. powerpc64

  8. powerpc64le

  9. riscv64gc

DragonFly BSD

  1. x86_64

NetBSD

  1. aarch64

  2. armv6

  3. armv7

  4. i686

  5. x86_64

  6. powerpc

  7. sparc64

OpenBSD

  1. aarch64

  2. i686

  3. x86_64

  4. powerpc

  5. powerpc64

  6. riscv64gc

  7. sparc64

Windows

  1. i586

  2. i686

  3. x86_64

  4. aarch64

  5. thumbv7a

Wasm

  1. wasm32
  2. wasm64

Known Problems

  1. Is m68k 16 bits or 32 bits?

    In my impl, I treat it as a 32-bit arch.

    This chip is 32 bits internally and 16 bits externally.

    Ref:

    1. wikipedia: Motorola 68000
    2. Is the 68000 unfairly labeled a 16-bit CPU?
  2. Should asm.js be included? What is the width of this arch?

  3. Rust does not have i386-xxxxxx toolchains for the OSes we are going to support

    The only target that starts with i386 is i386-apple-ios.

    However, this arch is still included in my impl, as uname -m could return this value on Linux.

  4. Some variants of Arch may never be constructed, for example:

    1. ArmEb
    2. MipsIsa32R6
    3. MipsIsa32R6El
    4. ThumbV7A
    5. ThumbV7Neon
    6. ...

    I am not sure whether uname -m returns such detailed architectures. For example, for arch MipsIsa32R6, uname -m may just return mips. If so, variants such as MipsIsa32R6El would never be used.

AldaronLau commented 1 year ago

@SteveLauC Thanks for doing this, this is awesome!


Is m68k 16 bits or 32 bits?

Considering that the Wiki link says:

The 68000 has a 24-bit external address bus and two byte-select signals "replaced" A0. These 24 lines can therefore address 16 MB of physical memory with byte resolution.

A 24-bit address does not fit in 16 bits, so Rust should make usize 32 bits rather than 16, and m68k should be treated as a 32-bit arch.
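One way to check what Rust chose for a given target, as a small sketch: `usize::BITS` and the `target_pointer_width` cfg both report the pointer width fixed at compile time. The helper names below are made up for illustration and are not part of whoami.

```rust
/// Pointer width, in bits, of the target this code was compiled for.
/// (Hypothetical helper name.)
pub fn pointer_width_bits() -> u32 {
    usize::BITS
}

/// The same fact, read from the compile-time `cfg` instead.
pub fn is_32_bit_target() -> bool {
    cfg!(target_pointer_width = "32")
}
```

Building this for the target in question and printing `pointer_width_bits()` would settle the 16-bit versus 32-bit question without relying on how the chip is marketed.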


Should asm.js be included? What is the width of this arch?

You can try compiling a program to asm.js that exports a function which returns the result of core::mem::size_of::<usize>() and call it from JavaScript, printing to the console to find out. You could treat it as unknown for now, though, if you don't want to go through doing all that.


Rust does not have i386-xxxxxx toolchains for the OSes we are going to support

Rust could add support at any time for currently unsupported architectures, so we should try to "support" them even though it can't be tested.


Some variants of Arch may never be constructed

I will have to think more about that, it should be fine to leave it for now but some testing with QEMU might be a future task to look at.

SteveLauC commented 1 year ago

Should asm.js be included? What is the width of this arch?

You can try compiling a program to asm.js that exports a function which returns the result of core::mem::size_of::<usize>() and call it from JavaScript, printing to the console to find out. You could treat it as unknown for now, though, if you don't want to go through doing all that.

I just gave it a try:

// main.rs

fn main() {
    println!("The size of <usize>: {} (in bytes)", core::mem::size_of::<usize>());
}

# env

$ emcc --version
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.26 (8eaf19f1c6a9a1b0cd0f9a91657366829e34ae5c)

$ node --version
v14.18.2

$ /usr/bin/node --version
v18.1.0

# Chrome Version:
# Version 107.0.5304.110 (Official Build) unknown (64-bit)

# Test through nodejs

$ cargo b --target asmjs-unknown-emscripten

$ node target/asmjs-unknown-emscripten/debug/rust.js
The size of <usize>: 4 (in bytes)

$ /usr/bin/node target/asmjs-unknown-emscripten/debug/rust.js
The size of <usize>: 4 (in bytes)

# Test through Chrome

$ cat rust.html
<html>
    <head>
        <script src="rust.js"></script>
    </head>
</html>

# Open `rust.html` in Chrome
# Open console to see the output

[Screenshot from 2022-11-26 07-28-43]

All the platforms (nodejs14/nodejs18/Chrome) report 4 bytes, seems like we should add a variant Bits16 to Width.


Some variants of Arch may never be constructed

I will have to think more about that, it should be fine to leave it for now but some testing with QEMU might be a future task to look at.

Yeah, this is also my concern. We should add some tests for this API. Currently, it is only tested on x86_64-unknown-linux-gnu, let me try some other UNIX platforms through the CI of my repo:

  1. x86_64-unknown-freebsd-12
  2. x86_64-unknown-freebsd-13
  3. aarch64-apple-darwin

AldaronLau commented 1 year ago

All the platforms (nodejs14/nodejs18/Chrome) report 4 bytes, seems like we should add a variant Bits16 to Width.

4 bytes would be 32 bits, so should correspond to Bits32
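The conversion from a `size_of::<usize>()` result to a width can be sketched as below. This is only an illustration: the reduced `Width` enum and the function name `width_from_usize_bytes` are assumptions, not the PR's code.

```rust
/// Word width (illustrative subset of variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Width {
    Bits32,
    Bits64,
}

/// Map a `size_of::<usize>()` result, given in bytes, to a width.
/// Returns `None` for sizes this sketch does not cover.
pub fn width_from_usize_bytes(bytes: usize) -> Option<Width> {
    // One byte is 8 bits, so 4 bytes is 32 bits and 8 bytes is 64 bits.
    match bytes * 8 {
        32 => Some(Width::Bits32),
        64 => Some(Width::Bits64),
        _ => None,
    }
}
```

With the 4 bytes reported above, `width_from_usize_bytes(4)` yields `Some(Width::Bits32)`.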

SteveLauC commented 1 year ago

All the platforms (nodejs14/nodejs18/Chrome) report 4 bytes, seems like we should add a variant Bits16 to Width.

4 bytes would be 32 bits, so should correspond to Bits32

God, this is awkward, sorry about this, I am pretty dizzy now🥲

AldaronLau commented 1 year ago

All the platforms (nodejs14/nodejs18/Chrome) report 4 bytes, seems like we should add a variant Bits16 to Width.

4 bytes would be 32 bits, so should correspond to Bits32

God, this is awkward, sorry about this, I am pretty dizzy now

It's all good, no worries!

SteveLauC commented 1 year ago

I have applied every suggestion, but this is not ready to be merged yet. This PR has been tested and works on:

But on FreeBSD, it does not:

use std::env::consts::ARCH;
use whoami::{arch, Arch};

fn whoami_test() {
    println!("whoami_test:");
    // Architecture the binary was compiled for, for comparison.
    println!("std::env::consts::ARCH: {}", ARCH);
    let arch = arch();
    println!("{:?}", arch);
    println!("{:?}", arch.width());
    println!();
}

fn main() {
    whoami_test();
}

$ uname -m
amd64

$ cargo r -q
whoami_test:
std::env::consts::ARCH: x86_64
Unknown("")
Err(Custom { kind: InvalidData, error: "Calling width() on an unknown arch () is invalid" })

I don't have a FreeBSD VM set up and can't configure one right now, so I have to use Cirrus CI to test this. I will add some debugging commits until I find out what is wrong with the FreeBSD implementation. There will be no API changes, just some dbg!() statements, so you don't need to check them out (if you are satisfied with the current implementation).

Once this is done, I will ping you for the final review :)

SteveLauC commented 1 year ago

This is kinda weird, it just worked:

// whoami/unix.rs

pub fn arch() -> Arch {
    // Ask the kernel for system information via uname(2).
    let mut buf = UtsName::default();
    let result = unsafe { uname(&mut buf as *mut UtsName) };
    if result == -1 {
        return Arch::Unknown("uname(2) failed to execute".to_owned());
    }

    // `machine` is a NUL-terminated C string holding the hardware name.
    let arch_str = unsafe { CStr::from_ptr(buf.machine.as_ptr()) }
        .to_str()
        .unwrap();
    // Temporary debug output.
    println!("DBG: arch_str  {}", arch_str);
    println!("DBG: arch_str bytes {:?}", arch_str.as_bytes());

    Arch::from_str(arch_str)
}

// test/main.rs

fn whoami_test() {
    println!("whoami_test:");
    println!("std::env::consts::ARCH: {}", ARCH);
    let arch = arch();
    println!("arch: {:?}", arch);
    println!("width: {:?}", arch.width());
    println!();
}

fn main() {
    whoami_test();
}

$ uname -m
amd64

$ cargo r -q
whoami_test:
std::env::consts::ARCH: x86_64
DBG: CStr "amd64"
DBG: CStr bytes [97, 109, 100, 54, 52]
arch: X86_64
width: Ok(Bits64)
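The run above shows uname -m printing amd64 while std::env::consts::ARCH is x86_64, so any lookup based on uname needs to accept more than one spelling per architecture. A minimal sketch of such an alias table follows; the function name `arch_from_uname` and the reduced `Arch` enum are hypothetical, and this is not the PR's `Arch::from_str`.

```rust
/// CPU architecture (hypothetical, reduced variant list).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Arch {
    Aarch64,
    I386,
    X86_64,
    /// Unrecognized machine string, kept verbatim.
    Unknown(String),
}

/// Translate a `uname -m` string into an `Arch`, accepting known aliases.
pub fn arch_from_uname(machine: &str) -> Arch {
    match machine {
        // FreeBSD reports "amd64" where Linux reports "x86_64".
        "amd64" | "x86_64" => Arch::X86_64,
        // macOS reports "arm64" on Apple silicon.
        "arm64" | "aarch64" => Arch::Aarch64,
        "i386" => Arch::I386,
        other => Arch::Unknown(other.to_owned()),
    }
}
```

Keeping the unrecognized string inside `Unknown` makes a failed lookup easy to diagnose: the earlier `Unknown("")` output pointed straight at an empty machine string.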

Platforms that have been tested:


I just squashed the commits into a single one. I think it is ready for the final review :)

SteveLauC commented 1 year ago

The requested changes are done and the tests pass :)

SteveLauC commented 1 year ago

I have updated README.md, lib.rs, CHANGELOG.md, and Cargo.toml to get this ready for release!