kawamuray / linux-taskstats-rs

Rust interface for Linux taskstats
MIT License
9 stars 7 forks source link

Library panics when copying a slice #1

Closed bvaisvil closed 2 years ago

bvaisvil commented 2 years ago

Hiya,

Thanks for the work on this library. I'm looking to use it for my project, zenith.

It appears that the library panics here:

impl From<&[u8]> for TaskStats {
    fn from(buf: &[u8]) -> Self {
        let mut inner_buf = [0u8; TASKSTATS_SIZE];
        inner_buf.copy_from_slice(buf); // <---

This happens either using the library directly or using your 'taskstats' program. I added some additional logging that confirmed the culprit is the statement I highlighted above.

[DEBUG linux_taskstats::netlink] Sending nl cmd: type=30, genl_cmd=1, nla_type=1 nla_data.len=4 [DEBUG linux_taskstats::netlink] Sending msg of size=28 [DEBUG linux_taskstats::netlink] Received msg: size=380, type=30, nlmsg_len=380 [DEBUG linux_taskstats] Received TASKSTATS_TYPE_PID [DEBUG linux_taskstats::model] TASKSTATS_SIZE: 328 [DEBUG linux_taskstats::model] COPYING [ERROR zenith] thread '<unnamed>' panicked at 'source slice length (344) does not match destination slice length (328)', /rustc/c8dfcfe046a7680554bf4eb612bad840e7631c4b/library/core/src/slice/mod.rs:3058:13

Adding a u128 padding to the end of taskstats struct makes it the right size. The code then compiles and works enough I don't have any crashes. But I haven't verified the values are still correct.

pub struct taskstats {
    pub version: __u16,
    pub ac_exitcode: __u32,
    pub ac_flag: __u8,
    pub ac_nice: __u8,
    pub cpu_count: __u64,
    pub cpu_delay_total: __u64,
    pub blkio_count: __u64,
    pub blkio_delay_total: __u64,
    pub swapin_count: __u64,
    pub swapin_delay_total: __u64,
    pub cpu_run_real_total: __u64,
    pub cpu_run_virtual_total: __u64,
    pub ac_comm: [::std::os::raw::c_char; 32usize],
    pub ac_sched: __u8,
    pub ac_pad: [__u8; 3usize],
    pub __unused_padding: u32,
    pub ac_uid: __u32,
    pub ac_gid: __u32,
    pub ac_pid: __u32,
    pub ac_ppid: __u32,
    pub ac_btime: __u32,
    pub ac_etime: __u64,
    pub ac_utime: __u64,
    pub ac_stime: __u64,
    pub ac_minflt: __u64,
    pub ac_majflt: __u64,
    pub coremem: __u64,
    pub virtmem: __u64,
    pub hiwater_rss: __u64,
    pub hiwater_vm: __u64,
    pub read_char: __u64,
    pub write_char: __u64,
    pub read_syscalls: __u64,
    pub write_syscalls: __u64,
    pub read_bytes: __u64,
    pub write_bytes: __u64,
    pub cancelled_write_bytes: __u64,
    pub nvcsw: __u64,
    pub nivcsw: __u64,
    pub ac_utimescaled: __u64,
    pub ac_stimescaled: __u64,
    pub cpu_scaled_run_real_total: __u64,
    pub freepages_count: __u64,
    pub freepages_delay_total: __u64,
    pub __unused_padding2: u128,
}

Does the taskstats struct vary sizes significantly between kernels? What would be the best way to solve this issue?

Thanks!
kawamuray commented 2 years ago

Hi bvaisvil,

thanks for reporting the issue, and sorry for delay in response. I was in vacation.

Does the taskstats struct vary sizes significantly between kernels? What would be the best way to solve this issue?

so yeah, the reason I decided to use hardcoded copy of struct taskstats is explained in this comment: https://github.com/kawamuray/linux-taskstats-rs/blob/master/src/lib.rs#L145

Ideally we could just use the bindgen generated taskstats struct so that it will be consistent with the struct defined in building environment, making it portable among different versions of linux.

As written in the comment, I was expecting the definition of struct taskstats does not change often, but apparnetly at least few fields has already added in upstream since then: https://github.com/torvalds/linux/blob/5bfc75d92efd494db37f5c4c173d3639d4772966/include/uapi/linux/taskstats.h#L170 :(

but seeing this situation, I'm start thinking that I should rather prefer to use the generated struct even though the old clang issue hasn't yet solved.

btw your project - zenith - looks really cool 👍

bvaisvil commented 2 years ago

No worries, I hope you had a great vacation! Thanks for the kind comments about zenith!

I ended up working around the issue by only copying part of the slice:

inner_buf.copy_from_slice(&buf[..TASKSTATS_SIZE]);

This compiled on all the kernels I tried (various 4s and 5s). Now, if someone tried to compile on a kernel with a smaller taskstats this wouldn't work and any new fields wouldn't be present.

Since my primary interest was the io_delay and swap_delay fields this worked great for me.

Screen Shot 2021-09-28 at 10 52 55 AM

If this is acceptable I could do a PR with this solution (at least until you have time to add bindgen) ...

Thanks again for the work on the library!

kawamuray commented 2 years ago

only copying part of the slice:

Yeah that's what I tried too, and it works indeed.

Interestingly even though I use the platform specific definition of taskstats struct, size still doesn't match. I looked into kernel source code briefly and turns out that it's likely due to memory alignment, (352 bytes vs 328 bytes (struct size), where 352 is 328 + 32(nlattr header size) aligned to 64 = 384 minus 32 (nlattr header size), still not 100% sure tho.

Anyway, since most of the interested fields are already defined in the existing taskstats struct, I think it's reasonable to workaround this issue by limiting source slice size like you did. Looking forward a PR from you then :)

kawamuray commented 2 years ago

Fixed by #2