afranchuk commented 1 year ago

procfs only compiles on linux and android. For our use case, we only use procfs for parsing files which were previously stored; there's no need to limit platforms in this case. We'd like to be able to use this subset of cross-platform-capable features, but currently cannot.

I began some changes to allow support on other platforms, however there are some design decisions for which I wanted feedback. FWIW, there are only a handful of things that aren't macos-compatible currently. There are more things that aren't windows-compatible, though mostly it's anything using rustix::fs, otherwise most of the incompatibilities have known fixes (e.g. OsStr conversions). It might actually be feasible to only exclude these (rather than wholesale excluding most stuff).

There are a few ways to achieve this. Below I enumerate them, where the term "local-procfs" is used to designate systems which (should) have a procfs mounted. It could just as easily be "native-procfs" or something else.

Options

Change build.rs to provide a cfg(local-procfs) or an equivalent feature flag rather than failing the build.
Add a feature flag for local-procfs, remove build.rs, and enforce with a compile_error! that it isn't used on unsupported platforms. TBH the crate failing to compile otherwise would basically be error enough, though having a better error message is nice.
Separate the procfs type definitions and basic parsers to another crate.

Notes

(1) is a bit inflexible only because some people might want to use the crate only for cross-platform purposes, in which case there's no reason to compile all the other stuff when targeting linux/android.
(1) and (2) are both a bit cumbersome to implement, because unfortunately there's no way to tell rustc to compile only stuff with a particular cfg designation, so you have to go through all existing code and tag it (and remember to do so for future code, too). This could be eased significantly by separating the code into distinct modules (so that you can simply include/exclude whole sets of implementations), but that separation would come at the cost of splitting many existing files in two (either in the filesystem or within the files with mods).
(3) is essentially similar to the end of the last note (making distinct modules for the support), though it's a bit more formal in separation.

If this isn't something that's desirable for the procfs crate, we may have to use something else (though perhaps we can work around it by changing how some other tools store data). However we do have a decent amount of things to read from procfs files, so it'd be a shame to not be able to utilize the code you've created.

eminence commented 1 year ago

Let me think about this some more, but the first idea that comes to mind is basically to make the functions that read /proc/ conditional to linux/android, and leave the other functions available on all platforms. Using LoadAverage as an example, the LoadAverage::new() function would be available only on linux/android, but LoadAverage::from_reader(impl Read) would be available on all platforms.

I guess this is like your option (3), but it keeps everything in the same crate. My idea might fail the "principle of least astonishment" however, by having a fairly different API on different platforms.

I have no objection to splitting out a "procfs-parsing" or "procfs-core" that is pure parsing (no IO), but I'm wondering if some carefully sprinkled #[cfg(unix)] attrs might be easier

At this point, you've spent more time trying this than I have (which is none :grin:), so let me know what you think

afranchuk commented 1 year ago

I had two concerns with that approach:

that it might be a lot of work to determine and/or separate those parts out (perhaps with more problem areas than is obvious), since reading file paths are logical errors rather than compilation errors, and
that people adding new code would forget to add such annotations (though hopefully CI could catch that).

That being said, in my cursory attempts to try compilation for macos and windows, there were only a few things that were hard-stop compilation errors. Obviously the logical runtime errors with reading /proc files when they don't exist should be nipped earlier at compile time for user convenience, but even if they weren't, such errors wouldn't be that bad --- it's okay to assume users know what procfs is in the first place! I.e., they shouldn't expect such functions to work on systems without procfs mounted somewhere.

A feature flag may be more desirable than detecting the OS for users who don't want to bother compiling code they don't want. That's where having a separate crate might be more direct, though splitting the crate in two will be more work.

I guess my concerns aren't that strong since at least the first one will apply to any solution here: a separation has to occur, so no matter what the organization approach taken, all the code has to be read and analyzed to determine whether it is platform-agnostic or not. :shrug:

My idea might fail the "principle of least astonishment" however, by having a fairly different API on different platforms.

I think having a different API on different platforms is okay; there are many crates that work this way, and people who target multiple platforms are probably used to that possibility. But it does make it clearer when such functional differences are separated by crate boundaries.

One advantage to putting the extra effort into creating a separate crate is that the two crates would each have a well-defined purpose. One for parsing, the other for accessing the running system. But maybe "accessing the running system" is such a thin/boilerplate layer that a separate crate is unnecessary (not that there's really much overhead there).

eminence commented 1 year ago

I think we can do this without a massive amount of work, at least in the sort term, by just not compiling all the stuff that is tricky

A few things I've noticed so far:

When parsing the Stat structure, we look at the current kernel to figure out if we should read a field or not. Since new fields in new kernels are only added at the end of the line, we can stop reading the line as soon as we find the first missing field. We don't actually need to know the current kernel
Parsing the CpuTime struct requires knowing the current ticks-per-second in order to convert ticks into second. We can just skip the functions that use the tps

I've just opened a draft PR for this.

I think it would be worth opening another draft PR that shows how the parsing code might be pulled out into a separate crate. It doesn't have to move everything. Maybe we can just focus on the stuff you care about for rust-minidump?

tatref commented 1 year ago

I might be interested in this too!

I created a tool (see https://github.com/tatref/linux-mem/tree/master/proc_snap) some time ago to dump /proc, at the moment it dumps mainly /proc/<pid>. For example it will create a sparse file for /proc/kpageflags or /proc/<pid>/pagemap

I'll give a try to your PR on Windows

tatref commented 1 year ago

Could we provide sensible defaults for values like ticks_per_second or page_size? Or allow the user to configure these values?

Or wrap everything into a Procfs struct that will reference the custom path? With the added benefit that we could parse multiple values without specifying the custom path for each call

let my_procfs = procfs::from_root("/some/path/proc")
    .with_ticks_per_second(100)
    .build():
my_procfs.kernel_stats.new();

I tried to transfer sparse files from Linux to Windows, but this is proving more difficult than anticipated

afranchuk commented 1 year ago

I'd think if they were retained in the API at all, they should be configurable and required inputs (it would be up to the user to store them when they store procfs).

afranchuk commented 1 year ago

@eminence would you like me to take a stab at that draft PR which splits things into two crates? And then you can decide which approach to take?

eminence commented 1 year ago

If you're up for it, I would be grateful. Even just the bare minimum (converting the repo to a workspace, creating a new procfs-core create and adding in just 1 parsing function to it) would be valuable. If not, not worries at all, I'll start a draft of this on the upcoming weekend.

afranchuk commented 1 year ago

257 covers this.

afranchuk commented 1 year ago

@eminence any plans for tagging a new version (which would be the initial pubilsh of procfs-core)?

eminence commented 1 year ago

There are some documentation additions I want to make before a release. I'm not sure how long that will take. But if it's useful to you, I can immediately release a "RC1" version of both procfs and procfs-core

tatref commented 1 year ago

@afranchuk in the meantime, you can use this in your Cargo.toml:

procfs = { git = "https://github.com/eminence/procfs" }

It is possible to specify a branch with branch = "name" or a specific commit with rev = "abcdef"

eminence / procfs

Support (limited) use on other platforms #255

Options

Notes

257 covers this.