rust-unofficial / patterns

A catalogue of Rust design patterns, anti-patterns and idioms
https://rust-unofficial.github.io/patterns/
Mozilla Public License 2.0
7.87k stars 357 forks source link

Revise the FFI chapters #190

Closed Michael-F-Bryan closed 3 years ago

Michael-F-Bryan commented 3 years ago

This is a continuation of #106 which addresses some of the items pointed out in the review.

Fixes #171.

Michael-F-Bryan commented 3 years ago

@jhwgh1968 how do you think I should approach the CString example in the "accepting strings" chapter?

I feel like the example is too contrived to show how you might use CString. It essentially reduces down to

fn mylib_log(msg: &str) {
  let msg: String = msg.to_owned();
  crate::log(&msg, ...);
}

... which isn't really code that people would write.

From my experience you don't really use CString when accepting null-terminated strings. Normally you'd keep them around as a &CStr or convert to a String with something like CStr::from_ptr(...).to_str().unwrap().to_owned().

The only example I can think of is when you need to store the string somewhere and keep it around after the function ends, but even then I'd use CStr::from_raw(...).to_owned() instead of copying to a Vec and calling CString::new(...).unwrap().

The to_owned() route also neatly side-steps the issue of not copying the null-terminator across.

Michael-F-Bryan commented 3 years ago

There are actually two orthogonal concepts that we'll want to address in the error handling chapter. One is returning the equivalent of Rust's Result<T, E>, and the other is the myriad of different conventions that C uses. The current chapter only handles Result<(), E> and doesn't talk about how you return both an error code and a value.

For example, how would write FFI bindings for something like this?

fn parse_elf(filename: *const c_str) -> Result<ElfBinary, ParseError>;

struct ElfBinary { ... }
enum ParseError { ... }

Some possible solutions are:

typedef struct ElfBinary ElfBinary;

// null on parse failed
ElfBinary *parse_elf(const char *filename);

// out parameters and result enum
typedef enum ParseResult {
  OK = 0,
  ...
} ParseResult;
ParseResult parse_elf(const char *filename, ElfBinary **parsed);

// out parameters and boolean return
boolean parse_elf(const char *filename, ElfBinary **parsed);

// errno-style and null
extern int errno;
const char *last_error_msg();
ElfBinary *parse_elf(const char *filename);

// everything is out parameters
typedef enum ParseResult {
  OK = 0,
  ...
} ParseResult;
void parse_elf(const char *filename, ElfBinary **parsed, ParseResult *result);
jhwgh1968 commented 3 years ago

how do you think I should approach the CString example in the "accepting strings" chapter?

I feel like the example is too contrived to show how you might use CString.

It is a little contrived, but I could not do a good translation of what I had.

My goal with that alternative was to illustrate a thought process I actually had in the beginning: always copy all pointers to a Rust-owned value ASAP, and then never touch the pointers again, no matter what type they were. That is what I wanted to discourage.

The only example I can think of is when you need to store the string somewhere and keep it around after the function ends, but even then I'd use CStr::from_raw(...).to_owned() instead of copying to a Vec and calling CString::new(...).unwrap().

I had code that did a bunch of pointer arithmetic, so I put it in. But that was because I had C code which did that, and it stayed around for quite some time before I completely rewrote it. I agree that most Rust programmers wouldn't think of something that backwards, and it might not be worth putting in.

In fact, your snippet seems like a good thing to replace my alternative with: "if you really want an owned one, use this. Don't do pointer arithmetic or null byte termination yourself, because it's error prone, and this takes care of it."

There are actually two orthogonal concepts that we'll want to address in the error handling chapter. One is returning the equivalent of Rust's Result<T, E>, and the other is the myriad of different conventions that C uses. The current chapter only handles Result<(), E> and doesn't talk about how you return both an error code and a value.

I tried to suggest (and wasn't very clear) that the "value" was what was returned by db_error_description. I forgot to put in a db_parse_error_details function to match it, which would return a parse_error struct by value in order to match it.

For example, how would write FFI bindings for something like this?

fn parse_elf(filename: *const c_str) -> Result<ElfBinary, ParseError>;

struct ElfBinary { ... }
enum ParseError { ... }

The pattern I wrote would say one of two things: if it's opaque, return it as a pointer. If it's transparent, use an out parameter and an integer code (either non/zero, or an error code number).

/* elfbin.h */
// if transparent
int elf_bin_new(const char* filename, **ElfBinary out_elf);

// if opaque, returns NULL pointer on error
*ElfBinary elf_bin_new(const char *filename);

// either way, you can get more information about the error message with this
//
// pass a pointer to an ElfBinary that was constructed to get the last error from its call.
// pass NULL pointer in order to get the last error from a global call, like "new."
const char *elf_bin_error(*ElfBinary e);

Suffice it to say, I'm trying to avoid doing what some libraries do, which is to overwrite errno, or interact with any C global state.

MarcoIeni commented 3 years ago

always copy all pointers to a Rust-owned value ASAP, and then never touch the pointers again, no matter what type they were. That is what I wanted to discourage.

This is quite nice, we want to explain it in the book?

Anyway here you are talking about another part of the book (error handling), which is not really related to this PR, right? Because if you want once we solve the comments I left we could merge this and address other problems in separate PRs/issues.

jhwgh1968 commented 3 years ago

always copy all pointers to a Rust-owned value ASAP, and then never touch the pointers again, no matter what type they were. That is what I wanted to discourage.

Anyway here you are talking about another part of the book (error handling), which is not really related to this PR, right?

I was talking about what motivated a previous example in Accepting Strings, which in retrospect was very convoluted.

It is a general idea, but Accepting Strings was where I put it, because that's where it was most notable in my early code. Strings come in as const c_char *, and I was copying them into Rust owned Strings. I knew it was a lot of work, but I thought it was "safer," when I was worried about splitting ownership of the old C code. That is what I wanted to discourage.

I do not know whether it is worth a mention in Accepting Strings or not, if it is moved to a general principle. I would approve this PR either way.

simonsan commented 3 years ago

@jhwgh1968 @Michael-F-Bryan Hey you both, any updates on this? Asking because we're probably changing the patterns subfolder structure so ffi has it's own subfolder, this would make a rebase needed probably with some conflicts to solve. Just for letting you know. ;)

Michael-F-Bryan commented 3 years ago

I'm going to close this PR because I don't have the time to complete it and didn't really want to invest much more than that initial review.

MarcoIeni commented 3 years ago

Hi Michael, I reopen this and I assign it to me. I will continue the work you have started :)

MarcoIeni commented 3 years ago

Merged this! If you think something valuable from the above comments needs to be discussed further, please open an issue or another PR!