jordanbray / chess

A rust library to manage chess move generation
https://jordanbray.github.io/chess/
MIT License
231 stars 54 forks source link

Improper use of MaybeUninit in code is undefined behaviour #51

Closed analog-hors closed 1 year ago

analog-hors commented 3 years ago

The documentation of MaybeUninit::as_mut_ptr mentions that

Reading from this pointer or turning it into a reference is undefined behavior unless the MaybeUninit<T> is initialized.

This is exactly what happens in all usages of MaybeUninit throughout the library: https://github.com/jordanbray/chess/blob/ad138579074ca1b0a88514626016d8e0b72d4ed3/src/board.rs#L864-L868 https://github.com/jordanbray/chess/blob/ad138579074ca1b0a88514626016d8e0b72d4ed3/src/movegen/movegen.rs#L229-L233 https://github.com/jordanbray/chess/blob/ad138579074ca1b0a88514626016d8e0b72d4ed3/src/movegen/movegen.rs#L256-L260 https://github.com/jordanbray/chess/blob/ad138579074ca1b0a88514626016d8e0b72d4ed3/src/movegen/movegen.rs#L264-L268

Soveu commented 3 years ago

Board contains

It is very dangerous to deref this structure without initializing it Rust assumes that Color will be either Black or White Or the tag from Option<u8> will be always either Some or None Also, LLVM can just mark the structure with undef and yeet out all the code that uses it, just because it is UB

EDIT: Also CastleRights is an enum There shouldn't be a real performance impact on initializing a ~100 bytes structure, the cost of code being (potentially) completely broken I think just outweighs it

jordanbray commented 3 years ago

There are several unsound issues here that I will take a look at this week. Don't view my lack of comments on the other issues as meaning I have ignored them.

The performance impact of initializing the Board structure is actually pretty measurable. Every instance of unsafe in this codebase is backed up by performance numbers.

That's not to say (or indicate) that the safety issue should be ignored. It shouldn't. But, the goal of this library is to generate moves as fast as possible, and this library exists in that environment.

jordanbray commented 3 years ago

@Soveu I was wrong. You said:

There shouldn't be a real performance impact on initializing a ~100 bytes structure, the cost of code being (potentially) completely broken I think just outweighs it

I responded:

The performance impact of initializing the Board structure is actually pretty measurable. Every instance of unsafe in this codebase is backed up by performance numbers.

I was wrong. Well, I was right. The compiler seems to have gotten better. By my tests, the compiler optimizes this correctly as of v 1.45. As a result, the board.make_move() function can be deprecated in favor of board.make_move_new().

I will need to verify some of this a bit more.

EDIT: by that I mean, board.make_move_new() can just use a Board::new() to do it's work, not an uninitialized board I believe.

Soveu commented 3 years ago

Well now I have similar problem when working on hyper - passing uninitialized data to a function that just writes. Here I would suggest "copy-pasting" fn make_move to some sort of fn make_move_uninit that accepts &mut MaybeUninit<Board> and promise no garbage data is written. The make_move would look like this

pub fn make_move(&self, m: ChessMove, result: &mut Board) {
  // SAFETY: make_move_uninit promises to not write garbage data
  unsafe {
    let result: *mut Board = result;
    let result = result as *mut MaybeUninit<Board>;
    return make_move_uninit(self, m, &mut *result);
  }
}
jordanbray commented 3 years ago

I could do that, but perhaps a better way would be to copy-paste the contents of make_move into make_move_new.

So, the new version would look like this:

pub fn make_move_new(&self, m: ChessMove) -> Board {
    let mut next_pos = Board::new();

    // Contents of make_move

   next_pos
}

pub fn make_move(&self, m: ChessMove, result: &mut Board) {
    *result = self.make_move_new(m);
}

The reason it isn't written this way currently, is because at the end of make_move_new, rust was copying the position, even though there's no reason for a copy there. This is an optimization called named return-value optimization, and is never promised (in any language).

However, it does seem that current rust compilers do have support for this. By my (very lazy) experiments, it seems to handle this sort of thing around 1.45.

In addition, the Board::new() in my code above should be effectively optimized out, as I believe the compiler will eliminate the unnecessary writes that happen there.

I still need to fact-check a lot of that by looking at the assembly a bit more. I only ran it on some simpler examples, not on this specific codebase while investigating, but at the moment I believe everything I have said is sound. I reserve the right to change my mind if I'm wrong, lol.

jordanbray commented 1 year ago

This has been completed.