Stenway / RSV-Challenge

RSV = Rows of String Values
https://www.stenway.com
Other
89 stars 15 forks source link

[Rust impl] A less imperative implementation with iterators? #1

Open Ciel-MC opened 10 months ago

Ciel-MC commented 10 months ago

Hi, I liked your video and wanted to check the Rust impl out, and I felt that it could be made better. Here's what I came up with, this uses thiserror for a proper error type, but a Box<dyn Error> should work fine as well.

#[derive(Debug, PartialEq, Eq, thiserror::Error)]
enum RSVError {
    #[error("incomplete document")]
    IncompleteDocument,
    #[error("incomplere row")]
    IncompleteRow,
    #[error("invalid string")]
    InvalidString(#[from] std::string::FromUtf8Error),
}

fn decode_rsv(bytes: &[u8]) -> Result<Vec<Vec<Option<String>>>, RSVError> {
    macro_rules! assert_last_byte_if_exists {
        ($value:expr, $expected:expr, $error:expr) => {
            if !matches!($value.last(), Some($expected) | None) {
                return Err($error);
            }
        };
    }
    assert_last_byte_if_exists!(bytes, 0xFDu8, RSVError::IncompleteDocument);
    bytes
        .split(|b| *b == 0xFD)
        .map(|line| {
            assert_last_byte_if_exists!(line, 0xFFu8, RSVError::IncompleteRow);
            line.split(|b| *b == 0xFF)
                .map(|field| {
                    Ok(match field {
                        [0xFE] => None,
                        bytes => Some(String::from_utf8(bytes.to_vec())?),
                    })
                })
                .collect()
        })
        .collect()
}

fn encode_rsv(fields: &[&[Option<String>]]) -> Vec<u8> {
    fields
        .iter()
        .flat_map(|&line| {
            line.iter()
                .map(Option::as_ref)
                .flat_map(|field| {
                    field
                        .map_or(&[0xFE][..], |item| item.as_bytes())
                        .iter()
                        .chain(once(&0xFF))
                })
                .chain(once(&0xFD))
        })
        .copied()
        .collect()
}

I also notice that methods that already return R<T, B> doesn't catch IO errors, like

fn load_rsv(file_path: &str) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    let bytes = fs::read(file_path).expect("Could not load RSV file");
    return decode_rsv(bytes);
}

Could just be

fn load_rsv(file_path: &str) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    Ok(decode_rsv(fs::read(file_path)?)?) // Second ? needed assuming decode_rsv is using a concrete error type
}
ParkSnoopy commented 10 months ago

Using dyn increments runtime cost. Since we can predefine possible errors, no other error possibly returned by this procedure, it don’t need to use dyn as far as I think, maybe adding ‘InvalidFile’ error to enum is enough? Also, dyn error::Error makes error type less expressive.

Anyways, rust is not my primary language, possibly Im wrong, though