Stenway / RSV-Challenge

RSV = Rows of String Values
https://www.stenway.com
Other
90 stars 15 forks source link

[Rust impl] A less imperative implementation with iterators? #1

Open Ciel-MC opened 9 months ago

Ciel-MC commented 9 months ago

Hi, I liked your video and wanted to check the Rust impl out, and I felt that it could be made better. Here's what I came up with, this uses thiserror for a proper error type, but a Box<dyn Error> should work fine as well.

#[derive(Debug, PartialEq, Eq, thiserror::Error)]
enum RSVError {
    #[error("incomplete document")]
    IncompleteDocument,
    #[error("incomplere row")]
    IncompleteRow,
    #[error("invalid string")]
    InvalidString(#[from] std::string::FromUtf8Error),
}

fn decode_rsv(bytes: &[u8]) -> Result<Vec<Vec<Option<String>>>, RSVError> {
    macro_rules! assert_last_byte_if_exists {
        ($value:expr, $expected:expr, $error:expr) => {
            if !matches!($value.last(), Some($expected) | None) {
                return Err($error);
            }
        };
    }
    assert_last_byte_if_exists!(bytes, 0xFDu8, RSVError::IncompleteDocument);
    bytes
        .split(|b| *b == 0xFD)
        .map(|line| {
            assert_last_byte_if_exists!(line, 0xFFu8, RSVError::IncompleteRow);
            line.split(|b| *b == 0xFF)
                .map(|field| {
                    Ok(match field {
                        [0xFE] => None,
                        bytes => Some(String::from_utf8(bytes.to_vec())?),
                    })
                })
                .collect()
        })
        .collect()
}

fn encode_rsv(fields: &[&[Option<String>]]) -> Vec<u8> {
    fields
        .iter()
        .flat_map(|&line| {
            line.iter()
                .map(Option::as_ref)
                .flat_map(|field| {
                    field
                        .map_or(&[0xFE][..], |item| item.as_bytes())
                        .iter()
                        .chain(once(&0xFF))
                })
                .chain(once(&0xFD))
        })
        .copied()
        .collect()
}

I also notice that methods that already return R<T, B> doesn't catch IO errors, like

fn load_rsv(file_path: &str) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    let bytes = fs::read(file_path).expect("Could not load RSV file");
    return decode_rsv(bytes);
}

Could just be

fn load_rsv(file_path: &str) -> Result<Vec<Vec<Option<String>>>, Box<dyn Error>> {
    Ok(decode_rsv(fs::read(file_path)?)?) // Second ? needed assuming decode_rsv is using a concrete error type
}
ParkSnoopy commented 9 months ago

Using dyn increments runtime cost. Since we can predefine possible errors, no other error possibly returned by this procedure, it don’t need to use dyn as far as I think, maybe adding ‘InvalidFile’ error to enum is enough? Also, dyn error::Error makes error type less expressive.

Anyways, rust is not my primary language, possibly Im wrong, though