Nullus157 / bs58-rs

Another Rust Base58 codec implementation
Apache License 2.0
75 stars 24 forks source link

Truncating behavior is confusing and forces allocations #77

Closed matklad closed 3 years ago

matklad commented 3 years ago

I expect the following test to pass:

#[test]
fn append() {
    let mut buf = "hello world".to_string();
    bs58::encode(&[92]).into(&mut buf).unwrap();
    assert_eq!("hello world2b", buf.as_str());
}

Instead, it fails, as the buf contains just "2b". That is, encoding discards existing data, rather than appending to it.

There are two problems with it:

Nemo157 commented 3 years ago

Agreed. Should be a pretty easy change and I think worth a breaking release.

Nemo157 commented 3 years ago

Would you expect the same when decoding into a &mut Vec<u8> (vs &mut [u8])?

#[test]
fn append() {
    let mut buf = b"hello world".to_owned();
    bs58::decode("a").into(&mut buf).unwrap();
    assert_eq!(b"hello world!", buf.as_ref());
}

#[test]
fn no_append() {
    let mut buf = b"hello world".to_owned();
    bs58::decode("a").into(buf.as_mut()).unwrap();
    assert_eq!(b"!ello world", buf.as_ref());
}
matklad commented 3 years ago

For Vec<u8>, I'd expect expect the same behavior as for String -- append the end.

For &mut [u8], I'd expect the same behavior as char::encode_utf8 -- overwrite the prefix, return the str slice of the data actually written.

Nemo157 commented 3 years ago

Returning an &str would require checking/asserting utf-8 validity at that point, if you're doing more ASCII-only processing on the buffer (or never actually asserting it is a string) then you might want to delay that.

matklad commented 3 years ago

Hm, I think base58 guarantees that the encoded result is utf8, so no additional validation is necessary? If this assumption is correct, that returning &mut str allows the calling code to avoid utf8-validation and bounds checking. In any case, returning just usize signifying the amount of bytes written would be fine as well. Maybe retuning usize is even better: I wager that the main benefit for char's return type is not actaully eliding the check, but just basic conveniecne for cases where you'd want to encode char to a local [u8; 4], and then do something with the resulting string.

Nemo157 commented 3 years ago

Yeah, it wouldn't need validating since the API guarantees it's ASCII, but I want to minimize the unsafe code here (currently the only unsafe code used is the bare minimum necessary to actually work with &mut str).