RustCrypto / formats

Cryptography-related format encoders/decoders: DER, PEM, PKCS, PKIX
233 stars 122 forks source link

der: Simplify Reader trait: peek_8_bytes #1279

Open dishmaker opened 8 months ago

dishmaker commented 8 months ago

Continuation from https://github.com/RustCrypto/formats/issues/1228

Reader trait contains too many methods. All reader methods should return bytes for interoperability and ease of implementation.

Currently peek_header returns der::Header which is a library struct (cyclic dependency). I think it should be read by der crate itself, not the user crate. https://github.com/RustCrypto/formats/blob/086ecd7d27aa3f6c5c97dc2a33adc7159008086b/der/src/reader.rs#L26-L30

I've simplified peek_header and peek_byte into one function.

/// Peek at most 8 bytes (3 byte tag + 5 length)
fn peek_bytes(&self) -> &[u8];

Dependency and clone

Also, peek_header is a cyclic dependecy, because reading it requires another reader. PemReader accompishes it by .clone()ing itself. That's a bad idea, because it enforced the usage of RefCell - to be clone'able.

pub struct PemReader<'i> {
    /// Inner PEM decoder wrapped in a BufReader.
    reader: RefCell<utils::BufReader<'i>>,

I my implementation there's no need for runtime borrow checks. https://github.com/dishmaker/formats/blob/master/der/src/reader/pem.rs#L132

pub struct PemReader<'i> {
    /// Inner PEM decoder wrapped in a BufReader.
    reader: utils::BufReader<'i>,

As you can see, everything is simpler with peek_bytes.

Tests

PEM works https://github.com/dishmaker/formats/commit/8a49da40fad44200e9a9b56b0d3a88e001077b89

dishmaker@dm:~/formats$ cargo test -p der 
Running tests/pem.rs (target/debug/deps/pem)

running 3 tests
test to_pem ... ok
test from_pem ... ok
test from_pem_peek ... ok
tarcieri commented 7 months ago

Reader trait contains too many methods.

If you're suggesting it has too many required methods, that's reasonable. But the existing peek_* methods should be retained as provided methods.

As for this:

/// Peek at most 8 bytes (3 byte tag + 5 length)
fn peek_bytes(&self) -> &[u8];

It's pretty unclear/counterintuitive from the name peek_bytes what the semantics are going to be, especially someone encountering the function name in source code as opposed to reading documentation. I would personally expect a peek_bytes function to take a number of bytes to peek, and if it didn't, it would be confusing as to how many it might return.

This is the reason why I didn't expose it in the first place: you're leaking semantics of how the peek buffer is implemented, and peek_bytes doesn't tell you what they are.

A name that specifically identifies the function is returning the contents of the peek buffer, e.g. peek_buffer, might be clearer, but there are magic numbers you are encoding into the function's contract.

dishmaker commented 7 months ago

there are magic numbers you are encoding into the function's contract.

I agree. The only solution i see here is naming fn peek_up_to_8_bytes

However, for most use cases, it would never return more than 4 (1 byte tag + 3 byte length).

Naming things is hard... maybe peek_tag_length ?

dishmaker commented 2 months ago

there are magic numbers you are encoding into the function's contract.

I don't see any other way to make it clearer. Such function name may seem stupid, but if it works, then it's not stupid.

/// Peek at most 8 bytes (3 byte tag + 5 length)
fn peek_8_bytes(&self) -> &[u8];