rust-lang / libs-team


ACP: Introduce checked_split_at{,_mut} methods #308

Closed mina86 closed 7 months ago

mina86 commented 7 months ago

Proposal

Problem statement

Slice indexing has a non-panicking counterpart in the get method, but split_at has none. Users who want to avoid a panic must check the length explicitly, duplicating a check that split_at already performs. This makes calling code more verbose, marginally more error-prone because the mid point is passed twice, and potentially harder to reason about with respect to panicking behaviour.
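To make the asymmetry concrete, here is a minimal sketch that uses only stable slice methods. The helper names byte_at and split_manually are hypothetical and exist only for illustration.

```rust
/// Element access already has a non-panicking form:
/// `get` returns `None` when the index is out of range.
fn byte_at(data: &[u8], index: usize) -> Option<u8> {
    data.get(index).copied()
}

/// Splitting has no such form, so the caller writes the bounds check by hand.
/// Note that `mid` appears twice: once in the check and once in the call.
fn split_manually(data: &[u8], mid: usize) -> Option<(&[u8], &[u8])> {
    if mid > data.len() {
        return None;
    }
    // `split_at` repeats the same comparison internally before splitting.
    Some(data.split_at(mid))
}
```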

Motivating examples or use cases

The proposed methods do not offer any groundbreaking feature, and the same effect can be achieved with a couple of additional statements, so the motivating examples are by no means impressive. Nonetheless, they show that codebases do occasionally need to check the length before splitting.

From nearcore:

    if key.len() < 8 {
        return Err(io::Error::other(/* ... */));
    }
    let (shard_uid_bytes, trie_key) = key.split_at(8);

Could be rewritten as:

    let (shard_uid_bytes, trie_key) = key
        .checked_split_at(8)
        .ok_or_else(|| io::Error::other(/* ... */))?;

From Solana:

    let accounts = iter.as_slice();
    if accounts.len() < count {
        return Err(ProgramError::NotEnoughAccountKeys);
    }
    let (accounts, remaining) = accounts.split_at(count);

Could be rewritten as:

    let (accounts, remaining) = iter
        .as_slice()
        .checked_split_at(count)
        .ok_or(ProgramError::NotEnoughAccountKeys)?;

From CosmWasm:

        .filter_map(|name| {
            if name.len() > REQUIRES_PREFIX.len() {
                let (_, required_capability) = name.split_at(REQUIRES_PREFIX.len());
                Some(required_capability.to_string())
            } else {
                None
            }
        })

Could be rewritten as:

        .filter_map(|name| {
            name
                .checked_split_at(REQUIRES_PREFIX.len())
                .map(|(_, required_capability)| required_capability.to_string())
        })

I’m currently working on code which looks as follows:

        // Check unused bytes after the slice.
        let (bytes, tail) = stdx::checked_split_at(bytes, bytes_len)?;
        if !tail.iter().all(|&byte| byte == 0) {
            return None;
        }

There, the bytes argument must be at least bytes_len bytes long, and tail must be all zeros.

Solution sketch

Add checked_split_at{,_mut} methods to the slice and str types. See https://github.com/rust-lang/rust/pull/118578.
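A rough sketch of the proposed behaviour, written as an extension trait so that it compiles without the linked PR. The trait name CheckedSplitAt is an assumption made for this example; the PR adds inherent methods instead, and its implementation may differ. For str, the sketch assumes the split should also fail when mid is not on a UTF-8 character boundary, which is where str::split_at panics.

```rust
/// Hypothetical extension trait standing in for the proposed inherent methods.
trait CheckedSplitAt {
    /// Returns `None` instead of panicking when `mid` is not a valid split point.
    fn checked_split_at(&self, mid: usize) -> Option<(&Self, &Self)>;
    /// Mutable counterpart of `checked_split_at`.
    fn checked_split_at_mut(&mut self, mid: usize) -> Option<(&mut Self, &mut Self)>;
}

impl<T> CheckedSplitAt for [T] {
    fn checked_split_at(&self, mid: usize) -> Option<(&Self, &Self)> {
        // `mid == len` is valid and yields an empty second half.
        (mid <= self.len()).then(|| self.split_at(mid))
    }

    fn checked_split_at_mut(&mut self, mid: usize) -> Option<(&mut Self, &mut Self)> {
        if mid <= self.len() {
            Some(self.split_at_mut(mid))
        } else {
            None
        }
    }
}

impl CheckedSplitAt for str {
    fn checked_split_at(&self, mid: usize) -> Option<(&Self, &Self)> {
        // `is_char_boundary` is false past the end and inside a multi-byte character.
        self.is_char_boundary(mid).then(|| self.split_at(mid))
    }

    fn checked_split_at_mut(&mut self, mid: usize) -> Option<(&mut Self, &mut Self)> {
        if self.is_char_boundary(mid) {
            Some(self.split_at_mut(mid))
        } else {
            None
        }
    }
}
```

With the trait in scope, the nearcore example above reads as key.checked_split_at(8).ok_or_else(...)?, with no separate length check.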

Alternatives

  1. Do nothing. The feature can be achieved with (mid <= slice.len()).then(|| slice.split_at(mid)). However, prior art for convenience non-panicking methods exists (namely the get method) and, at least to me, split_at falls within the same category.
  2. Rather than checked_split_at, add a clamping_split_at which would clamp the mid point to the length of the slice. This is, however, less versatile.
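The difference between the two alternatives can be sketched with free functions. The function names follow the proposal, but both bodies are illustrative assumptions rather than code from the PR.

```rust
/// Alternative 1 wrapped in a function: `None` when `mid` is past the end.
fn checked_split_at<T>(slice: &[T], mid: usize) -> Option<(&[T], &[T])> {
    (mid <= slice.len()).then(|| slice.split_at(mid))
}

/// Alternative 2: never fails, because `mid` is clamped to the slice length.
fn clamping_split_at<T>(slice: &[T], mid: usize) -> (&[T], &[T]) {
    slice.split_at(mid.min(slice.len()))
}
```

With clamping, a caller cannot tell a too-short input from an exact fit without comparing lengths again, which is the very check the proposal tries to remove. The clamped result can be derived from the checked one (fall back to the whole slice and an empty tail on None), but not the other way around.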

Links and related work

This was mentioned in an old thread though the discussion fizzled out without any conclusion.

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

Second, if there's a concrete solution:

pitaj commented 7 months ago

Why checked_split_at vs split_at_checked?

mina86 commented 7 months ago

I followed naming of integer methods, e.g. checked_add. Another alternative is try_split_at.

pitaj commented 7 months ago

Well, there's already split_at_unchecked, though the naming there could change since it's unstable.

tgross35 commented 7 months ago

+1 for split_at_checked over checked_split_at: related names are much more discoverable if they start with the same thing as their root function.

Or try_split_at

jdahlstrom commented 7 months ago

Arithmetic types have modifier_verb methods (checked_add, overflowing_add) but the slice API is consistently verb_modifier (get_unchecked, chunks_exact, sort_unstable) so the latter convention seems to be the way to go.

m-ou-se commented 7 months ago

We briefly discussed this in last week's libs-API meeting. We're happy to see this as an unstable feature. The exact name can be figured out as an open question on the tracking issue before stabilization.

Feel free to open a tracking issue and add it to your PR. Thanks!

mina86 commented 7 months ago

@m-ou-se, great news, thanks! Quick question though: https://github.com/rust-lang/rust/pull/118578 adds the methods to [T] and str and uses two different features (slice_checked_split_at and str_checked_split_at). Should I a) merge those two features into one, b) file a single tracking issue for both, or c) file two tracking issues?

Edit: Went with a).