Vectorized / solady

Optimized Solidity snippets.
MIT License
2.51k stars 339 forks source link

✨ Add `is7BitASCII(s, allowed)` to LibString #963

Closed atarpara closed 3 months ago

atarpara commented 3 months ago

Description

closing #962

Checklist

Ensure you completed all of the steps below before submitting your pull request:

Pull requests with an incomplete checklist will be thrown out.

0xCLARITY commented 3 months ago

Left a more detailed comment on the issue: https://github.com/Vectorized/solady/issues/962#issuecomment-2176574758, but I think it would be surprising for spaces to be considered "alphanumeric".

Vectorized commented 3 months ago

Refactored to make it more general.

Will merge in 24 hours if no hard objections.

For compile time evalaution:

function is7BitASCIIAlphaNumericWithSpace(string memory s) internal pure returns (bool result) {
    return LibString.is7BitASCII(
        s, 
        LibString.ALPHANUMERIC_7_BIT_ASCII | uint128(1 << uint256(uint8(bytes1(" "))))
    );
}

Btw, we have a LibString.escapeHTML(s).

atarpara commented 3 months ago

@Vectorized In the current implementation of the is7BitASCII(string memory s, uint128 allowed) function, The behavior for empty strings is to return true in all cases. This is inconsistent with the behavior observed in similar functions across various programming languages.

For instance:

Python isalnum Rust's is_ascii_alphanumeric Govalidator's alphanumeric

In these implementations, the function typically returns false for an empty string. To align with these standards and ensure consistent behavior, I propose the following adjustments:

  1. The function will return false for empty strings, aligning with the default behavior in other languages.
  2. If allowing empty strings is necessary for specific use cases, a separate function can be created. Alternatively, developers can handle this explicitly by checking the length of the string using bytes(str).length == 0.

Everyone is open to give your valuable suggestions.

Vectorized commented 3 months ago

Let's deviate away from the popular languages.

For this example, it is logically closer to:

Python

all(c in "0123456789" for c in "") # True

Rust

fn is_alphanumeric(s: &str) -> bool {
    s.chars().all(|c| c.is_alphanumeric())
}

Note that Rust's is_alphanumeric operates on characters, not strings.

Rust's philosophy is to provide the most fundamental building blocks which can be composed together via terse syntax.

Due to limited space in Solady, it would be better to lean towards Rust's philosophy.

Also, to keep consistency with LibString.is7BitASCII("") == true. Making this false will feel weird.

Python's isalnum and Govalidator's alphanumericapproach the problem from a HTML input or regex perspective.

isalnum is a super ancient method in Python since the 1.6 days, and the programming paradigm back then was to provide functions that are specialized for HTML input validation instead of for general purpose.

See: https://www.php.net/manual/en/function.ctype-alnum.php