Use String#byteindex instead of String#index

shugo commented 1 month ago

Because String#index takes O(N), where N is the second argument, while String#byteindex takes O(1).

shugo commented 1 month ago

We should wait until Ruby 3.1 gets EOL.

nevans commented 1 month ago

@shugo I added it to a new v0.6 milestone. 🙂

admsev commented 1 month ago

@gooroodev WDYT?

gooroodev commented 1 month ago

Thanks, @admsev! Happy to help!

1. Summary of Changes

The pull request makes the following changes:

Replaces instances of String#index with String#byteindex.
Replaces instances of $~.end(0) with $~.byteoffset(0)[1].

2. Issues, Bugs, and Typos

Issue 1: Inconsistent Handling of Return Values

The original code uses String#index which returns the character index, while the new code uses String#byteindex which returns the byte index. This change can affect the handling of multibyte characters. However, the code seems to handle byte offsets correctly by using $~.byteoffset(0)[1].

Issue 2: Typo in Comment

The comment "an 'accept' versiun of #text" contains a typo. The word "versiun" should be "version".

Proposed Fix:

# an "accept" version of #text

Issue 3: Consistency in Method Naming

The method accept_spaces uses byteindex while other methods like next_token still use index in some places. For consistency, all instances should be updated to use byteindex.

Proposed Fix: Replace all instances of index with byteindex in the next_token method.

3. General Review of Code Quality and Style

Code Quality

Readability: The code is generally readable and well-commented. The use of regular expressions is appropriate for the task.
Performance: The switch to byteindex can improve performance in handling multibyte strings, which is crucial for IMAP responses that may contain multibyte characters.
Error Handling: The code has appropriate error handling with parse_error calls to manage unexpected tokens.

Style

Consistency: The code style is consistent with Ruby conventions. Method names are clear and follow the snake_case convention.
Documentation: The code is well-documented with comments explaining the purpose of each method and the meaning of regular expressions.

Proposed Code Improvements

Improvement 1: Consistent Use of `byteindex`

Current Code:

@str.index(/\S*/n, @pos)

Improved Code:

@str.byteindex(/\S*/n, @pos)

Improvement 2: Fix Typo in Comment

Current Code:

# an "accept" versiun of #text

Improved Code:

# an "accept" version of #text

Improvement 3: Handle Multibyte Characters Consistently

Ensure all instances of index are replaced with byteindex where necessary to handle multibyte characters correctly.

Conclusion

The pull request effectively updates the code to use String#byteindex instead of String#index, which is a positive change for handling multibyte characters. The proposed improvements should be implemented to maintain consistency and fix minor issues. Overall, the code quality and style are good, and the changes are well-justified.

Yours, Gooroo.dev. I'd love to hear your feedback! React or reply.

nevans commented 1 month ago

@admsev @gooroodev please don't. It's obnoxious and not very helpful.

ruby / net-imap