python / peps

Python Enhancement Proposals
https://peps.python.org
4.42k stars 1.52k forks source link

PEP XXXX: Add `replace` method to list #4075

Closed PythonSJL closed 1 day ago

PythonSJL commented 1 day ago

Title: PEP XXXX: Add replace method to list

Description: This Issue proposes the addition of a method to Python's built-in list type that allows for the replacement of subsequences within a list. The proposed method name is replace, but alternative names are welcome if there is a concern about ambiguity.

Motivation: This functionality is commonly required and would simplify coding tasks by avoiding custom implementations or third-party libraries.

Use Cases:

Proposal: The detailed proposal is included in the attached file pep-xxxx-add-replace-method-to-list.rst.

Compatibility: The proposal is backward-compatible.

Discussion Points:

Feedback Requested: Community feedback is welcome, especially regarding the method name and any suggestions for improvement.

Tags: PEP, Draft, Type: Standards Track

(If the file was not successfully uploaded, please use the following content instead)

PEP: XXXX

Title: Add replace method to list

Author: Jiayi Shen 2261748025@qq.com

Status: Draft

Type: Standards Track

Created: 2024-10-21

Abstract

This PEP proposes adding a new method replace to the built-in list type in Python. This method will allow replacing occurrences of a subsequence in a list with another subsequence, optionally limiting the number of replacements. This functionality is commonly needed and adding it to the built-in list type would simplify code and improve readability.

Motivation

Replacing subsequences in a list is a task that frequently arises in various programming scenarios. Currently, this requires writing custom functions or using third-party libraries, which can be cumbersome and less efficient. By adding this functionality directly to the built-in list type, we can streamline common operations and make Python more user-friendly.

Use Cases

  1. Data Cleaning: Removing or replacing invalid data sequences in a list.
  2. Text Processing: Replacing specific patterns in a list of characters or strings.
  3. Algorithm Implementation: Simplifying the implementation of algorithms that require subsequence replacement.

Specification

The replace method will have the following signature:

.. code-block:: python def replace(self, old: list, new: list, count: int = -1) -> list: """ Replace occurrences of the subsequence 'old' in the list with 'new'. :param old: The subsequence to be replaced. :param new: The subsequence to replace with. :param count: The maximum number of replacements to perform. Default is -1 (unlimited). :return: The modified list with replacements. """

Behavior

Implementation

The reference implementation of the replace method uses a modified version of the Knuth-Morris-Pratt (KMP) string matching algorithm to efficiently find and replace occurrences of the subsequence old in the list seq. The KMP algorithm is chosen for its efficiency in handling cases where the subsequence to be replaced may have overlapping patterns. The KMP algorithm preprocesses the old subsequence to create a "partial match" table (also known as the "failure function"), which is used to skip unnecessary comparisons. When a mismatch occurs, the table indicates the next position in the old subsequence to continue the comparison, thus avoiding redundant checks.

Reference Python Implementation

.. code-block:: python def replace(seq: list, old: list, new: list, count: int = -1) -> list: if not isinstance(seq, list): raise TypeError("Parameter 'seq' must be a list, got {}".format(type(seq).name)) if not isinstance(old, list): raise TypeError("Parameter 'old' must be a list, got {}".format(type(old).name)) if not isinstance(new, list): raise TypeError("Parameter 'new' must be a list, got {}".format(type(new).name)) if not old: raise ValueError("Parameter 'old' must not be empty") if not isinstance(count, int) or count < 0 and count != -1: raise ValueError("Parameter 'count' must be a non-negative integer or -1, got {}".format(count)) if count == 0: return seq[:] old_len = len(old) table = [0] * old_len pos, cnd = 1, 0 while pos < old_len: if old[pos] == old[cnd]: cnd += 1 table[pos] = cnd pos += 1 elif cnd > 0: cnd = table[cnd - 1] else: table[pos] = 0 pos += 1 result = [] i, j = 0, 0 replaced = 0 seq_len = len(seq) while i < seq_len: if j == old_len: result.extend(new) replaced += 1 j = 0 if count != -1 and replaced == count: result.extend(seq[i:]) break elif seq[i] == old[j]: i += 1 j += 1 else: if j != 0: result.extend(old[:j]) j = table[j - 1] else: result.append(seq[i]) i += 1 if j == old_len: result.extend(new) elif j > 0: result.extend(old[:j]) return result

Request for C Implementation

The Python reference implementation serves as a proof of concept. However, for optimal performance, it is recommended that the replace method be implemented in C. This will allow the method to run at C speed, which is significantly faster than Python's interpreted speed, especially for large lists. The C implementation should maintain the same algorithmic approach as the Python reference, ensuring that the method's behavior remains consistent. The C code should be written to integrate seamlessly with Python's C API and should follow Python's coding standards and best practices. The C implementation is expected to handle the following:

Compatibility

This change is backward-compatible as it adds a new method without modifying existing behavior. Existing code that does not use this method will remain unaffected.

Discussion

The choice of the KMP algorithm and the request for a C implementation are open to discussion within the Python community. Feedback is welcome regarding the algorithm's suitability, potential optimizations, and the overall design of the replace method.

Decision

The final decision on the algorithm and implementation language will be made based on community feedback and core developer review.

Copyright

This document has been placed in the public domain.

AA-Turner commented 1 day ago

Thank you for opening this issue. Proposing a PEP should be done in a PR with the text, so I'll close this for now. PEPs also require a core developer or PEP editor who is willing to sponsor it. I would advise you to read PEPs 1 and 12, and then start a thread on the https://discuss.python.org/ forum.

A