python / cpython

The Python programming language
https://www.python.org

PyUnicode_FSDecoder() accepts arbitrary iterable #70941

Closed serhiy-storchaka closed 8 years ago

serhiy-storchaka commented 8 years ago
BPO 26754
Nosy @pitrou, @vstinner, @pjenvey, @vadmium, @serhiy-storchaka
Dependencies
  • bpo-26800: Don't accept bytearray as filenames part 2
Files
  • PyUnicode_FSDecoder-no-list.patch
  • PyUnicode_FSDecoder-deprecate-buffer.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    ```python
    assignee = 'https://github.com/serhiy-storchaka'
    closed_at =
    created_at =
    labels = ['interpreter-core', 'type-bug']
    title = 'PyUnicode_FSDecoder() accepts arbitrary iterable'
    updated_at =
    user = 'https://github.com/serhiy-storchaka'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'serhiy.storchaka'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date =
    closer = 'serhiy.storchaka'
    components = ['Interpreter Core']
    creation =
    creator = 'serhiy.storchaka'
    dependencies = ['26800']
    files = ['43340', '43451']
    hgrepos = []
    issue_num = 26754
    keywords = ['patch']
    message_count = 9.0
    messages = ['263378', '263388', '263394', '263396', '263695', '268213', '268795', '268798', '272107']
    nosy_count = 6.0
    nosy_names = ['pitrou', 'vstinner', 'pjenvey', 'python-dev', 'martin.panter', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue26754'
    versions = ['Python 3.6']
    ```

    serhiy-storchaka commented 8 years ago

    PyUnicode_FSDecoder() accepts not only str, bytes, and bytes-like objects, but also arbitrary iterables, e.g. a list.

    Example:

    >>> compile('', [116, 101, 115, 116], 'exec')
    <code object <module> at 0xb6fb1340, file "test", line 1>

    I think accepting arbitrary iterables is unintentional and weird behavior.

    pitrou commented 8 years ago

    I agree this doesn't make sense.

    vadmium commented 8 years ago

    I agree it is a bit strange. It looks like it is a victim of PyBytes_FromObject() doing more than it says; its documentation only mentions the buffer protocol, not accepting iterables.
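The same conversion can be reproduced from Python, because the `bytes()` constructor also accepts an iterable of integers. This is a minimal sketch, assuming that the Python-level constructor treats a list the same way PyBytes_FromObject() does, and using `os.fsdecode()` as a stand-in for the decoding step:

```python
import os

# The list from the compile() example, treated as an iterable of byte values.
name_as_list = [116, 101, 115, 116]

# bytes() accepts any iterable of integers in range(256).
name_as_bytes = bytes(name_as_list)

# os.fsdecode() decodes bytes using the filesystem encoding.
decoded = os.fsdecode(name_as_bytes)
print(decoded)
```

This would explain why the list ends up as the file name "test" in the resulting code object.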

    serhiy-storchaka commented 8 years ago

    PyUnicode_FSDecoder() is used in the following functions in the stdlib:

    compile()
    symtable.symtable()
    parser.compile()
    parser.compilest()
    zipimporter.zipimporter()
    _imp.load_dynamic() (before 3.5)

    This has been the behavior of PyUnicode_FSDecoder() from the start (bpo-9542). All of the above functions accepted only str in 3.1, so accepting bytes objects and others was a new feature.

    No tests fail if PyUnicode_FSDecoder() rejects non-str and non-bytes arguments. But no tests fail even if support for bytes arguments is disabled (there is a lack of tests for bytes paths).

    What should we do?

    1. Add a warning when the argument is neither str nor an object supporting the buffer protocol.

    2. Drop support for arguments that are neither str nor objects supporting the buffer protocol, without a warning.

    3. Drop support for arguments that are neither str nor objects supporting the buffer protocol, without a warning, and add a warning when the argument is neither str nor bytes.

    4. Drop support for non-str and non-bytes arguments without a warning.
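As an illustration of the strictest choice, option 4 could look roughly like this at the Python level. This is only a sketch of the idea, not the C implementation: the name `fs_decode_strict` and the error message are hypothetical, and `os.fsdecode()` stands in for the actual decoding.

```python
import os


def fs_decode_strict(arg):
    """Hypothetical helper: accept only str or bytes, as in option 4."""
    if isinstance(arg, str):
        return arg
    if isinstance(arg, bytes):
        # Decode with the filesystem encoding.
        return os.fsdecode(arg)
    # Lists, bytearray, memoryview and everything else are rejected.
    raise TypeError(
        "path should be str or bytes, not %s" % type(arg).__name__
    )
```

With this helper, `fs_decode_strict(b'test')` returns `'test'`, while passing `[116, 101, 115, 116]` or `bytearray(b'test')` raises TypeError.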

    pjenvey commented 8 years ago

    See bpo-26800 for reasoning to go with #4

    serhiy-storchaka commented 8 years ago

    The proposed patch makes PyUnicode_FSDecoder() reject arbitrary iterables.

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 8 years ago

    New changeset 2e48c2c4c733 by Serhiy Storchaka in branch '3.5': Issue bpo-26754: PyUnicode_FSDecoder() accepted a filename argument encoded as https://hg.python.org/cpython/rev/2e48c2c4c733

    New changeset e18ac7370113 by Serhiy Storchaka in branch 'default': Issue bpo-26754: PyUnicode_FSDecoder() accepted a filename argument encoded as https://hg.python.org/cpython/rev/e18ac7370113

    serhiy-storchaka commented 8 years ago

    The following patch deprecates support for bytes-like objects (except bytes itself) in PyUnicode_FSDecoder() for consistency with bpo-26800.
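A rough Python-level sketch of what such a deprecation could look like is given here. It is an assumption-laden illustration, not the patch itself: the name `fs_decode_deprecating` and the message text are hypothetical, and `memoryview()` is used only as a convenient way to test for buffer-protocol support.

```python
import os
import warnings


def fs_decode_deprecating(arg):
    """Hypothetical helper: bytes-like objects other than bytes still work, but warn."""
    if isinstance(arg, str):
        return arg
    if isinstance(arg, bytes):
        return os.fsdecode(arg)
    try:
        # memoryview() succeeds only for objects supporting the buffer protocol.
        view = memoryview(arg)
    except TypeError:
        raise TypeError(
            "path should be str or bytes, not %s" % type(arg).__name__
        ) from None
    warnings.warn(
        "path should be str or bytes, not %s" % type(arg).__name__,
        DeprecationWarning,
        stacklevel=2,
    )
    with view:
        # Copy the buffer contents to bytes before decoding.
        return os.fsdecode(view.tobytes())
```

In this sketch a bytearray is still decoded but triggers a DeprecationWarning, while a list is rejected outright with TypeError.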

    1762cc99-3127-4a62-9baf-30c3d0f51ef7 commented 8 years ago

    New changeset 818f22f9ab02 by Serhiy Storchaka in branch 'default': Issue bpo-26754: Undocumented support of general bytes-like objects https://hg.python.org/cpython/rev/818f22f9ab02