python / cpython

The Python programming language
https://www.python.org

urlparse: add userinfo attribute #61113

Open 569263e0-2f5f-42c2-87a7-ad019dd43f68 opened 11 years ago

569263e0-2f5f-42c2-87a7-ad019dd43f68 commented 11 years ago
BPO 16909
Nosy @orsenthil
Files
  • urlparse_userinfo.diff: Patch adding a userinfo attribute
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['type-feature']
    title = 'urlparse: add userinfo attribute'
    updated_at =
    user = 'https://bugs.python.org/olof'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'olof'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['None']
    creation =
    creator = 'olof'
    dependencies = []
    files = ['28649']
    hgrepos = []
    issue_num = 16909
    keywords = ['patch']
    message_count = 3.0
    messages = ['179447', '179476', '179533']
    nosy_count = 2.0
    nosy_names = ['orsenthil', 'olof']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue16909'
    versions = ['Python 3.4']
    ```

    569263e0-2f5f-42c2-87a7-ad019dd43f68 commented 11 years ago

    Hi,

    The urlparse library's "netloc" attribute is today further split into the following attributes: username, password, hostname, port. The attributes preceding the @ (username, password) are referred to in RFC 3986 [1] as "userinfo", the format of which is scheme dependent. E.g. the (expired) internet draft for SSH/SFTP URLs [2] has connection parameters within the userinfo (user;cparams@host).

    In some cases, the deprecated "username:password" syntax is required to be supported even with, e.g. "connection parameters". For this reason, I propose a new attribute, "userinfo", that exposes the "raw" userinfo string, without any splitting on : etc. I've had a go at a patch, with updated unit tests and documentation. Any feedback is welcome!

    Regards,

    1: http://tools.ietf.org/rfc/rfc3986.txt
    2: http://tools.ietf.org/id/draft-ietf-secsh-scp-sftp-ssh-uri-04.txt
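
To make the proposal above concrete, here is a minimal sketch of how the raw userinfo could be recovered from netloc. It is written against Python 3's urllib.parse rather than the 2.7 urlparse module the patch targets, and it is not the attached patch: the helper name raw_userinfo and the example URL are assumptions made up for illustration.

```python
from urllib.parse import urlsplit


def raw_userinfo(url):
    """Hypothetical helper: return everything in netloc before the last '@'.

    Returns None when the URL has no userinfo component at all.
    """
    netloc = urlsplit(url).netloc
    userinfo, have_info, _hostinfo = netloc.rpartition("@")
    return userinfo if have_info else None


# Illustrative URL with connection parameters and a password in the userinfo.
url = "sftp://user;fingerprint=abc:secret@example.com:22/path"
parts = urlsplit(url)

# Existing attributes: the userinfo is already split on ':'.
print(parts.username, parts.password, parts.hostname, parts.port)

# Proposed view: the unsplit string, left for scheme-specific code to interpret.
print(raw_userinfo(url))
```

With a helper like this, code that understands a particular scheme can apply its own rules (for example, splitting on ";" for connection parameters) instead of relying on the generic username/password split.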

    orsenthil commented 11 years ago

    If it does go in, due to the RFC requirement, then it would be only in 3.4 (default branch) and the feature may not be backported. Without reading the RFC section, I have an intuitive -1 for this proposal because the suggestion may be a corner case rather than a default scenario. It is a bad idea to change parsing logic for corner cases.

    569263e0-2f5f-42c2-87a7-ad019dd43f68 commented 11 years ago

    Thank you for your feedback. I agree, the reason I wanted this was because of a corner case, but otoh, the username:password syntax is the real corner case imho. Of course, I understand that this must be supported for backwards compatibility.

    (For fully RFC compliant URLs however, userinfo would be the same as user since : in userinfo isn't allowed, so again, you have a very valid point for your corner case argument.)

    The patch was developed against 2.7, so it won't apply on 3.4, but looking at 3.4, urlparse already has a _userinfo property method; however, it splits the userinfo into a tuple of (username, password). It would be easy to adapt the change to 3.4, but I'll wait until I get additional feedback.
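
As a rough illustration of the 3.x behaviour described above, the sketch below uses only the public username and password attributes of urllib.parse results and shows one way the unsplit string could be rebuilt from them. The function name rebuild_userinfo and the example URL are hypothetical; this is an assumption about how an adaptation might look, not code from the patch.

```python
from urllib.parse import urlsplit


def rebuild_userinfo(parts):
    """Hypothetical helper: rejoin username and password into one string.

    Returns None when the parsed URL carries no userinfo.
    """
    if parts.username is None:
        return None
    if parts.password is None:
        return parts.username
    return parts.username + ":" + parts.password


parts = urlsplit("ssh://user;x=1:pw:more@host.example/")

# The split happens at the first ':' only, so later colons stay in the password.
print(parts.username)
print(parts.password)

# Joining the two halves with ':' gives back the original userinfo string.
print(rebuild_userinfo(parts))
```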