python / cpython

Add scgi to urllib.parse.uses_netloc #67824

Open ea7fbdbd-aa70-4c6e-891c-8c8ae851ddcd opened 9 years ago

ea7fbdbd-aa70-4c6e-891c-8c8ae851ddcd commented 9 years ago
BPO 23636
Nosy @orsenthil, @vadmium
Files
  • py3bug: Simple testcase demonstrating problem
  • py2bug: Simple testcase demonstrating problem
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    ```python
    assignee = None
    closed_at = None
    created_at =
    labels = ['type-bug', 'library']
    title = 'Add scgi to urllib.parse.uses_netloc'
    updated_at =
    user = 'https://bugs.python.org/anthonyryan1'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'ned.deily'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation =
    creator = 'anthonyryan1'
    dependencies = []
    files = ['38434', '38435']
    hgrepos = []
    issue_num = 23636
    keywords = []
    message_count = 2.0
    messages = ['237831', '237837']
    nosy_count = 3.0
    nosy_names = ['orsenthil', 'martin.panter', 'anthonyryan1']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23636'
    versions = ['Python 2.7', 'Python 3.4']
    ```

    ea7fbdbd-aa70-4c6e-891c-8c8ae851ddcd commented 9 years ago

    The scgi protocol is not included in the urllib.parse.uses_netloc list, while other less common protocols (such as gopher) are.

    I would like to see scgi get added to this list.
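For illustration, here is a minimal sketch of how the list can be inspected, and how a caller might work around the missing entry today. The helper name `unsplit_with_scheme_registered` and the socket path are made up for this example, and `uses_netloc` is an undocumented module-level list, so treating it as mutable configuration is an assumption rather than a supported API.

```python
import urllib.parse

# gopher is in the registry; scgi is the scheme this issue asks to add.
print("gopher" in urllib.parse.uses_netloc)
print("scgi" in urllib.parse.uses_netloc)


def unsplit_with_scheme_registered(scheme, parts):
    """Hypothetical helper: call urlunsplit() with `scheme` temporarily
    listed in uses_netloc, then restore the list."""
    added = scheme not in urllib.parse.uses_netloc
    if added:
        urllib.parse.uses_netloc.append(scheme)
    try:
        return urllib.parse.urlunsplit(parts)
    finally:
        if added:
            urllib.parse.uses_netloc.remove(scheme)


# (scheme, netloc, path, query, fragment) with an empty netloc.
parts = ("scgi", "", "/tmp/app.sock", "", "")
print(urllib.parse.urlunsplit(parts))
print(unsplit_with_scheme_registered("scgi", parts))
```

The helper restores the list afterwards so that the change does not leak into other code sharing the same interpreter.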

    vadmium commented 9 years ago

    See also bpo-16134 for adding RTMP schemes to the registry. However, I wonder if it is time for a more general fix, rather than having an arms race with whatever URL scheme someone dreams up next.

    According to bpo-7904, urlsplit() etc. intentionally support parsing the //netloc part for arbitrary URL schemes. However, when putting it back together, the urlunsplit() etc. functions currently cannot assume it is okay to insert an empty netloc “//”, probably because it would break URLs like tel:+1234 or mailto:somebody@example.net.
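The asymmetry described above can be observed with a short script. This is only a sketch: the host, port, and socket path are invented for the example, and the exact round-trip result for the empty-netloc case may depend on the Python version and on whether the scheme is listed in `uses_netloc`, so no output is asserted for it here.

```python
from urllib.parse import urlsplit, urlunsplit

# Parsing: the //netloc part is recognised even for an unregistered scheme.
with_host = urlsplit("scgi://localhost:4000/app")
print(with_host.netloc)

# An empty netloc ("scgi:///...") parses too, leaving only the path.
empty_netloc = urlsplit("scgi:///tmp/app.sock")
print(repr(empty_netloc.netloc), empty_netloc.path)

# Unsplitting: compare this with the original URL to see whether the
# empty "//" survived the round trip.
round_tripped = urlunsplit(empty_netloc)
print(round_tripped)

# Schemes without a netloc must not gain a "//" when reassembled.
mail = urlsplit("mailto:somebody@example.net")
print(urlunsplit(mail))
```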

    There are a bunch of URL-parsing issues floating around which may be able to come together to solve each other:

    In bpo-22852, I proposed adding a series of has_netloc/query/fragment flags to the SplitResult etc classes. If that was implemented, we would not have to worry about whitelisting “scgi:” or other schemes in “uses_netloc”. Instead, urlsplit() would automatically set SplitResult(has_netloc=True), and urlunsplit() would know to restore the empty “//” netloc string.
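To make the idea concrete, here is a rough sketch of what tracking a has_netloc flag could look like from outside the library. It is not the patch proposed in bpo-22852: `split_with_flag` and `unsplit_with_flag` are hypothetical helpers, the flag is carried alongside the SplitResult rather than inside it, and edge cases (for example paths that themselves begin with "//") are ignored.

```python
from urllib.parse import urlsplit, urlunsplit


def split_with_flag(url):
    """Hypothetical helper: return (SplitResult, has_netloc).

    has_netloc records whether "//" followed the scheme in the input,
    even when the netloc itself is empty.
    """
    parts = urlsplit(url)
    # Drop "scheme:" (if a scheme was recognised) and look at what follows.
    rest = url[len(parts.scheme) + 1:] if parts.scheme else url
    return parts, rest.startswith("//")


def unsplit_with_flag(parts, has_netloc):
    """Hypothetical helper: rebuild the URL, restoring an empty "//"."""
    url = urlunsplit(parts)
    if has_netloc and not parts.netloc:
        prefix = parts.scheme + ":" if parts.scheme else ""
        rest = url[len(prefix):]
        # Only add "//" if urlunsplit() did not already emit it.
        if not rest.startswith("//"):
            url = prefix + "//" + rest
    return url


for original in ("scgi:///tmp/app.sock",
                 "mailto:somebody@example.net",
                 "tel:+1234"):
    parts, has_netloc = split_with_flag(original)
    print(original, has_netloc, unsplit_with_flag(parts, has_netloc))
```

With the flag recorded at parse time, no per-scheme whitelist is consulted when deciding whether to restore the "//".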