encode / starlette

The little ASGI framework that shines. 🌟
https://www.starlette.io/
BSD 3-Clause "New" or "Revised" License
10.2k stars 921 forks source link

Allow regex patterns for route path #1101

Closed ar45 closed 2 years ago

ar45 commented 3 years ago

Checklist

Is your feature related to a problem? Please describe.

I want to be able to define generic routes which match more specific criteria. Creating a Converter for each instance just to be able to use a different regex seems a bit too much bloat.

One example, consider I would like my route to match optional suffix /post/{id}/attachments/{attachment_id}.suffix? so /post/1/attachments/22.mp4 and /post/1/attachments/22 both match but /post/1/attachments/bogus should not

Another use case is to be able to control case sensitivity on some urls. There is infinite power that comes with being able to provide regex patterns.

Describe alternatives you considered

Alternatively, creating a custom converter for each pattern, but still does not provide control over regex flags. Creating Custom converters for each pattern

class AttachmentConverter(StringConverter):
    regex = '\d+(?:\.mp4|\.mp3)?'

Describe the solution you would like.

What I ended up doing is patching compile_path to accept a RegexPath.

@router.get(RegexPath(rf'/(?P<greet>hello|hi)/(?P<count>{IntegerConvertor.regex})/(?P<id_>{UUIDConvertor.regex})'))
async def greet(count: int, greet: str, limit: int, request: Request):
   ...

# limit is an integer but must be greater than 100.
@router.get(RegexPath(r'/(?P<greet>hello|hi)/(?P<count>\d+)/(?P<limit>1[0-9][0-9]+)'))
async def greet(count: int, greet: str, limit: int, request: Request):

# Optionally supply Converters otherwise it defaults to `StringConverter`
@router.get(RegexPath(r'/(?P<greet>hello|hi)/(?P<count>\d+)/(?P<limit>1[0-9][0-9]+)', path_converters={'limit': IntegerConverter()}))

# Passing Compiled regex with flags
regex = re.compile(r'/(?P<greet>hello|hi)/(?P<count>\d+)/(?P<limit>1[0-9][0-9]+)', re.IGNORECASE)
@router.get(RegexPath(regex, path_converters={'limit': IntegerConverter()}))

Additional context

ar45 commented 3 years ago

Bellow is an implementation that router.url_for_path(...) kinda works, the problem is that url_for_path does not check the parameters match the regex. which effects the current parameter based paths as well.

import re
import typing
from typing import Union, Dict

from starlette.convertors import StringConvertor, Convertor
from starlette.routing import compile_path as starlette_compile_path

class RegexPath(str):
    PARAM_REGEX = re.compile(r"\(\?P<([a-zA-Z_][a-zA-Z0-9_]*?)>[^)]+\)")

    def __new__(cls, value, *args, **kwargs):
        return super().__new__(cls, value)

    def __init__(self, path: Union[str, re.Pattern], flags=0, path_converters: Dict[str, Convertor] = None):
        super().__init__()
        if isinstance(path, re.Pattern):
            flags = path.flags | flags
            path = path.pattern

        self.flags = flags
        self.pattern = self.strip_regex_start_end(path)
        self.path_regex = re.compile('^{}$'.format(self.pattern), flags)
        self.path_converters = path_converters or {}
        self.path_format = self.get_path_format(self.pattern)

    def __add__(self, other):
        pattern, flags, path_converters = self._cast(other)
        return RegexPath(self.pattern + pattern, flags, path_converters=self.path_converters)

    def __radd__(self, other):
        pattern, flags, path_converters = self._cast(other)
        return RegexPath(pattern + self.pattern, flags, path_converters=self.path_converters)

    def _cast(self, other):
        path_converters = self.path_converters.copy()
        flags = self.flags
        if isinstance(other, RegexPath):
            path_converters.update(other.path_converters)
            pattern = other.pattern
            flags |= flags
        else:
            pattern = other
        return pattern, flags, path_converters

    def get_path_format(
            self, pattern: str
    ) -> str:
        path = self.unescape_re(pattern)

        path_format = ""

        idx = 0

        for match in self.PARAM_REGEX.finditer(path):
            param_name = match.group(1)
            path_format += path[idx: match.start()]
            path_format += "{%s}" % param_name
            idx = match.end()
            if param_name not in self.path_converters:
                self.path_converters[param_name] = StringConvertor()

        path_format += path[idx:]
        return path_format

    @staticmethod
    def strip_regex_start_end(pattern: str):
        if pattern.startswith('^'):
            pattern = pattern[1:]
        if pattern.endswith('$'):
            pattern = pattern[:-1]
        return pattern

    @staticmethod
    def unescape_re(string):
        return re.sub(r'(\\\\|\\)', lambda x: '\\' if x.group(0) == '\\\\' else '', string)

    def __str__(self):
        return self.path_format

    def __repr__(self):
        return repr(self.path_regex)

def compile_path(
        path: Union[str, RegexPath],
) -> typing.Tuple[typing.Pattern, str, typing.Dict[str, Convertor]]:
    """
    Given a path string, like: "/{username:str}", return a three-tuple
    of (regex, format, {param_name:convertor}).

    regex:      "/(?P<username>[^/]+)"
    format:     "/{username}"
    convertors: {"username": StringConvertor()}
    """
    if isinstance(path, RegexPath):
        return path.path_regex, path.path_format, path.path_converters
    else:
        return starlette_compile_path(path)

def monkey_patch_starlette_path_compiler():
    """This must be executed before any modules import compile_path from starlette"""
    import starlette.routing
    import fastapi.routing
    starlette.routing.compile_path = compile_path
    fastapi.routing.compile_path = compile_path
tomchristie commented 2 years ago

I'm going to close off this and https://github.com/encode/starlette/pull/581 for now.

I do think that at some point we probably want to expose the request.route and make associated info, such as the regex available. Eg. request.route.regex.

However at the moment we're already putting a bunch of stuff that's outside the ASGI spec into the scope. I don't want to add any more until we've properly audited that, and also don't want any new functionality until we've got the existing tickets all well under control.

It's also very possible that there's actually more design questions lurking under the surface here once you start properly unpicking this issue (eg. what public API does/should Route expose? What about mounting?)

For now let's just file this under "we don't provide this".

henri9813 commented 1 year ago

Hello,

I understand you don't provide this, however, this is a "needed features" we have such as for "path validation".

Currently,

We are building an API which permit to:

Without Regex, we can only have: /students/{XXXX} and do the "switch" in the controller logic,

However, it would interest us to have: /students/{[0-9]{6}:id} and another route with "regex path": /students/{.*@.*:mail}.

Does your opinion has changed since one year ?

Best regards,

cerealkill commented 1 month ago

Hello,

Same here, current patch-matching logic does not support my api needs. It would be a really cool feature to expose that to the fastapi route developer.

Thank you in advance.