jdesrosiers / silex-cors-provider

A silex service provider that adds CORS services to silex
MIT License
78 stars 25 forks source link

domainToRegex does not allow protocol on wildcards #37

Closed brettt89 closed 6 years ago

brettt89 commented 6 years ago
    private function domainToRegex($domain)
    {
        return "/^" . preg_replace("/^\\\\\*/", "[^.]+", preg_quote($domain, "/")) . "$/";
    }

This regex only allows for wildcards without protocol (e.g. *.website.com).

The ^ regex should be remove to allow users to provide protocol for wildcard domains.

E.g. return "/^" . preg_replace("/\\\\\*/", "[^.]+", preg_quote($domain, "/")) . "$/";

jdesrosiers commented 6 years ago

Thanks for bringing this up! I didn't have a wildcard domain to try this out on when I implemented this feature, so I'm not surprised I missed something.

You're right that the protocol needs to be taken into account, but simply removing ^ would be too permissive. A wildcard domain should only match one level of subdomains. For example, given *.example.com, foo.example.com should match, but bar.foo.example.com should not.

I think the way this should work is to require that the protocol be specified before the wildcard domain.

These should be valid

These should not be valid

The preg_replace regex will need a lookahead and a lookbehind so that it only replaces the * if it is preceded by :// and followed by ..

jdesrosiers commented 6 years ago

After looking into this a bit more, it seems that the code does work as is. So, I'm not sure what problem you are encountering.

I do still like the idea of specifying the protocol though. I think it would have to be optional in order to not break backward compatibility.

brettt89 commented 6 years ago

In your preg_replace for wildcards currently uses the regex "/^\\\\\*/". The first character ^ matches the start of a string, so it expects the domain to start with a * rather than http:// or https://.

E.g.

These are being parsed as expected http://test.example.com https://test.example.com *.example.com

These are not being parsed as expected http://*.example.com https://*.example.com

This is because you are calling preg_quote on the domain which would convert https://*.example.com into https\://\*\.example\.com.

The unescaped regex being used is '^\\\*' which literally is looking for a string starting with \*. This would work if you were stripping the protocol off the domain to begin with, however I think its a good idea to have protocol based wildcards as most CORS are restricted to https only.

jdesrosiers commented 6 years ago

OK, I think we are getting on the same page now. When I implemented this feature, I intended it to be a wildcard domain only. It was not intended to allow you to specify the protocol. However, I'm up for adding that functionality. I'm not sure why you say "most CORS are restricted to https only", but it certainly is an important use case.

I have two requirements for adding this functionality.

  1. It should not break backwards compatibility. *.example.com should still work and match on any protocol.
  2. Other than having a protocol in front of it, the domain part should be subject to the same constraints as wildcard domains. See the "These should not be valid" list above for a list of domains that should not work, but would if all we do is remove ^.
jdesrosiers commented 6 years ago

I added this feature. Thanks for bringing up the issue!