Proposal to protect the /authorize endpoint for the Authorization Code Flow (Auth Code Flow) - RFC9101

mhfoo commented 4 months ago

Problem description: The /authorize endpoint is not protected, and the following undesired scenarios are security concerns.

1. A bad actor can perform a DDoS attack on the Operator /authorize endpoint using multiple Operator SIM cards; this traffic takes an internal mobile network route and does not traverse the Internet.
2. Aggregators are using mobile apps to query different operators' /authorize endpoints to discover (at no cost) which operator the SIM card belongs to.

Possible evolution: Proposal to introduce a one-time-use token to prevent abuse of this unprotected endpoint on the AuthServer. The App Backend obtains this one-time-use token from the AuthServer and passes it to the Device App, which then presents it to the /authorize endpoint.

The one-time-use token will be validated at the Auth Service, thereby protecting the downstream network nodes.
The one-time use token will be associated with an Application Backend for charging purposes.
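
A minimal sketch of the one-time-use token idea, under stated assumptions: the class name `OneTimeTokenStore`, the 60-second lifetime, and the in-memory dictionary are all hypothetical and only illustrate "issue to an Application Backend, redeem exactly once". A real Auth Service would need shared, persistent storage and backend authentication.

```python
import secrets
import time


class OneTimeTokenStore:
    """In-memory store of single-use tokens (hypothetical, illustration only)."""

    def __init__(self, ttl_seconds=60):
        self.ttl_seconds = ttl_seconds
        # token -> (client_id of the Application Backend, expiry timestamp)
        self._tokens = {}

    def issue(self, client_id, now=None):
        """Issue a token to an (already authenticated) Application Backend."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (client_id, now + self.ttl_seconds)
        return token

    def redeem(self, token, now=None):
        """Return the owning client_id once; None if unknown, used, or expired."""
        now = time.time() if now is None else now
        entry = self._tokens.pop(token, None)  # pop() makes the token single-use
        if entry is None:
            return None
        client_id, expires_at = entry
        return client_id if now <= expires_at else None
```

Because `redeem` returns the owning `client_id`, the same lookup could feed the charging association mentioned above.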

Alternative solution: N/A

Additional context: NumberVerification subproject issue 71

mhfoo commented 4 months ago

Please refer to the following sequence diagrams. The interactions in blue are the proposed method.

[Diagram: Standalone Open Gateway Proposal]

[Diagram: Federated Open Gateway Simple Federated Proposal]

garciasolero commented 4 months ago

In our view, it is unnecessary to introduce additional flows beyond those that already exist. Instead, we recommend using the mechanisms provided by the OAuth standard itself. Our proposal to mitigate this issue is to force the application to pass the request object parameter by value in the authorize request of the authorization code flow, as described in RFC9101.

By using a signed JWT, we prevent requests that are not generated by the application. Additionally, we would need to require that the JWT contains the jti (to prevent the JWT from being replayed) and exp parameters.

For applications, the integration cost would not be high, because they would already be generating signed JWTs to authenticate server-to-server requests using private_key_jwt.
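
As an illustration of what such a request object might look like, here is a stdlib-only sketch. It is not a definitive implementation: to stay self-contained it signs with HS256 and a shared secret, whereas a private_key_jwt-style deployment would use an asymmetric algorithm (for example RS256 or ES256) through a JWT library. The endpoint URL, client identifier, and helper names are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid
from urllib.parse import urlencode


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_request_object(client_id, audience, redirect_uri, scope, key,
                         lifetime=60, now=None):
    """Build a signed JWT carrying the authorization request parameters."""
    now = int(time.time()) if now is None else int(now)
    # HS256 is a stand-in here; real clients would sign with a private key.
    header = {"alg": "HS256", "typ": "oauth-authz-req+jwt"}
    claims = {
        "iss": client_id,
        "aud": audience,
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": scope,
        "iat": now,
        "exp": now + lifetime,      # short-lived request
        "jti": str(uuid.uuid4()),   # unique ID so the server can reject replays
    }
    signing_input = (b64url(json.dumps(header, separators=(",", ":")).encode())
                     + "." +
                     b64url(json.dumps(claims, separators=(",", ":")).encode()))
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def build_authorize_url(authorize_endpoint, client_id, request_object):
    """Pass the JWT by value in the `request` parameter, alongside client_id."""
    query = urlencode({"client_id": client_id, "request": request_object})
    return authorize_endpoint + "?" + query
```

A call such as `build_authorize_url("https://auth.operator.example/authorize", "app-1", token)` (hypothetical host) yields the URL the device would open.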

AxelNennker commented 4 months ago

I commented on the original issue https://github.com/camaraproject/NumberVerification/issues/71#issuecomment-1964352387

mhfoo commented 4 months ago

@garciasolero Thanks for the proposal to require the application backend to pass the request object parameter by value in the authorize request of the authorization code flow, as described in RFC9101.

The request object will be a signed JWT containing the auth code flow request payload.

This will mitigate point 1 of the problem statement.

This will require an Internet connection to fetch the JWKS to verify the signature. Typically, the network authentication nodes operate only in the mobile core.

Please refer to the following diagram

mhfoo commented 4 months ago

@garciasolero Regarding the point above: relying on the jti value alone, when the signed JWT comes from a third party, is not sufficient. It should be a combination of at least jti + iss + exp. The maximum length of jti is not defined, and it could be a string or an integer, not limited to alphanumeric characters.
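
A sketch of the jti + iss + exp combination, assuming the JWT signature has already been verified and the claims are available as a dictionary. The class name `ReplayGuard` and the 300-second cap on lifetime are hypothetical. The cache is keyed on the (iss, jti) pair, so two issuers may reuse the same jti, and entries only need to be kept until exp.

```python
import time


class ReplayGuard:
    """Rejects a request object whose (iss, jti) pair was already seen."""

    def __init__(self, max_lifetime=300):
        self.max_lifetime = max_lifetime  # assumed upper bound on exp, in seconds
        self._seen = {}                   # (iss, jti) -> exp

    def accept(self, claims, now=None):
        now = time.time() if now is None else now
        iss, jti, exp = claims.get("iss"), claims.get("jti"), claims.get("exp")
        # All three claims must be present; exp must be numeric.
        if not iss or jti is None:
            return False
        if isinstance(exp, bool) or not isinstance(exp, (int, float)):
            return False
        # Reject expired requests and requests valid for too long.
        if exp <= now or exp > now + self.max_lifetime:
            return False
        # Forget entries whose exp has passed; they can no longer be replayed.
        self._seen = {k: e for k, e in self._seen.items() if e > now}
        key = (iss, str(jti))  # normalise jti to a string for the cache key
        if key in self._seen:
            return False
        self._seen[key] = exp
        return True
```

Bounding exp keeps the cache small: the server never has to remember a jti for longer than `max_lifetime` seconds.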

Point 2 of the problem statements is related to a "trusted" aggregator using error codes to decode behaviours of the number verification flow.

Examples:

If this endpoint responds successfully with a 302 redirect, the SIM card belongs to this operator.
If this endpoint responds with an error, the SIM card may not belong to this operator.

In terms of security, it would be better to depend on an internally generated token, as the steps to verify it will be shorter.

garciasolero commented 4 months ago

@mhfoo,

Yes, you are right. In my comment, I focused on a few claims. Obviously, it should also include the most typical ones: iss, aud, or iat.

Regarding point 2, in the aggregation model for number verification, the entity responsible for routing the request to the correct operator is the aggregator based on the IP address of the request using the TelcoRouting service. Therefore, it should not be the case that the SIM card does not belong to the operator, and if it were to happen, it would be an exceptional case related to synchronisation issues in the TelcoFinder translation tables.

ab-ip commented 4 months ago

@garciasolero

Is there any specification of the TelcoRouting service? Who will be responsible for maintaining it? Will every Telco expose an API for exporting Public IP Pool records, or what is the idea?

Our current practice is that every Telco shares its IP Pools with us, so we can do the routing locally on our side, easily and quickly, much like using a GeoIP database with Telco-level resolution. That way, there is no need for any additional external call, which would slow down the flow.
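
The local routing described above can be sketched as a lookup of the source IP address against per-Telco CIDR pools. This is only an illustration: the operator names and pools are hypothetical and use documentation address ranges, and a production system would likely use a radix tree rather than a linear scan.

```python
import ipaddress

# Hypothetical pools (documentation ranges); real pools would come from each Telco.
IP_POOLS = {
    "Operator A": ["192.0.2.0/24"],
    "Operator B": ["198.51.100.0/25", "203.0.113.0/24"],
}


def build_index(pools):
    """Flatten {operator: [cidr, ...]} into a list of (network, operator) pairs."""
    return [
        (ipaddress.ip_network(cidr), operator)
        for operator, cidrs in pools.items()
        for cidr in cidrs
    ]


def resolve_operator(ip, index):
    """Return the operator whose pool contains `ip`, or None if no pool matches."""
    address = ipaddress.ip_address(ip)
    matches = [
        (network, operator)
        for network, operator in index
        if address.version == network.version and address in network
    ]
    if not matches:
        return None
    # If pools overlap, prefer the most specific (longest) prefix.
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

With `index = build_index(IP_POOLS)`, a call like `resolve_operator("192.0.2.10", index)` resolves locally, with no external request in the flow.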

mhfoo commented 4 months ago

@ab-ip This question should be handled in GSMA Open Gateway

mhfoo commented 4 months ago

@garciasolero

Point 2 in this problem description is not about routing. Please refer to the following diagram for illustration of this problem statement.

[Diagram: Number Verify Issue]

In the above diagram, assume the Operators are standalone (not federated).

The aggregator:

has all the /authorize endpoints for Operators A, B, and C,
has an SDK deployed in the mobile application which can query a list of provided /authorize endpoints.

From this setup, the aggregator can determine the operator of the SIM card which is providing the mobile data connection, when the correct /authorize endpoint returns a 302 Location with an auth code in the query string. The aggregator/mobile application does not need to continue with the auth code flow for number verify to determine the mobile number. The aggregator just needs to know the operator of the SIM card which the end customer is using.

The above usage of the /authorize endpoint should be prevented.

ab-ip commented 4 months ago

@mhfoo

The trial-and-error method you mentioned is too expensive for Clients/Aggregators to use for resolving the Telco. It takes too much time and increases flow duration, which is something nobody wants.

There are many other (easier and faster) ways right now to do this:

Clients can use IP-to-ASN resolution to find the Telco the user belongs to.
With an SDK, it is also possible to read the MCCMNC of the SIM.

That is why a TelcoRouting service based on IP address is mandatory. Telco Public IP Pools should be publicly available and shared with Clients and Aggregators. The best approach would be to have a separate API (under CAMARA) for export/synchronization of IP Pools, so that Clients and Aggregators can synchronize IP Pools with all Telcos in a uniform way.

If you want to prevent Clients/Aggregators from misusing the /auth endpoint, you can simply apply a business model that defines a ratio between /auth endpoint results (Auth Codes) and calls to other CAMARA API endpoints. E.g., if Auth Code results exceed API calls by more than 20%, the Client needs to pay a fee for that gap.
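
One possible reading of this ratio rule, as a small worked sketch. The interpretation is an assumption: the gap is the number of Auth Codes not followed by an API call, and it becomes billable in full once it exceeds 20% of the API calls. The function name and the exact threshold semantics are hypothetical.

```python
def billable_gap(auth_codes_issued, api_calls, tolerance_percent=20):
    """Return the number of unused Auth Codes to charge for, or 0 within tolerance."""
    gap = auth_codes_issued - api_calls
    # Integer arithmetic avoids floating-point rounding at the threshold.
    if gap * 100 > api_calls * tolerance_percent:
        return gap
    return 0
```

Under this reading, 1300 Auth Codes against 1000 API calls gives a gap of 300, which is above the 200 tolerated, so all 300 are billable; 1200 against 1000 sits exactly at the threshold and is not charged.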

garciasolero commented 4 months ago

I would like you to explain the request #9 in the sequence diagram of your solution. It appears that an Authorization header needs to be sent in the OAuth code flow, but that is not specified in the standard. I would also like to know how the sequence diagram would look if the SIM does not belong to the operator.

In the current solution, using an endpoint restricted by signed requests according to RFC9101, if the aggregator were to use authorization as a method to determine whether a SIM belongs to an operator (at no cost in order to reduce requests to TelcoFinder), the operator could detect abuse and deactivate their credentials.

mhfoo commented 3 months ago

@ab-ip

> That is why a TelcoRouting service based on IP address is mandatory. Telco Public IP Pools should be publicly available and shared with Clients and Aggregators. The best approach would be to have a separate API (under CAMARA) for export/synchronization of IP Pools, so that Clients and Aggregators can synchronize IP Pools with all Telcos in a uniform way.

See https://github.com/GSMA-Open-Gateway/Open-Gateway-Documents/blob/main/code/API_definitions/routing-api.yaml in the Open Gateway repo. This API provides both the static and dynamic routes, shared between the Telco Finders of operators and aggregators. Chapter 5 in the Open Gateway repo has more details on the Telco Finder APIs.

mhfoo commented 3 months ago

@garciasolero

> I would like you to explain the request #9 in the sequence diagram of your solution. It appears that an Authorization header needs to be sent in the OAuth code flow, but that is not specified in the standard.

Agreed to drop this method as shown in #9. The signed request according to RFC9101 will be a better solution.

> I would also like to know how the sequence diagram would look if the SIM does not belong to the operator.

Are you referring to any of the following scenarios: a) federated scenario, route to partner operator or b) when the telco finder is unable to resolve the source Internet facing IP address?

garciasolero commented 3 months ago

> Are you referring to any of the following scenarios: a) federated scenario, route to partner operator or b) when the telco finder is unable to resolve the source Internet facing IP address?

I am referring to what the flow would be if an operator received a request from a SIM that does not belong to it, regardless of whether there is an aggregator in front of it or not.

mhfoo commented 3 months ago

@garciasolero

> I am referring to what the flow would be if an operator received a request from a SIM that does not belong to it, regardless of whether there is an aggregator in front of it or not.

[Diagram: Federated Architecture - Network Auth Service]

For the standalone Open Gateway, there is currently no route from the Internet to the /authorize endpoint. Only the operator's SIM cards can reach the internal /authorize endpoint (blue dashed path).

In the federated model, the Internet-facing /authorize endpoint should only accept requests from partner / federated operators' CGNAT Internet IP addresses. A) We may assess this security posture of whitelisting partner CGNAT IP addresses and blocking the rest of the traffic (pink dashed path).

During the client onboarding process, the client will provide a redirect_uri. B) If the source IP address observed by the /authorize endpoint (from a CGNAT) cannot be resolved in the operator's Telco Finder, the response will be a failure redirect to the redirect_uri (with a newly defined error code), following the OIDC Authorization Code Flow and OAuth 2.0. This is a graceful failure for the client.
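
A sketch of the failure redirect in case B, assuming the standard OAuth 2.0 pattern of appending `error` (and optionally `error_description` and `state`) to the registered redirect_uri. The error code `operator_not_resolved` is purely hypothetical, since the new code has not been defined in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_error_redirect(redirect_uri, error, state=None, description=None):
    """Append OAuth 2.0 error parameters to the client's registered redirect_uri."""
    parts = urlsplit(redirect_uri)
    # Preserve any query parameters already present on the redirect_uri.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("error", error))
    if description:
        query.append(("error_description", description))
    if state is not None:
        query.append(("state", state))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The authorization server would return this value in the Location header of a 302 response, so the client receives a well-formed error instead of a dropped connection.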

What are your thoughts on this?

mhfoo commented 3 months ago

Provided a diagram for review: NumberVerification issues 93

@garciasolero @jpengar

AxelNennker commented 2 months ago

@ab-ip

> Is there any specification of the TelcoRouting service? Who will be responsible for maintaining it? Will every Telco expose an API for exporting Public IP Pool records, or what is the idea?

> Our current practice is that every Telco shares its IP Pools with us, so we can do the routing locally on our side, easily and quickly, much like using a GeoIP database with Telco-level resolution. That way, there is no need for any additional external call, which would slow down the flow.

Please consider +1'ing this Android feature request, which would allow clients' mobile applications to get the operator's OpenID configuration: https://issuetracker.google.com/issues/308240647

ab-ip commented 2 months ago

@AxelNennker +1 vote from my side. This feature (API) is fine and useful, but unfortunately it won't solve Web use cases (only App), and the coverage issue will remain as well. Knowing the OIDC Auth endpoints does not mean that the user can reach the endpoint and be identified at that specific moment. When we talk about Seamless Auth, the crucial part (for Service Providers/App Devs) is knowing that it is possible to resolve/identify a specific user. To determine this, it is necessary to know that the user is on a cellular (mobile) connection of a specific (supported) carrier. The best and easiest way to do this is definitely IP address resolution. You can check how IPification has done it for the last 4 years with many partners (including 60+ Carriers and 100+ Service Providers).

AxelNennker commented 2 months ago

Thanks for the +1.

I agree that the proposed API does not solve all problems, but if Android provides it, others might follow. For the web case we go for universal links and assetlinks (https://www.telekom.de/.well-known/assetlinks.json and https://www.telekom.de/.well-known/apple-app-site-association), which enable app clips and instant apps, and also app-web credential sharing. To get rid of ip2msisdn we have operatorTokens (best thing since sliced bread) defined in GSMA TS.43: https://www.gsma.com/newsroom/wp-content/uploads//TS.43-v11.0-Service-Entitlement-Configuration.pdf

There is no solution for everything or for all use cases, but we are getting there if MNOs work together and build an ecosystem for all of us.

mhfoo commented 2 months ago

@AxelNennker The following is the proposed text for PR https://github.com/camaraproject/IdentityAndConsentManagement/pull/121 under OIDC Authorization Code Flow section.

The OIDC Authorization Code Flow is defined in OpenID Connect and incorporates RFC 9101, Passing a Request Object by Value.

AxelNennker commented 2 months ago

I suggest moving this issue, or its resolution, to the next version, because I think incorporating this now would open up a longer discussion and I would like to get PR #121 merged soon.

AxelNennker commented 3 weeks ago

Please comment on whether we should mandate implementation of RFC 9101 and require that clients use it.

Elisabeth-Ericsson commented 3 weeks ago

Different solutions have been described as options to address DDoS attacks on the authZ server.

Solution 1 (non-standard): the app server requests a one-time token from the authZ server (protected by API gateway throttling features); the one-time token is used by the device in the auth code request.
Solution 2 (RFC 9101): pass the request object by value in the authorize request of the auth code flow.
a. This requires that a key exchange happens upfront between the potential client device and the CSP, so that the client can use the private_key_jwt for signing.

Regarding solution 2: We think that RFC 9101 does not solve the DDoS attack issue (even though it addresses confidentiality and integrity issues). A malicious client will still be able to send “signed” authcode requests. Ericsson does not support making RFC 9101 mandatory. RFC 9101 should not be made mandatory now.

Regarding solution 1: This is non-standard and thus is not (yet) endorsed by Ericsson.

We recommend traditional API endpoint protection solutions (e.g. a firewall) in front of the AuthZ server to protect it from being overloaded.

In addition, there was a prior discussion in the NumberVerification workgroup about securing the authorization endpoint, stating: "GSMA have just approved a spec "ASAC.01-Seamless Authenticator subsystem enhancement for TS.43 Operator Token" which also deals with the security of the authorize endpoint… It may be a good idea to align with that initiative.", along with the suggestion to align with this document in ICM. So we should not make a different solution mandatory now.

garciasolero commented 3 weeks ago

@Elisabeth-Ericsson

RFC9101 was never a proposal to address DDoS attacks. That specification was proposed to avoid processing requests that do not come from authenticated clients. In other words, it aims to mitigate the processing of fraudulent requests that could lead to increased costs, such as queries to TelcoFinder.

Similarly, one-time tokens do not solve the DDoS problem, as the request still reaches the API Gateway and must be processed to check whether the token has been used before or not.

jpengar commented 3 weeks ago

@Elisabeth-Ericsson

> RFC9101 was never a proposal to address DDoS attacks. That specification was proposed to avoid processing requests that do not come from authenticated clients. In other words, it aims to mitigate the processing of fraudulent requests that could lead to increased costs, such as queries to TelcoFinder.

> Similarly, one-time tokens do not solve the DDoS problem, as the request still reaches the API Gateway and must be processed to check whether the token has been used before or not.

In addition to Fabio's comment, RFC9101 was also proposed as a solution to avoid the potential PII leakage problem raised by Vonage in the past, as mentioned at https://github.com/camaraproject/IdentityAndConsentManagement/issues/138#issuecomment-2020195375.

In any case, as mentioned in the 5 June WG meeting, RFC9101 was proposed by Telefónica as a possible solution to some of the issues raised in #128 and #138 for consideration by the WG. As discussed, if the WG decides to support RFC9101, it would be necessary to mandate its implementation/use for both clients and operators. Otherwise, if clients could choose not to use it, we would not be addressing the concerns raised here.

From Telefónica's point of view, we are fine with not mandating the implementation and support of RFC9101 for the time being. We propose to close this issue and leave RFC9101 support out of the scope of the next Fall24 meta-release, and to revisit it in the future if necessary.

mhfoo commented 1 week ago

@jpengar Please have this discussion for the Spring 25 release instead. I am fine with dropping it from the Fall 24 release.

jpengar commented 1 week ago

As agreed in the June 19 WG meeting call, we exclude RFC 9101 from the scope of the next meta release, also considering the time constraints. It is recommended that the issue be closed for now and reopened in the future if necessary.

camaraproject / IdentityAndConsentManagement

Proposal to protect the /authorize endpoint for the Authorization Code Flow (Auth Code Flow) - RFC9101 #128