Closed bschorchit closed 1 year ago
Update from @seaona : after trying several times, one time the popup didn't open - no error neither. I could also see SIWE screen too in other trials and another thing I see is that on the URI it's set https://localhost:9011/ but I am using http.
@bschorchit is there an ETA on this issue?
@bschorchit - I opened a pr for a fix
Thanks for that, @yonigoldberg ! We rolled out a hotfix (v10.26.2) yesterday. Is the current behavior on 10.26.2 on localhost still an issue for you?
It is a different issue now. Now it requires to approve localhost as a trusted domain
@digiwand @jpuri @danjm - This is related to your fix in: https://github.com/MetaMask/metamask-extension/pull/18200 Do you mind reviewing https://github.com/MetaMask/metamask-extension/pull/18296?
Thanks @digiwand, I added the test that you mentioned here (I can't comment on the PR due to lack of permissions).
Thanks @yonigoldberg! LGTM! I'm adding this to our internal list of PR reviews. will try to get this in the next release
I didn't realize you can't comment in the PR. To confirm, you could comment on the main thread, but not an individual comment?
Thanks @digiwand - yheaa I can't comment there.
[Sadly I can not comment on my PR] https://github.com/MetaMask/metamask-extension/pull/18296#discussion_r1150684088 @naugtur - To note that per the spec of SIWE, port is optional. How would you approach this? https://eips.ethereum.org/EIPS/eip-4361#abnf-message-format
domain = authority
; From RFC 3986:
; authority = [ userinfo "@" ] host [ ":" port ]
; See RFC 3986 for the fully contextualized
; definition of "authority".
@yonigoldberg I don't have full context so I'm making assumptions from my webapp security background. This might be wrong. That said, I think port being optional might mean it defaults to 80?
I'll take a closer look. Or maybe someone more familiar with the Origin guarantees in SIWE can chime in.
My participation is just the kneejerk reflex when I see protocol/port being omited on something called origin.
Fair point.
@yonigoldberg the PR is unblocked now and is available to comment now. I'm up for merging in @naugtur's suggestion if that sounds good with you too
edit: see below comments ~we can loosen restrictions later if we see a use-case for ports on remote origins. For now, unblocking ports on localhost seems useful and matches the issue case. Should remote origins with mismatched ports present themselves, they will not be blocked and only be presented with a warning.~
curious to learn more about the port use cases here and why this may or may not be a concern outside of localhost.
Summarizing relevant parts of EIP-4361:
domain
is the RFC 3986 authority that is requesting the signing.uri
is an RFC 3986 URI referring to the resource that is the subject of the signing (as in the subject of a claim).
; ABNF
domain = authority
; From RFC 3986:
; authority = [ userinfo "@" ] host [ ":" port ]
; See RFC 3986 for the fully contextualized
; definition of "authority".
uri = URI
; See RFC 3986 for the definition of "URI".
(as noted above, port
is optional)
Creating sessions
- Sessions MUST be bound to the address and not to further resolved resources that can change.
Verifying domain binding
- Wallet implementers MUST prevent phishing attacks by matching on the
domain
term when processing a signing request. For example, when processing the message beginning with"service.invalid wants you to sign in..."
, the wallet checks that the request actually originated fromservice.invalid
.- The domain SHOULD be read from a trusted data source such as the browser window or over WalletConnect (ERC-1328) sessions for comparison against the signing message contents.
Out of Scope
The following concerns are out of scope for this version of the specification to define: [...]
- Authorization to server resources.
- Interpretation of the URIs in the resources term as claims or other resources.
- The specific mechanisms to ensure domain-binding.
Security Considerations: Wallet and relying party combined security
Both the wallet and relying party have to implement this specification for improved security to the end user. Specifically, the wallet MUST confirm that the message is for the correct
domain
or provide the user means to do so manually (such as instructions to visually confirming the correct domain in a TLS-protected website prior to connecting via QR code or deeplink), otherwise the user is subject to phishing attacks.Verification of domain binding
- Wallets MUST check that the
domain
matches the actual signing request source.- This value SHOULD be checked against a trusted data source such as the browser window or over another protocol.
My interpretation of the above:
uri
, domain
, and resources
. So only domain
is relevant for this discussion.port
and userinfo
are omitted from domain
, it seems reasonable that these are "wildcarded" and omitted from the domain binding check. So if domain: localhost
, then any port on localhost
would match that.
port
or userinfo
is specified on domain
in the message, then they must be enforced, regardless of the domain in question
So for #18296, my takeaway would be:
localhost
port
and userinfo
validation iff respective part is present in domain
field of SIWE message. If not, only check the domain parts supplied in the domain
field.This allows the authority to decide "how wide their identity" is. Local dev/testing dapps can specify localhost
and accept everything from the localhost origin, while a trusted self-hosted service accessed via localhost
would require this protection to be kept, and not restricting a specified port here would be unexpected behavior.
Sessions MUST be bound to the address and not to further resolved resources that can change.
tells us that port changes should not invalidate the session.
But other than that, being stricter and having checks for domain validation that are not part of the spec should also be non-violating. So in general it should be OK for MetaMask to add its own restrictions around "domain binding" (which may make special cases for localhost
uri/domain/resourced), as long as those are documented clearly.
ERC-4361, by referencing authority
from RFC-3986 is implicitly dragging the concept of URI scheme
along with it as authority
is defined within the scope of a URI scheme
.
For the purpose of RFC-3986 authority
used in domain
of ERC-4361, how is the URI scheme
defined?
It is the URI scheme
that defines the default port value, so it's necessary for the interpretation of the domain that doesn't contain the :port
.
eg. If we assume http
as the URI scheme, lack of :port
can be interpreted to mean :80
or empty. the RFC-3986 itself is not clear about it and we'd have to reach for the http spec, where (I assume from prior experience) the interpretation ends up being :80
not empty.
ERC-4361 would have to define domain
as RFC-3986 authority
with a URI scheme
that doesn't specify a default port to allow ignoring the port under all circumstances.
I think the lack of a field defining the URI scheme
in ERC-4361 spec may be considered as it being implicitly declared void. The domain is coming from a http or https scheme though.
also, ERC-4361 "Out of scope" section says:
The specific mechanisms to ensure domain-binding.
In the current shape of ERC-4361 the burden of defining what URI scheme
to consider for checking that the domain
matches the actual signing request source falls on said specific mechanisms
.
Unless ERC-4361 is updated to explicitly define URI scheme
for authority
used as domain
, it's on us to decide how to interpret the missing port.
Obligation 1:
the wallet MUST confirm that the message is for the correct
domain
or provide the user means to do so manually (such as instructions to visually confirming the correct domain in a TLS-protected website prior to connecting via QR code or deeplink), otherwise the user is subject to phishing attacks.
Obligation 2:
Wallets MUST check that the
domain
matches the actual signing request source. This value SHOULD be checked against a trusted data source such as the browser window or over another protocol.
What should we do in these situations?
Situation 1A: A page hosted at https://evil.com
requests SIWE signature with domain
set to example.com
.
Situation 1B: A page hosted at https://3eda758fc11c.cloudfront.net
requests SIWE signature with domain
set to dd5693b54fe3.cloudfront.net
.
Situation 1C: A page hosted at https://example.com
requests SIWE signature with domain
set to example.com
or example.com:443
.
Situation 2A: A local privileged service is operating on http://localhost:80
which requires root privileges to bind to. The user runs npm install in a project and one of the dependencies uses a malicious install script to start a server on http://localhost:1337
and later confuses/phishes the user to go to that address. It requests SIWE signature with domain
set to localhost
as usual.
Situation 2B: A builder wants to test their application and the test setup runs it on different ports on localhost
each time and it's inconvenient to add code to detect that before creating the domain
for the SIWE request, so domain
is set to localhost
Situation 2C: A local privileged service is operating on http://localhost:80
which requires root privileges to bind to. It requests SIWE signature with domain
set to localhost
.
Context: A hosting provider offers public IP4 and free domains, but each user on the shared hosting gets just 5 random ports.
Situation 3A: A phishing page posing as usercontent.hosting.com:1973 hosted at http://usercontent.hosting.com:1337
requests SIWE signature with domain
set to usercontent.hosting.com
.
Situation 3B: A phishing page posing as usercontent.hosting.com:1973 hosted at http://usercontent.hosting.com:1337
requests SIWE signature with domain
set to usercontent.hosting.com:1973
.
Situation 3A: A phishing page posing as usercontent.hosting.com:1973 hosted at http://usercontent.hosting.com:1337
requests SIWE signature with domain
set to usercontent.hosting.com:1337
. The user doesn't remember the original port number anyway.
Obligation 2 says we must check the domain
against a trusted data source - in our case that's the Origin of the browser Realm (often the current tab). It doesn't say how strict that check needs to be before we defer to the user as per Obligation 1.
Obligation 1 suggests there's a point at which we sould no longer try to block the request and let the user make the decision while providing enough context for them to understand their choice.
Note that we cannot use the presence of the port in domain
as the only indicator whether the port needs to be verified, because domain
is defined by the malicious actor in an attack. (2A, 3A)
To address situations 2A and 3A without relying on users we would need to implement strict checking of the port and if domain
omits the port number, assume default for the protocol that we know from the Origin.
Depending on how much we rely on the user's decision, these are possible behaviors to implement:
❌ - mismatch error, API returns an error, the user never get exposed to that UI. ⚠️ - show a prominent warning message + the second step to confirm. 👍 - match, ask user - regular SIWE screen. (updated according to Antonela)
Port checking behaviors in situations above:
1. strict check | 2. optional check | 3. strict unless localhost | 4. warn only | |
---|---|---|---|---|
1A | ❌ | ❌ | ❌ | ❌ |
1B | ❌ | ❌ | ❌ | ⚠️😱 |
1C | 👍 | 👍 | 👍 | 👍 |
2A | ❌ | 👍 | 👍 | ⚠️ |
2B | ❌ | 👍 | 👍 | ⚠️ |
2C | 👍 | 👍 | 👍 | 👍 |
3A | ❌ | 👍 | ❌ | ⚠️ |
3B | ❌ | ❌ | ❌ | ⚠️ |
3C | 👍 | 👍 | 👍 | 👍 |
This section includes author's opinions.
We need to decide between implementing strict checking with all of the friction it introduces or a less strict behavior that's more desirable while requiring user to decide and UI to help them make the right decision.
domain
Ensuring that UI is helping the user understand port mismatch to make up for the security tradeoff and making port checking optional with behavior 2 (optional check) seems the most promising outcome for this issue.
Without changes to the UI behavior 2 may be risky and we can only go as far as behavior 3 or 4.
Thanks to @legobeat for convincing me to do a deep dive into ERC,RFC,etc.
cc @digiwand and @holantonela for thoughts on UI more competent than mine.
Thanks for this analysis @naugtur! I think it makes sense to use the patters that we already have in place but reflecting the heuristics presented in your table:
❌ - mismatch error, prevent signing ---> API returns an error, the user never get exposed to that UI.
⚠️ - show a prominent scary warning before proceeding to signature ---> We use the warning message + the second step to confirm.
👍 - match, ask user for signature --> regular SIWE screen.
Related issues are gathered in #18471. The key insight in @naugtur's facts-summary which was missing from my preceding comment is the part where default ports are implied depending on scheme in RFC-3986.
Looking at opened issues, a popular interpretation seems to be that no explicit port in (as per ABNF scheme-less) domain
part of the message should be interpreted as "wildcard" and be matchable for origins with any port as long as the hostname is matching. But there is a case to be made for a more conservative reading where the missing scheme
is assumed from the actual origin and taking the stance that "no port = enforce default for whatever transport is actually used on the origin".
As noted there are some UX subtleties arising around how this is communicated regardless of which choice is made here.
https://github.com/MetaMask/core/pull/1163 has been updated according to the above.
After sleeping on it I now think the most secure option for the end user would be to:
One argument against enforcing https by default is that https in the browser introduces reliance on centralized authority and not all SIWE use cases might want that.
@legobeat and @naugtur - great analysis of the specs. Thank you 🙏
I am also convinced that the optional port should only be allowed when it is not provided. Otherwise, it should be enforced. With that, I am going to close the PR. We have updated Dynamic.xyz to always provide and validate the port in our SIWE.
@naugtur - one exception about enforcing https is localhost
. From a developer standpoint, it should be allowed to use http
on localhost.
many thanks @legobeat and @naugtur!
These examples are useful and I'd like to propose a couple more examples. I agree with most of them except I have a concern with 1A. (sorry, I know you said "should not need further discussion." but I'm going for it :D) @naugtur
I see ❌ now under the "warn only" session. Was this meant to be ⚠️😱? While we do not support a developer-mode setting, I don't think we should block this case entirely.
One scenario for the mismatched domain was explained in the related ticket #17707.
from @HeyFloM 's usecase https://github.com/MetaMask/metamask-extension/issues/17707#issuecomment-1470877080:
On our side, this is problematic for all integrations of Metamask with Unity WebGL games/metaverse. Why? Because all WebGL builds needs to be hosted on specific hosting services like itch.io so we can easily test them without having to upload each build on our Amazon S3 Buckets. So each development build have this error right now image
they did figure out a solution for production, but it continues to cause friction for development:
After identifying the issue, we aligned the URL of the domain hosting the production Unity build with the server URL that manages the Unity SDK and Web3 authentication (Moralis Unity SDK self-hosted server). As a result, there are no longer any problems with the production part.
However, the impact is still present on the development side because we use different hosting services for speed and agility to export and test quickly. These services make it very easy to see what are the results of your latest dev changes in the build, but they are on another domain. We could create a test environment on our servers, but this would require our developers to:
- export the WebGl build from Unity
- Manually import it on our Amazon S3 buckets
- Convert it to react
- Align the different domain URLs manually
- And this at least 4 times per day, which would be a significant timeloss
[edited] Situation 1D: A page hosted at http://evil.com
requests SIWE signature with domain set to evil.com
.
Proposal: 1D | ❌ | ❌ | ❌ | 👍
Purpose: Evaluate non-TLS-protected website situation. Edited based on the comments below.
[maybe] Situation 1E: A page hosted at https://www.example.com
requests SIWE signature with domain set to https://example.com
.
Proposal: 1E | ❌ | ❌ | ❌ | ⚠️😱
Purpose: This would be to track that we have considered this case of the www
subdomain though I believe we should apply the same behavior as we do with other subdomains (Situation 1B); as www
is another subdomain itself. Like other subdomains, it's possible for the "www" subdomain to host different resources than the root domain. (Relates to https://github.com/MetaMask/metamask-extension/issues/18332)
Implementing strict mode alongside a developer mode setting sounds good to me.
If we enforce strict mode for any of these situations, we'll need to inform developers of this change and give them time to make updates if needed.
@digiwand My answer to the 1A
scenario where the developer wants an application served from an origin of cdn.example.com
to be able to claim a domain of dapp.example.com
is to either distribute appropriate TLS certificates for dapp.example.com
to the utilized CDN or introduce terminating TLS proxies, in either case to the effect of aligning domains. This not being circumventable is intentional. Same applies for 1E
.
On the other hand, I'd say for your 1D
: :+1: - Unless and until a new version EIP-4361 adds new semantics around scheme and transport - currently no TLS restrictions. This is with the differing take from here that "no scheme or port in domain
" -> "implied any scheme; no default port assumed; no restrictions on port" . IMO default behavior of restricting to TLS origins only could have been good but is not an option for the immediate-term since it constitutes a breaking change (with existing integrations made) and EIP violation.
I think there's a way forward which maintains safe and compliant defaults without causing hindrance for local dev or requiring an "unsafe dev mode" (completely disable domain-binding validation? maybe a feature that could be enabled only in flask and beta), it can still be considered as a plan b.
I think that makes sense. Distributing TLS certificates to the CDN will allow the appropriate domains to host those servers. Thanks for the advice @legobeat!
okay, I agree 1A no longer needs a change. I do still believe blocking the request entirely should be communicated before implementing this situation.
For 1A anything other than rejection would be against the spec. It's the only thing ERC is clearly requiting us to do. Note that it's not a subdomain issue but a totally different domain.
I'm ok with allowing http for localhost regardless of how strict we decide to be.
okay, that makes sense. maybe my communication was confusing. I think it's a good idea to block 1A once we have a developer setting. Even more so with the setup @legobeat mentioned above.
I'm ok with allowing http for localhost regardless of how strict we decide to be.
@naugtur when you mentioned allowing these cases, were you thinking with or without the warning modal?
Just read the updated comment now @legobeat and responding to 1D comment
re: 1D. okay, I see I missed this earlier. I agree and updated the comment with 👍. Snippet from EIP-4361 spec:
The following concerns are out of scope for this version of the specification to define: ... Protocols for use without TLS connections.
Good idea to include a specific setting to disable domain-binding in Flash/beta as a plan B. I've linked this comment in the PR to implement the developer mode setting https://github.com/MetaMask/metamask-extension/issues/18191#issuecomment-1503418941
p.s
@bschorchit I think this issue should be fixed in #18518 , which is merged to develop
~(but did not make the cut for release for some reason so not sure when it's schedule do go out)~
There are some hypothetical further improvements/changes mentioned in the thread - anything we want to distill to new issue(s) or we close this off once #18529 makes it to release?
I've just checked and https://github.com/MetaMask/metamask-extension/pull/18518 is part of 10.31 release - which should be rolling out soon (next week I think) to users.
I don't think we need to wait to ship https://github.com/MetaMask/metamask-extension/issues/18529 before closing this off. We can track and discuss anything related to planning/communicating and blocking mismatched domains directly on that issue (and other related issues) and come back to this one for reference even if closed.
Oh, I somehow missed that it was indeed part of the release - that's great and sounds good to me - thanks @bschorchit.
Describe the bug
This was originally reported to us here on Slack and here https://github.com/MetaMask/metamask-extension/issues/17707#issuecomment-1470728017.
This is not an issue if you don't append http:// to localhost in the URI in the message (see recording below).
Steps to reproduce
Here is a example of message to sign that was reported to cause this while on localhost : https://github.com/MetaMask/metamask-extension/issues/17707#issuecomment-1470786260
Error messages or log output
No response
Version
10.26.1
Build type
None
Browser
Other (please elaborate in the "Additional Context" section)
Operating system
Other (please elaborate in the "Additional Context" section)
Hardware wallet
No response
Additional context
Not sure on browser and OS info.
Recording: https://user-images.githubusercontent.com/34968356/225613863-783e9066-d1a4-4626-b5be-c358e31818ab.mp4