brave / brave-browser

Brave browser for Android, iOS, Linux, macOS, Windows.
https://brave.com
Mozilla Public License 2.0
17.8k stars 2.33k forks source link

IPFS origins report wrong document.location.origin #13303

Open da2x opened 3 years ago

da2x commented 3 years ago

Description

Brave returns an unexpected document location.origin when browsing on pages through the ipfs: and ipns: protocol schemes. Other properties in the location object (e.g. .protocol) are effected too.

Steps to Reproduce

  1. Open any ipfs://hashhash or ipns://hashhash
  2. Open the developer tools console
  3. Execute document.location.origin

Expected result:

The function should return ipfs://hashhash.

Actual result:

It returns http://hashhash.ipfs.localhost:12345/

Reproduces how often:

Every time.

Brave version (brave://version info)

1.19.67

Version/Channel Information:

Beta only feature.

Miscellaneous Information:

This bug makes it more difficult to develop mixed HTTPS/IPFS websites as its more difficult to determine correctly what protocol scheme is being used. This might have unexpected security issues. The Chromium Extension API return the expected origin. However, some extensions get very confused when the tab.url and its document.location don’t match.

diracdeltas commented 3 years ago

cc @yrliou @bbondy

i think it's fine to have the apparent origin be ipfs:// as long as domain isolation is preserved (https://github.com/brave/brave-browser/issues/12367). ex: ipfs://foo and ipfs://bar should be treated as third parties according to the same rules as http://a.com and http://b.com.

re: what web features are accessible to an ipfs:// page, options are:

  1. treat it like remote HTTP, which means no access to web features which require a secure context (https://developer.mozilla.org/en-US/docs/Web/Security/Secure_Contexts/features_restricted_to_secure_contexts)
  2. treat it like local HTTP, which counts as a secure context (https://w3c.github.io/webappsec-secure-contexts/#is-origin-trustworthy)
  3. treat it like HTTPS, which also counts as a secure context but may have some unintended side effects, such as HTTP mixed content being blocked.

i think 2 makes the most sense since IPFS documents are served locally.

spylogsster commented 3 years ago

this is how it looks now: image @da2x @bbondy should we change host, hostname, href, protocol too?

bbondy commented 3 years ago

I think we need protocol updated too. @diracdeltas should hostname and host be just the CID? And href should be the ipfs path I believe

diracdeltas commented 3 years ago

if the origin changes, href, protocol, host, and hostname (possibly other properties) should change accordingly. i think host/hostname would be the CID, href should be the full URL, and protocol should be IPFS if we decide to do this.

i'm somewhat concerned that this will be need to be a bigger change for consistency. for instance document.referrer and possibly other APIs that return URLs. this could cause some security issues if not done comprehensively, see https://bravesoftware.slack.com/archives/C0145NAJP51/p1612550288046100?thread_ts=1612304914.041400&cid=C0145NAJP51.

in https://docs.google.com/document/d/156bzOF8qkieyouqnhIQzwJsCxtsTNtyVcOiClSEgeU0/edit?disco=AAAAHwJ3OFc @lidel mentioned that this change might not have much practical value, in which case i think we should hold off on it unless it's causing major webcompat or developer issues.

da2x commented 3 years ago

You can see the expected result using the URL object: new URL('ipfs://ipfshash/some-dir/file.ext#hellohash'):

URL Object
hash: "hellohash"
host: "ipfshash"
hostname: "ipfshash"
href: "ipfs://ipfshash/some-dir/file.ext#hellohash"
origin: "ipfs://ipfshash"
password: ""
pathname: "/some-dir/file.ext"
port: ""
protocol: "ipfs:"
search: ""
searchParams: URLSearchParams {  }
username: ""

Chrome/Brave notably says the origin is null whereas Firefox correctly returns the expected ipfs://ipfshash origin.

da2x commented 3 years ago

this could cause some security issues if not done comprehensively [&] should hold off on it unless it's causing major webcompat or developer issues.

That’s the current state, though. There are all sorts of places where the double origin may cause issues and confusion for developers:

bbondy commented 3 years ago

ipfs.localhost isn’t in the MPSL (nor is any of the built-in default IPFS–HTTP gateways!), so malicious things can more easily set cookies or access localStorage data from *.ipfs.localhost than from a random hash-based origins, which crucially, doesn’t share a second-level domain.

That's not true because of the work done here: https://github.com/brave/brave-core/pull/7046/files

We're looking into a different way of handling this to help address other points though.

bridiver commented 3 years ago

I'm not sure this is an issue we want to fix. The problem here that we also ran into in another issue is that hostnames are not case sensitive, but CID is. Right now ipfs/ipns are effectively treated like file urls and that preserves the case. I was originally in favor of making a change like this for other reasons, but I'm no longer sure it's a good idea. There are too many different places where the case is handled for host/origin and I don't see a safe and reliable way to make sure it's consistent across the codebase.

bridiver commented 3 years ago

I guess to be completely accurate DNS names are case insensitive, but in chromium that has translated to all hostnames being converted to lowercase and but there are multiple places this happens and not all of them provide access to the scheme to differentiate between ipfs and http/https

bridiver commented 3 years ago

and really since ipfs is "InterPlanetary File System", maybe it makes more sense to treat these like file urls except maybe for dnslink. It would have been much simpler if dnslink used a different scheme...

bridiver commented 3 years ago

having said all of that, http://hashhash.ipfs.localhost:12345/ is not what we want to return, but I think the expected result for document.location.origin should be an empty string or whatever it returns for a file url

lidel commented 3 years ago

Some notes that may be useful when it is time to address this:

bridiver commented 3 years ago

If every cid can be converted to case insensitive encoding then this would be more viable. @lidel I want to verify that you're saying CIDv1 is already case insensitive?

MicahZoltu commented 8 months ago

If every cid can be converted to case insensitive encoding then this would be more viable. @lidel I want to verify that you're saying CIDv1 is already case insensitive?

Yes. One of the primary purposes of CIDv1, and the reason there is an effort to deprecate CIDv0 in favor of CIDv1, is specifically that it is "all lowercase" so it can be used as a subdomain rather than being constrained to a path (which enables subdomain isolation between sites retrieved from a gateway).

lidel commented 8 months ago

Yes, every CID can be turned into CIDv1 in base32 or base36.

Technically, one can encode CIDv1 using different multibase, such as case-sensitive base64:

$ ipfs cid format -b base64url bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi
uAXASIMPEcz7Ir_0Gz56f9Q_8a80uyFphcABLtwlmnDHelDka

But on the web, the default convention is to use base32 for /ipfs/ and base36 for /ipns/. This ensures CID is case-insensitive but also short enough to fit in a single DNS label (which has max length limit of 63), allowing use on subdomain gateways.