ipfs / gateway-conformance

a vendor-agnostic gateway conformance test suite for implementers of IPFS Gateways to ensure compliance with https://specs.ipfs.tech/http-gateways/
https://specs.ipfs.tech/http-gateways/
Other
14 stars 14 forks source link

bug: gateway-conformance subdomains are inflexible and setup is not clear #185

Closed SgtPooki closed 4 months ago

SgtPooki commented 11 months ago

I'm trying to improve on conformance of helia-http-gateway, and I am continuously getting failures from the gateway-conformance tool (via docker and directly with go run ./cmd/gateway-conformance/main.go test # ...)

I believe these may be related to whatever proxying thing is happening under the hood, because i've seen this error more than a few times:

Error: Querying http://localhost/ipfs/bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am/ failed: Get "http://localhost/ipfs/bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am/": proxyconnect tcp: dial tcp [::1]:8090: connect: connection refused

Some questions:

  1. If I am specifying --gateway-url 'http://localhost:8090' --subdomain-url 'http://localhost:8090', why is the gateway-conformance tool querying localhost with no port?
  2. Why are dials going to an ip6 address (that is not supported on my machine) and can I disable that?
  3. Can we disable the proxy entirely? is there a way to run tests directly against a gateway and verify the actual HTTP headers and responses?
SgtPooki commented 11 months ago

I also get the following error quite frequently:

Error: Querying http://localhost/ipfs/bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am/ failed: Get "http://localhost/ipfs/bafkreicysg23kiwv34eg2d7qweipxwosdo2py4ldv42nbauguluen5v6am/": proxyconnect tcp: unexpected EOF

but this is very odd because the request works perfectly fine in a browser:

image

SgtPooki commented 11 months ago

Running the test command with ENV vars instead seems to work...?

SUBDOMAIN_GATEWAY_URL="http://localhost:8090" GATEWAY_URL="http://localhost:8090" go run ./cmd/gateway-conformance/main.go test --json gwc-report.json --verbose -- -timeout 30m -run 'TestGatewaySubdomains/request_for_example.com%2Fipfs%2F%7BCIDv1%7D_redirects_to_subdomain_%28HTTP_proxy_tunneling_via_CONNECT%29#01'
SgtPooki commented 11 months ago

Running the test command with ENV vars instead seems to work...?

SUBDOMAIN_GATEWAY_URL="http://localhost:8090" GATEWAY_URL="http://localhost:8090" go run ./cmd/gateway-conformance/main.go test --json gwc-report.json --verbose -- -timeout 30m -run 'TestGatewaySubdomains/request_for_example.com%2Fipfs%2F%7BCIDv1%7D_redirects_to_subdomain_%28HTTP_proxy_tunneling_via_CONNECT%29#01'

but not when run via docker with

docker run --network host -v $PWD:/workspace -w /workspace --env SUBDOMAIN_GATEWAY_URL='http://localhost:8090' --env GATEWAY_URL='http://localhost:8090' ghcr.io/ipfs/gateway-conformance:v0.4.2 test --verbose --json gwc-report.json --specs subdomain-ipns-gateway,subdomain-ipfs-gateway -- -timeout 30m -run TestGatewaySubdomains/request_for_example.com%2Fipfs%2F%7BCIDv1%7D_redirects_to_subdomain_%28HTTP_proxy_tunneling_via_CONNECT%29#01
laurentsenta commented 11 months ago
  1. if I am specifying --gateway-url 'http://localhost:8090' --subdomain-url 'http://localhost:8090', why is the gateway-conformance tool querying localhost with no port?

Looks like a bug, we haven't tested gateways behind subdomains with ports. We'd need to update the subdomain tests to keep the port as well (cc @lidel).

Example of a test that loses the port. The localhost tests might require a light refactor to force the port somehow (code)

  1. Why are dials going to an ip6 address (that is not supported on my machine) and can I disable that?

The gateway-url is the address the test suite connects to. In your case, the test suite always connects to localhost:8090. I'm surprised the test suite or golang would mess with the hostname mapping.

Could you double-check your /etc/hosts for a mapping ::1 localhost?

  1. Can we disable the proxy entirely? is there a way to run tests directly against a gateway and verify the actual HTTP headers and responses?

So the subdomain-url is exclusively used to test the subdomain specs. But the test suite ONLY connects to the gateway-url (localhost:8090). The subdomain is "simulated". More details in the docs.

What do you mean by disabling the proxy entirely? My recommendation would be to NOT test the subdomain specs, maybe get started with the simpler ones.

For example, use --specs -subdomain-gateway,-dnslink-gateway to test everything but DNS Link and subdomains (they are tricky to setup). Or use --specs trustless-gateway to start from the trustless specs (should be straightforward).

More examples in the docs

SgtPooki commented 11 months ago

Could you double-check your /etc/hosts for a mapping ::1 localhost?

good call:

::1 localhost

and thanks for the rest of the links, super helpful. figuring out how to filter out tests right now is a lot of the trouble, but running tests from vscode is making determining those much easier

lidel commented 11 months ago

I see how having two URLs options here becomes confusing:

--gateway-url 'http://localhost:8090' --subdomain-url 'http://localhost:8090'

Perhaps an useful insight is that subdomain gateway is not related to request's URL. It is a feature that looks at Host HTTP header from the request, nothing more. If there is no Host, then even if request was sent to localhost:8080, it won't be processed as a subdomain request.

I took a stab at improving docs in https://github.com/ipfs/gateway-conformance/pull/193

Here, it should be something like: --gateway-url 'http://127.0.0.0.1:8090' --subdomain-url 'http://example.com' just to avoid confusion with localhost.

@SgtPooki try following prior art used for testing kubo, such as

https://github.com/ipfs/gateway-conformance/blob/4760ba747731910c0b7b702a6d089fd3e932ab74/.github/workflows/test-kubo-e2e.yml#L65-L66

With this, all you need to do is to make sure your gateway implementation has subdomain gateway enabled on example.com (and/or localhost) hostnames like we do in kubo – this way you don't need to pass --subdomain-url 'http://example.com' at all.

And I agree with @laurentsenta, will be easier to disable subdomain-gateway until you pass regular path-gateway first.

Digression on why we see "localhost without port in tests"

Gateway feature could be enabled per origin (DNS name+port+protocol), but in real world it is easier to implement and reason about enabling this feature globally per DNS name.

For example, in Kubo, subdomain gateway is implicitly enabled on localhost DNS name (no matter what port).

This is also why we have tests that seem to ignore port in the Host header, but things still work.

Yes, this could be fixed, but it also is cosmetic, and only impacts localhost case (production URLs use implicit port, not present in base URL)

SgtPooki commented 5 months ago

@lidel lack of port support is making gateway-conformance support in verified-fetch a pita.

"real world" usecase is developers testing their servers in CI, which will likely not be on implicit ports.

lidel commented 5 months ago

@SgtPooki let's reopen and discuss it here instead of slack.

It has been a while, could you describe what exactly gets painful in your setup what would be the preferred fix?

I looked at https://github.com/ipfs/helia-verified-fetch/pull/85/files#diff-175730b97b315afceee1b708f9bd876588ef883662dc52e139a7961ac7804c56R100 and would being able to pass --gateway-url 'http://127.0.0.0.1:8090' --subdomain-url 'http://example.com:8090', and have conformance tests respecting the passed port in their Host header asserts origins? Or did I miss the point?

SgtPooki commented 5 months ago

have conformance tests respecting the passed port in their Host header asserts origins? Or did I miss the point?

this would fix my concerns for sure. and remove the need for the header munging with fixingGwcAnnoyance

@lidel if you have a recommendation on where to look for fixing this i'd be happy to put out a PR. https://github.com/ipfs/gateway-conformance/blob/c0dafe17ad01f112530ab788caef4a58483847ab/tooling/helpers/subdomain.go#L45 seems to be the standard way to get the host, but we should probably be using .Hostname or similar. https://pkg.go.dev/net/url#URL.Hostname implies that .Host is supposed to return port number, and that Hostname removes it.. but navigating go docs & code is not simple for me yet. If you could point me in the right direction I could take this on

SgtPooki commented 5 months ago

diving in a bit more, u = https://pkg.go.dev/net/url#URL, which has .Host key that should contain the port number.. why doesn't it?

SgtPooki commented 5 months ago

it seems like this is the problem:

  1. hardcoded localhost: https://github.com/ipfs/gateway-conformance/blob/c0dafe17ad01f112530ab788caef4a58483847ab/tooling/test/config.go#L32
  2. consumed by https://github.com/ipfs/gateway-conformance/blob/c0dafe17ad01f112530ab788caef4a58483847ab/tests/subdomain_gateway_ipfs_test.go#L22-L26

This seems like an implicit dependence on kubo's "accept all things on localhost" requirement, which will break in any gateway testing where that's not true/possible.


Is the fix just removing SubdomainLocalhostGatewayURL from the tested URLs, making SubdomainGatewayURL (and consequently --subdomain-url '<actualSubdomainIwantYouToTest>') a requirement for running subdomain tests?

lidel commented 5 months ago

Thanks for digging into this. I think we should fix the conformance tests to run test against explicit endpoint, and yes, localhost tests should not be hardcoded.

Your pointers are very useful, I'll see if I can open a PR later today/tomorrow.

lidel commented 5 months ago

Spent some time spelunking and thinking how to clean this up.

Just like you noted, the underlying problem is not really Kubo specific, but the way gateway-conformance has hardcoded subdomain tests to always run against both SUBDOMAIN_GATEWAY_URL and localhost.

It was likely a quick&dirty way to ensure we keep testing localhost ipfs-desktop/companion and brave use cases, but now it should be explicit rather implicit.


I have local PR that cleans this up, but I've noticed additional problems, so will take a bit longer than expected to finish all loose ends. For now I will be tracking progress here, and open PR once ready. We will likely have to release it as v6, due to breaking changes.

Cleanup progress