ipshipyard / waterworks-community

Discussion and documentation concerning the operation of the IPFS HTTP Gateway at https://ipfs.io/ipfs.

Public gateway doesn't seem to respect accept header for converting dag-cbor to dag-json #14

Closed 2color closed 3 months ago

2color commented 3 months ago

Background

Details

What doesn't work: requesting a dag-cbor CID (bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq) from the ipfs.io gateway with either the header Accept: vnd.ipld.dag-json or Accept: application/vnd.ipld.dag-json. In both cases the response is still dag-cbor binary.

➜  helia-verified-fetch git:(fix-readme) curl -i -H "Accept: vnd.ipld.dag-json" https://ipfs.io/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/
HTTP/2 200
server: openresty
date: Thu, 29 Feb 2024 12:05:31 GMT
content-type: application/vnd.ipld.dag-cbor
content-length: 238
access-control-allow-headers: Content-Type
access-control-allow-headers: Range
access-control-allow-headers: User-Agent
access-control-allow-headers: X-Requested-With
access-control-allow-methods: GET
access-control-allow-methods: HEAD
access-control-allow-methods: OPTIONS
access-control-allow-origin: *
access-control-expose-headers: Content-Length
access-control-expose-headers: Content-Range
access-control-expose-headers: X-Chunked-Output
access-control-expose-headers: X-Ipfs-Path
access-control-expose-headers: X-Ipfs-Roots
access-control-expose-headers: X-Stream-Output
cache-control: public, max-age=29030400, immutable
content-disposition: attachment; filename="bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor"; filename*=UTF-8''bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor
etag: "bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.dag-cbor"
x-content-type-options: nosniff
x-ipfs-path: /ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/
x-ipfs-roots: bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
x-ipfs-pop: ipfs-bank3-fr2
timing-allow-origin: *
x-ipfs-datasize: 238
x-ipfs-lb-pop: gateway-bank1-fr2
x-bfid: 62367318b5f28b497220b63743664407
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-proxy-cache: HIT
accept-ranges: bytes

Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.

➜  helia-verified-fetch git:(fix-readme) curl -i -H "Accept: application/vnd.ipld.dag-json" https://ipfs.io/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/
HTTP/2 200
server: openresty
date: Thu, 29 Feb 2024 12:10:29 GMT
content-type: application/vnd.ipld.dag-cbor
content-length: 238
access-control-allow-headers: Content-Type
access-control-allow-headers: Range
access-control-allow-headers: User-Agent
access-control-allow-headers: X-Requested-With
access-control-allow-methods: GET
access-control-allow-methods: HEAD
access-control-allow-methods: OPTIONS
access-control-allow-origin: *
access-control-expose-headers: Content-Length
access-control-expose-headers: Content-Range
access-control-expose-headers: X-Chunked-Output
access-control-expose-headers: X-Ipfs-Path
access-control-expose-headers: X-Ipfs-Roots
access-control-expose-headers: X-Stream-Output
cache-control: public, max-age=29030400, immutable
content-disposition: attachment; filename="bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor"; filename*=UTF-8''bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor
etag: "bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.dag-cbor"
x-content-type-options: nosniff
x-ipfs-path: /ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/
x-ipfs-roots: bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
x-ipfs-pop: ipfs-bank3-fr2
timing-allow-origin: *
x-ipfs-datasize: 238
x-ipfs-lb-pop: gateway-bank1-fr2
x-bfid: 1fb40729d5db8e471d3123879a8fd359
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-proxy-cache: HIT
accept-ranges: bytes

Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.

What works: requesting the same dag-cbor CID (bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq) from the ipfs.io gateway with the query ?format=dag-json returns JSON.

➜  helia-verified-fetch git:(fix-readme) curl -i "https://ipfs.io/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/?format=dag-json"
HTTP/2 200
server: openresty
date: Thu, 29 Feb 2024 12:04:25 GMT
content-type: application/vnd.ipld.dag-json
content-length: 382
access-control-allow-headers: Content-Type
access-control-allow-headers: Range
access-control-allow-headers: User-Agent
access-control-allow-headers: X-Requested-With
access-control-allow-methods: GET
access-control-allow-methods: HEAD
access-control-allow-methods: OPTIONS
access-control-allow-origin: *
access-control-expose-headers: Content-Length
access-control-expose-headers: Content-Range
access-control-expose-headers: X-Chunked-Output
access-control-expose-headers: X-Ipfs-Path
access-control-expose-headers: X-Ipfs-Roots
access-control-expose-headers: X-Stream-Output
cache-control: public, max-age=29030400, immutable
content-disposition: inline; filename="bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.json"; filename*=UTF-8''bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.json
etag: "bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.dag-json"
x-content-type-options: nosniff
x-ipfs-path: /ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/
x-ipfs-roots: bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
x-ipfs-pop: ipfs-bank3-fr2
timing-allow-origin: *
x-ipfs-datasize: 382
x-ipfs-lb-pop: gateway-bank1-fr2
x-bfid: 1021e02601421765e06eda7a342ace30
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-proxy-cache: HIT

{"cats":"not cats","cheese":[{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"},{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"},{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"},{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"}],"something":{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"}}
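
For clients that want to rely on the query parameter rather than the Accept header, here is a minimal sketch of building such a URL. The helper name `with_format` is hypothetical and not part of any IPFS library. The transcript above only shows that `?format=dag-json` returned JSON for this one request; it does not show why, so treat this as a possible workaround rather than a guaranteed one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_format(url: str, fmt: str) -> str:
    """Return url with its `format` query parameter set to fmt (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep other query parameters, replacing any existing `format` value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    query.append(("format", fmt))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = "https://ipfs.io/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/"
print(with_format(url, "dag-json"))
```

The printed URL can be passed to curl in place of the Accept header.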

hacdias commented 3 months ago

@2color I can't reproduce:

curl -i -H "Accept: application/vnd.ipld.dag-json" https://ipfs.io/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
HTTP/2 200
server: openresty
date: Thu, 29 Feb 2024 12:24:14 GMT
content-type: application/vnd.ipld.dag-json
content-length: 382
access-control-allow-headers: Content-Type
access-control-allow-headers: Range
access-control-allow-headers: User-Agent
access-control-allow-headers: X-Requested-With
access-control-allow-methods: GET
access-control-allow-methods: HEAD
access-control-allow-methods: OPTIONS
access-control-allow-origin: *
access-control-expose-headers: Content-Length
access-control-expose-headers: Content-Range
access-control-expose-headers: X-Chunked-Output
access-control-expose-headers: X-Ipfs-Path
access-control-expose-headers: X-Ipfs-Roots
access-control-expose-headers: X-Stream-Output
cache-control: public, max-age=29030400, immutable
content-disposition: inline; filename="bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.json"; filename*=UTF-8''bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.json
etag: "bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.dag-json"
x-content-type-options: nosniff
x-ipfs-path: /ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
x-ipfs-roots: bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
x-ipfs-pop: ipfs-bank4-am6
timing-allow-origin: *
x-ipfs-datasize: 382
x-ipfs-lb-pop: gateway-bank1-am6
x-bfid: 8ca78e752d003ecb87cf4623398162aa
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-proxy-cache: MISS

{"cats":"not cats","cheese":[{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"},{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"},{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"},{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"}],"something":{"/":"bafkreifvxooyaffa7gy5mhrb46lnpdom34jvf4r42mubf5efbodyvzeujq"}}

Also want to note that vnd.ipld.dag-json is not a valid media type: it is incomplete without the application/ prefix.
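
To illustrate the point about the header value, here is a small sketch that checks whether a string has the `type/subtype` shape a media type needs. The function name `is_media_type` is an assumption for illustration. The check only looks at the shape and does not consult any registry of known types.

```python
import re

# RFC 9110 "token" characters, used for both the type and the subtype.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_MEDIA_TYPE = re.compile(rf"{_TOKEN}/{_TOKEN}")


def is_media_type(value: str) -> bool:
    """Return True if value has the type/subtype shape (parameters ignored)."""
    # Drop any parameters such as ";q=0.9" before checking the shape.
    essence = value.split(";", 1)[0].strip()
    return _MEDIA_TYPE.fullmatch(essence) is not None


print(is_media_type("vnd.ipld.dag-json"))              # no "type/" part
print(is_media_type("application/vnd.ipld.dag-json"))  # full media type
```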

hacdias commented 3 months ago

Okay, I ran the command multiple times. Sometimes I get CBOR, sometimes I get JSON. I would bet that it has something to do with the caching we have in front of rainbow.
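
One way a cache could produce this symptom is by keying entries on the path alone, so that whichever representation is stored first is served for every later Accept value. The toy model below sketches that idea. It is an assumed mechanism, not the actual configuration in front of rainbow, and `origin` and `ProxyCache` are made-up names.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[str, str]  # (content-type, x-proxy-cache)


def origin(path: str, accept: str) -> str:
    """Stand-in for the gateway backend: honours the Accept header."""
    if accept == "application/vnd.ipld.dag-json":
        return "application/vnd.ipld.dag-json"
    return "application/vnd.ipld.dag-cbor"


class ProxyCache:
    """Toy cache; key_fn decides which request parts form the cache key."""

    def __init__(self, key_fn: Callable[[str, str], tuple]) -> None:
        self.key_fn = key_fn
        self.store: Dict[tuple, str] = {}

    def get(self, path: str, accept: str) -> Response:
        key = self.key_fn(path, accept)
        if key in self.store:
            return self.store[key], "HIT"
        self.store[key] = origin(path, accept)
        return self.store[key], "MISS"


path = "/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/"
json_type = "application/vnd.ipld.dag-json"

# Key built from the path only: the first cached representation wins.
path_only = ProxyCache(lambda p, a: (p,))
path_only.get(path, "*/*")               # warms the cache with dag-cbor
print(path_only.get(path, json_type))    # dag-cbor comes back on a HIT

# Key that also includes Accept: each representation is cached separately.
with_accept = ProxyCache(lambda p, a: (p, a))
with_accept.get(path, "*/*")
print(with_accept.get(path, json_type))  # dag-json comes back on a MISS
```

In this model, getting CBOR or JSON depends only on which request reached the path-only cache first, which would match seeing different results across runs.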

2color commented 3 months ago

Thanks for testing this @hacdias

I'm now aware that vnd.ipld.dag-json is invalid 😄.

Indeed, it seems to be related to the cache, as both my requests are cache HITs whereas yours (that worked correctly) was a MISS.
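
To make the HIT/MISS correlation easier to check across many runs, here is a rough sketch that parses the header part of `curl -i` output and flags a cache HIT whose content-type differs from the requested one. The function names are hypothetical. The sample below is trimmed to the two headers that matter here.

```python
from typing import Dict


def parse_headers(raw: str) -> Dict[str, str]:
    """Parse the header part of `curl -i` output into a lowercase-keyed dict.

    Repeated headers keep only their last value, which is enough here.
    """
    headers: Dict[str, str] = {}
    for line in raw.strip().splitlines():
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if sep:  # the status line has no colon and is skipped
            headers[name.strip().lower()] = value.strip()
    return headers


def is_mismatched_hit(raw: str, wanted: str) -> bool:
    """True when a cache HIT carries a content-type other than `wanted`."""
    headers = parse_headers(raw)
    return (
        headers.get("x-proxy-cache") == "HIT"
        and headers.get("content-type") != wanted
    )


sample = """HTTP/2 200
content-type: application/vnd.ipld.dag-cbor
x-proxy-cache: HIT
"""
print(is_mismatched_hit(sample, "application/vnd.ipld.dag-json"))
```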

hacdias commented 3 months ago

I opened a PR: https://github.com/ipshipyard/waterworks-infra/pull/34

hacdias commented 3 months ago

@2color can you check if you still have the problem? My PR has been merged, and I have run curl a few times and can't reproduce it anymore.

2color commented 3 months ago

I'm still able to reproduce the bug, unsurprisingly always when there's a cache HIT. Have the caches been cleared after deploying the fix (specifically for gateway-bank1-fr2 based on the x-ipfs-lb-pop header)?

curl -i -H "Accept: application/vnd.ipld.dag-json" https://ipfs.io/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/

HTTP/2 200
server: openresty
date: Fri, 01 Mar 2024 10:30:20 GMT
content-type: application/vnd.ipld.dag-cbor
content-length: 238
access-control-allow-headers: Content-Type
access-control-allow-headers: Range
access-control-allow-headers: User-Agent
access-control-allow-headers: X-Requested-With
access-control-allow-methods: GET
access-control-allow-methods: HEAD
access-control-allow-methods: OPTIONS
access-control-allow-origin: *
access-control-expose-headers: Content-Length
access-control-expose-headers: Content-Range
access-control-expose-headers: X-Chunked-Output
access-control-expose-headers: X-Ipfs-Path
access-control-expose-headers: X-Ipfs-Roots
access-control-expose-headers: X-Stream-Output
cache-control: public, max-age=29030400, immutable
content-disposition: attachment; filename="bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor"; filename*=UTF-8''bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor
etag: "bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.dag-cbor"
x-content-type-options: nosniff
x-ipfs-path: /ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/
x-ipfs-roots: bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
x-ipfs-pop: ipfs-bank6-fr2
timing-allow-origin: *
x-ipfs-datasize: 238
x-ipfs-lb-pop: gateway-bank1-fr2
x-bfid: 3596c07a59186b43978431b505be496e
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-proxy-cache: HIT
accept-ranges: bytes

Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.

curl -i -H "Accept: application/vnd.ipld.dag-json" https://ipfs.io/ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/

HTTP/2 200
server: openresty
date: Fri, 01 Mar 2024 10:28:51 GMT
content-type: application/vnd.ipld.dag-cbor
content-length: 238
access-control-allow-headers: Content-Type
access-control-allow-headers: Range
access-control-allow-headers: User-Agent
access-control-allow-headers: X-Requested-With
access-control-allow-methods: GET
access-control-allow-methods: HEAD
access-control-allow-methods: OPTIONS
access-control-allow-origin: *
access-control-expose-headers: Content-Length
access-control-expose-headers: Content-Range
access-control-expose-headers: X-Chunked-Output
access-control-expose-headers: X-Ipfs-Path
access-control-expose-headers: X-Ipfs-Roots
access-control-expose-headers: X-Stream-Output
cache-control: public, max-age=29030400, immutable
content-disposition: attachment; filename="bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor"; filename*=UTF-8''bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.cbor
etag: "bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq.dag-cbor"
x-content-type-options: nosniff
x-ipfs-path: /ipfs/bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq/
x-ipfs-roots: bafyreicnokmhmrnlp2wjhyk2haep4tqxiptwfrp2rrs7rzq7uk766chqvq
x-ipfs-pop: ipfs-bank6-fr2
timing-allow-origin: *
x-ipfs-datasize: 238
x-ipfs-lb-pop: gateway-bank1-fr2
x-bfid: 2b77d26838bb6ae7761b256edd345330
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-proxy-cache: HIT
accept-ranges: bytes

hacdias commented 3 months ago

@ns4plabs can you check whether it got properly deployed and the caches were cleared?

hacdias commented 3 months ago

This seems to be fixed.