calpaterson / csvbase

a simple website for sharing table data - with an API
https://csvbase.com
GNU Affero General Public License v3.0

Canonical link header references private hostname #101

Closed Ryman closed 8 months ago

Ryman commented 9 months ago

Description

There may have been a change to the application deployment that has led to a regression in this feature.

Using curl, I'm seeing the response header link: <https://backend/meripaterson/stock-exchanges>, rel="canonical", which references what I assume is a container name rather than the intended external hostname. This may also be why Cloudflare's cache status is constantly cf-cache-status: EXPIRED.

Steps to reproduce

  1. $ curl -v https://csvbase.com/meripaterson/stock-exchanges.csv
  2. Note the link header value

Expected result

It should contain the csvbase.com hostname

Actual result

It references a private hostname of backend
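
The expected/actual check above could be automated roughly as follows. This is only a sketch: canonical_host is a hypothetical helper, not part of csvbase, and the parsing is deliberately loose so that it accepts the header exactly as quoted in this report.

```python
import re
from urllib.parse import urlsplit


def canonical_host(link_header):
    """Return the hostname of the rel="canonical" target, or None if absent.

    Hypothetical helper for illustration; it is not part of csvbase.
    """
    # Each link is written as <URL> followed by parameters such as rel="canonical".
    for match in re.finditer(r'<([^>]*)>([^<]*)', link_header):
        url, params = match.group(1), match.group(2)
        if re.search(r'rel\s*=\s*"?canonical"?', params, re.IGNORECASE):
            return urlsplit(url).hostname
    return None


# Header value as quoted in this report.
observed = '<https://backend/meripaterson/stock-exchanges>, rel="canonical"'
print(canonical_host(observed))  # backend
```

A regression check would then assert that the returned hostname equals csvbase.com for a live response.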

Additional details

I get the same behaviour with and without a .csv file extension on the resource

calpaterson commented 8 months ago

Thank you so much for reporting this problem!

This issue was introduced when I removed Varnish (which was misbehaving) from the prod stack.

The nginx config was not quite right: due to a scoping issue, the Host header was not being set. I've just corrected this and it should now be working for you. Let me know if not.
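
For readers wondering how a missing Host header ends up in the link header: when no proxy_set_header Host directive is in effect, nginx sends the upstream name from proxy_pass ($proxy_host) as the Host header, and proxy_set_header directives are inherited from an outer level only if the current level defines none of its own. The sketch below models the application side under the assumption that it builds absolute URLs from the Host header it receives; canonical_url is a hypothetical function, and csvbase's actual URL-building code is not shown in this thread.

```python
def canonical_url(request_headers, path, scheme="https"):
    """Build an absolute URL from the Host header the application receives.

    Hypothetical model of the behaviour; not csvbase's actual code.
    """
    host = request_headers["Host"]
    return f"{scheme}://{host}{path}"


path = "/meripaterson/stock-exchanges"

# Host as the app would see it if the proxy forwards the upstream name.
print(canonical_url({"Host": "backend"}, path))
# https://backend/meripaterson/stock-exchanges

# Host as the app would see it once the client's original Host is passed through.
print(canonical_url({"Host": "csvbase.com"}, path))
# https://csvbase.com/meripaterson/stock-exchanges
```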

Regarding why Cloudflare always says "EXPIRED" (and why Varnish was removed): csvbase is aiming for an ETags-only caching strategy for data, but there is still some work to do to get it working with intermediate proxies. As it stands, we have them not caching at all, to avoid serving stale data.
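
As a rough illustration of what an ETags-only strategy means in general (not csvbase's implementation): the server tags each response with a validator derived from the data, and a client or proxy revalidates by sending it back in If-None-Match. The function names, the SHA-256-based tag, and the Cache-Control: no-cache header are all assumptions made for this sketch.

```python
import hashlib


def make_etag(body):
    """Derive a strong ETag from the response body (assumed scheme, for illustration)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, payload), answering 304 when the validator matches."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        normalized = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in normalized:
            return 304, {"ETag": etag}, b""
    # no-cache allows storing the response but requires revalidation before reuse.
    return 200, {"ETag": etag, "Cache-Control": "no-cache"}, body


csv_v1 = b"name,country\nLSE,UK\n"
status, headers, payload = respond(csv_v1)
print(status)  # 200

# Revalidating with the ETag from the first response, data unchanged.
print(respond(csv_v1, headers["ETag"])[0])  # 304
```

If the table changes, the derived ETag changes too, so the same If-None-Match value yields a full 200 response with the new data.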

Again, thank you so much.