Open daviewales opened 10 months ago
Here the problem is that Firefox takes a different (than Safari or Google/Edge) take on what's the Content-Length returned from a plain-text resource with gzip encoding.
Chrome:
Firefox:
And we currently assume this Content-Length to be the actual content length of the uncompressed original file.
I am not sure what can be done here, and need to test this in node or other relevant non-browser embeddings.
From what I can tell, it looks like Firefox is following the spec correctly (RFC 9110). (Unless I've misread it.)
In this case, you should be able to get the uncompressed size by reading the ISIZE
field from the gzip header.
See RFC 1952.
Looking at the screenshots above, I can see that content-encoding=gzip in the Firefox headers. However, I can't see that in the Chrome headers. Is it just not visible in the screenshot, or has the data been sent uncompressed to Chrome? This would explain why the content-length is different between Chrome and Firefox.
Also, strangely I can't reproduce the issue in Firefox on my Mac. I'm running Firefox 115 ESR on my Mac.
Chrome is playing man-on-the-middle both on HEAD and GET requests, so basically to the JavaScript-side the fact that there was compression it's not really visible.
Firefox appear to be playing man-on-the-middle only on GET requests, so I still think that while this is very strange, it's Firefox behaviour here to be off. I have digged out this old bug-report: https://bugzilla.mozilla.org/show_bug.cgi?id=986924, but I am not sure.
Also note that if the file is NOT plain text, Chrome and Firefox agree on sending the original length.
I will have to dig more into this.
I also have a Mac, Firefox 121.0.1 (64-bit) gives the error.
I reported what I think it's the Firefox bug on bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=1874840
Thanks!
Looking at your test site, I get this in Firefox 115 ESR:
https://raw.githubusercontent.com/duckdb/duckdb_spatial/main/test/data/nyc_taxi/taxi_zones/taxi_zones.prj
347 using GET
562 using GET's arraybuffer's byteLength
347 using HEAD
https://raw.githubusercontent.com/duckdb/duckdb_spatial/main/test/data/nyc_taxi/taxi_zones/taxi_zones.shx
2204 using HEAD
2204 using GET
2204 using GET's arraybuffer's byteLength
https://raw.githubusercontent.com/duckdb/duckdb_spatial/main/test/data/nyc_taxi/taxi_zones/taxi_zones.shp
1586264 using HEAD
1586264 using GET
1586264 using GET's arraybuffer's byteLength
Is this what you get?
According to this comment: https://bugzilla.mozilla.org/show_bug.cgi?id=1874840#c3 there might be a problem in how Firefox implement the spec, will update on this.
And while looking at this, there is potentially also a Chrome bug in the same area: https://bugs.chromium.org/p/chromium/issues/detail?id=1519411
Bottom line for the moment duckdb-wasm + Firefox will fail on reading some files, I believe this is mostly relevant in the case of Spatial, I will wait a bit to see what's the assessment from the browser, I might need to add special handling around this in our networking code.
What happens?
I tried to run the sample code in DuckDB WASM linked from the latest blog post.
When I clicked this link, I got the error
Invalid Error: missing , or ]
: https://shell.duckdb.org/#queries=v0,INSTALL-spatial~,LOAD-spatial~,CREATE-TABLE-nyc-AS-SELECT-borough%2C-st_union_agg(geom)-AS-full_geom%2C-st_area(full_geom)-AS-area%2C-st_centroid(full_geom)-AS-centroid%2C-count(*)-AS-count-FROM-st_read('https%3A%2F%2Fraw.githubusercontent.com%2Fduckdb%2Fduckdb_spatial%2Fmain%2Ftest%2Fdata%2Fnyc_taxi%2Ftaxi_zones%2Ftaxi_zones.shp')-GROUP-BY-borough~,SELECT-borough%2C-area%2C-centroid%3A%3AVARCHAR%2C-count-FROM-nyc~But this error only occurs in Firefox. It works perfectly in Chrome. I'm using the latest Firefox 121.0.1.
To Reproduce
Click the following link in Firefox:
https://shell.duckdb.org/#queries=v0,INSTALL-spatial~,LOAD-spatial~,SELECT-borough-FROM-st_read('https%3A%2F%2Fraw.githubusercontent.com%2Fduckdb%2Fduckdb_spatial%2Fmain%2Ftest%2Fdata%2Fnyc_taxi%2Ftaxi_zones%2Ftaxi_zones.shp')-limit-1~
Browser/Environment:
Firefox 121.0.1
Device:
HP Probook Laptop
DuckDB-Wasm Version:
1.28.1-dev81.0
DuckDB-Wasm Deployment:
shell.duckdb.org
Full Name:
David Wales
Affiliation:
South Western Sydney Primary Health Network