Closed (agiudiceandrea, closed 4 years ago)
The issue is specific to this particular server. A HEAD request returns an unreliable Content-Length: 0 after following the redirect:
$ curl -I -L https://query.data.world/s/jgsghstpphjhicstradhy5kpjwrnfy
HTTP/1.1 301 Moved Permanently
Date: Sat, 03 Oct 2020 21:27:44 GMT
Content-Length: 0
Connection: keep-alive
Server: nginx
Location: https://download.data.world/file_download/ondata/covid-19-italia-dati-dipartimento-protezione-civile/dpc-covid19-ita-province.csv?auth=eyJhbGciOiJIUzUxMiJ9.eyJzdWIiOiJwcm9kLXVzZXItY2xpZW50OnBpZ3JlY29pbmZpbml0byIsImlzcyI6ImFnZW50OnBpZ3JlY29pbmZpbml0bzo6YjNiYjc5YmItODA0NS00ZmY5LWI0NzAtY2ZhMGVhOTA4NDY3IiwiaWF0IjoxNTgzOTU3NjgxLCJyb2xlIjpbInVzZXIiLCJ1c2VyX2FwaV9hZG1pbiIsInVzZXJfYXBpX3JlYWQiLCJ1c2VyX2FwaV93cml0ZSJdLCJnZW5lcmFsLXB1cnBvc2UiOmZhbHNlLCJ1cmwiOiI2YWU4NWIyMTMwZjBiN2QxNjM4ZDY1MGUxMmIzNjc4ODFkMDJjMmY0In0.--2gOMnWY_iAkI21UBShHWmkmea8xwfro5fWQ22VuPd1XBvKJ2DnEtwGulHYM4H2C64eTtfUnjeBO6jxQMVKhQ
Vary: Origin
HTTP/1.1 200 OK
Date: Sat, 03 Oct 2020 21:27:46 GMT
Content-Type: text/csv
Content-Length: 0
Connection: keep-alive
Server: nginx
Content-Disposition: attachment;filename="dpc-covid19-ita-province.csv"
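To check what size a server reports after redirects, the header dump above can be reduced to the Content-Length of the final response. The following is only a minimal sketch: the function name `final_content_length` is made up for illustration, and it assumes the headers arrive on stdin in the format printed by `curl -I -L`.

```shell
# Print the Content-Length of the last response in a `curl -I -L` header dump
# read from stdin. (Hypothetical helper, not part of curl or GDAL.)
final_content_length() {
    awk '
        # A new status line starts a new response: forget any earlier length.
        /^HTTP\// { len = "" }
        # Header names are case-insensitive; strip the trailing CR that
        # HTTP header lines carry.
        tolower($1) == "content-length:" { sub(/\r$/, "", $2); len = $2 }
        END { if (len != "") print len }
    '
}

# Usage (requires network access):
#   curl -sIL https://query.data.world/s/jgsghstpphjhicstradhy5kpjwrnfy | final_content_length
```

If this prints 0 for a file that is known to be non-empty, the size reported for the HEAD request presumably cannot be trusted.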
Thanks. Successfully tested with the latest build of GDAL 3.1 (d0541ea) from GISInternals (after updating curl-ca-bundle.crt https://github.com/gisinternals/buildsystem/issues/162).
While searching for the source of a QGIS issue (https://github.com/qgis/QGIS/issues/35016), and after testing various QGIS and GDAL versions on Windows and Ubuntu Linux, I found that OGR fails to open a VRT layer file whose remote data source (a CSV file) is hosted on the data.world server if VSI_CACHE is set to TRUE.
A test VRT file is available at https://raw.githubusercontent.com/agiudiceandrea/TEST_CSV/master/covid19-regioni_dw.vrt
On both Windows (7 / 10 with GDAL/OGR 3.0.4 - 3.2.0dev) and Ubuntu Linux (18.04 / 20.04 with GDAL/OGR 2.2.3 - 3.0.4),
if VSI_CACHE is not set or set to FALSE
ogrinfo covid19-regioni_dw.vrt jgsghstpphjhicstradhy5kpjwrnfy -summary
outputs
and QGIS can properly display the layer
while if VSI_CACHE is set to TRUE
ogrinfo covid19-regioni_dw.vrt jgsghstpphjhicstradhy5kpjwrnfy -summary
outputs
and QGIS cannot display the layer.
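The two cases above can be compared side by side by setting VSI_CACHE for a single command only. This is a minimal sketch: the wrapper name `with_vsi_cache` is made up for illustration, and it assumes the option is picked up from the environment, as in the reports above.

```shell
# Run a command with VSI_CACHE set to the given value for that command only.
# (Hypothetical helper; the calling shell's environment is left untouched.)
with_vsi_cache() {
    setting="VSI_CACHE=$1"
    shift
    env "$setting" "$@"
}

# Usage (requires GDAL/OGR and network access):
#   with_vsi_cache FALSE ogrinfo covid19-regioni_dw.vrt jgsghstpphjhicstradhy5kpjwrnfy -summary
#   with_vsi_cache TRUE  ogrinfo covid19-regioni_dw.vrt jgsghstpphjhicstradhy5kpjwrnfy -summary
```

Running both lines against the same VRT file keeps everything else constant, so any difference in the output should come from the cache setting alone.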
The QGIS issue was spotted on Windows because the OSGeo4W QGIS installer sets VSI_CACHE=TRUE and VSI_CACHE_SIZE=1000000 by default. These settings appear to have been introduced 7 years ago (https://github.com/qgis/QGIS/commit/9222f152b30948a3a9d9f66dffd162548022f7fd) to work around a reported OGR performance issue (https://github.com/qgis/QGIS/issues/15688) with reading shapefiles over the network on Windows.