Closed dlksmc closed 8 months ago
Hi @dlksmc, can you provide the public url that I can try to take a look ?
Hi @quentingodeau. Here's a public example:
create or replace secret secret_mtcars (
type azure,
connection_string 'DefaultEndpointsProtocol=https;AccountName=reztest;SharedAccessSignature=sp=rl&st=2024-03-01T03:51:50Z&se=2024-03-01T11:51:50Z&spr=https&sv=2022-11-02&sr=c&sig={sas};EndpointSuffix=core.windows.net'
);
-- Works (with {sas})
select *
from 'az://$web/mtcars.parquet/**/*.parquet'
-- Works
select *
from 'https://reztest.z13.web.core.windows.net/mtcars.parquet/am=0/gear=3/part-0.parquet'
-- Doesn't work
select *
from 'https://reztest.z13.web.core.windows.net/mtcars.parquet/**/*.parquet'
hi @dlksmc
one of the problems here is that you are trying to do a glob on an http url, this can not work since duckdb will not know how to do file listing over raw http. You need to use az://
urls for that.
I'm able to do:
CREATE SECRET secret2 (
· TYPE AZURE,
· PROVIDER CONFIG,
· ACCOUNT_NAME '<acc>'
‣ );
then query a public file:
FROM 'az://bucket/file.csv'
alternatively, using fully qualified urls we can now do:
from 'az://<account>.blob.core.windows.net/<container>/file.csv';
However, globbing without authentication is not possible atm, and i'm not sure thats even possible:
Yes after some additional check I think you will not be able to do that but you can still generates a SAS key with read permission, like this it does as if you have public data and you can control the permission
Thanks for your responses, @quentingodeau and @samansmink. It turns out I needed to change the access level for the container to allow anonymous read access:
I thought enabling the $web
container already did that, but apparently not. After updating the access, the secret2
version where you just provide the account_name
works!
Thanks for the great extension! I am using Azure Storage to host a static website. I have data stored within the public
$web
container (where the html is hosted) that I can successfully query in DuckDB using aconnection_string
containing anAccountKey
orSharedAccessSignature
(as described in the extensions docs):I would like to query the data without authentication though. I've tried removing the
AccountKey
orSharedAccessSignature
portion of thesecret
config above, and tried the config suggested in the docs for authentication-less access:Unfortunately, both attempts result in an error:
NoAuthenticationInformationReason Phrase: Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
.I've also tried using
https://
instead ofaz://
. Without establishing anysecret
,from https://{storage_account}.web.core.windows.net/my_table.parquet/partition_1=1/partition_2=hello/part-0.parquet'
works. Butfrom 'https://{storage_account}.web.core.windows.net/my_table.parquet/**/*.parquet'
gives a 404 error (the requested content does not exist). In other words, I'm unable to figure out how to read the full Parquet with Hive-style partitions when usinghttps://
.Am I doing something wrong, or is this not supported?