delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

Azure adls gen2 SAS token #1802

Open ritwik-singh opened 1 year ago

ritwik-singh commented 1 year ago

Hello everyone, I am working on a project where we are trying to save a table using Delta Lake on ADLS Gen2 with a SAS token. Is this even possible? I am not able to find any documentation for it. The docs only cover OAuth for ADLS Gen2 for now. Also, I am not using Databricks. It is plain PySpark, Delta Lake, and ADLS Gen2.

If not, can I use a shared key instead? If so, how can I do it?

vkorukanti commented 1 year ago

I haven't tried, but here is a pointer I found. Based on this you need the following conf:

spark.conf.set("fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net", "SAS")
spark.conf.set("fs.azure.sas.token.provider.type.<storage-account>.dfs.core.windows.net", "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider")
spark.conf.set("fs.azure.sas.fixed.token.<storage-account>.dfs.core.windows.net", "<token>")
spark.read.csv("abfss://<container>@<storage-account>.dfs.core.windows.net/<path>/<file>.csv")

Could you try passing the above configs when starting Spark shell with Delta libs?
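
For reference, here is a rough sketch of what passing those configs at startup could look like in plain PySpark. It is untested against a real storage account. The helper names `sas_hadoop_conf` and `build_session` are made up for illustration, and it assumes the Delta and hadoop-azure jars are already on the classpath and that the hadoop-azure version in use actually ships `FixedSASTokenProvider`. The `spark.hadoop.` prefix is the standard way to hand Hadoop filesystem settings to Spark when the session is created.

```python
def sas_hadoop_conf(storage_account: str, sas_token: str) -> dict:
    """Build the three ABFS settings from the comment above for one account.

    Hypothetical helper: only the config keys and the provider class name
    come from the thread; the function itself is illustrative.
    """
    host = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{host}": "SAS",
        f"fs.azure.sas.token.provider.type.{host}":
            "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
        f"fs.azure.sas.fixed.token.{host}": sas_token,
    }


def build_session(storage_account: str, sas_token: str, app_name: str = "delta-adls-sas"):
    """Create a Delta-enabled SparkSession with the SAS settings applied.

    Assumption: the Delta Lake and hadoop-azure jars are supplied separately
    (for example via --packages); nothing here downloads them.
    """
    # Imported lazily so the config helper above works without PySpark installed.
    from pyspark.sql import SparkSession

    builder = (
        SparkSession.builder.appName(app_name)
        # Standard Delta Lake session settings.
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config(
            "spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog",
        )
    )
    # Keys prefixed with "spark.hadoop." are copied into the Hadoop configuration.
    for key, value in sas_hadoop_conf(storage_account, sas_token).items():
        builder = builder.config(f"spark.hadoop.{key}", value)
    return builder.getOrCreate()


# Example usage (placeholders must be filled in before running):
# spark = build_session("<storage-account>", "<token>")
# path = "abfss://<container>@<storage-account>.dfs.core.windows.net/<path>"
# spark.range(5).write.format("delta").save(path)
```

If the write fails with a class-not-found error for the token provider, that would suggest the hadoop-azure version on the classpath does not include it.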