developmentseed / obstore

Simple, fast integration with Amazon S3, Google Cloud Storage, Azure Storage, and S3-compliant APIs like Cloudflare R2
https://developmentseed.org/obstore
MIT License
147 stars, 3 forks

error with Fabric #57

Open djouallah opened 4 weeks ago

djouallah commented 4 weeks ago

Using this code:

import configparser

import obstore as obs
from obstore.store import AzureStore

# Read the service principal credentials from a local INI file
config = configparser.ConfigParser()
config.read("C:/KV/variable.ini")

store = AzureStore.from_url(
    url="abfss://taxi@onelake.dfs.fabric.microsoft.com/NYT.Lakehouse/",
    config={
        "tenant_id": config.get("myvars", "tenantId"),
        "client_secret": config.get("myvars", "secret"),
        "client_id": config.get("myvars", "appId"),
        "use_fabric_endpoint": "True",
    },
)

# Print the first entry of the first listing batch
stream = obs.list(store)
for list_result in stream:
    print(list_result[0])
    break

I am getting this error

GenericError                              Traceback (most recent call last)
Cell In[23], line 2
      1 stream = obs.list(store)
----> 2 for list_result in stream:
      3     print(list_result[0])
      4     break

GenericError: Generic MicrosoftAzure error: Error performing list request: Client error with status 400 Bad Request: <?xml version="1.0" encoding="utf-8"?>
<Error>
  <Code>BadRequest</Code>
  <Message>Either Workspa

I know the credentials are fine, as they work with DuckDB.

kylebarron commented 3 weeks ago

Hmm. I've tested this so far on AWS and GCS but not yet on Azure.

The underlying Rust crate is pretty well tested so I assume that Azure can work, and it's just a matter of validating the input correctly and documenting what needs to be passed.

I'm not quite sure the best way to debug this.

Do you know of any docs or examples in other libraries of connecting to fabric?

djouallah commented 3 weeks ago

Maybe I can answer my own question, using delta_rs, which depends on the object store crate:

from azure.identity import ClientSecretCredential

credential = ClientSecretCredential(
    client_id=config.get("myvars", "appId"),
    client_secret=config.get("myvars", "secret"),
    tenant_id=config.get("myvars", "tenantId"),
)
# Request a bearer token scoped to Azure Storage
token = credential.get_token("https://storage.azure.com/.default").token
storage_options = {"bearer_token": token, "use_fabric_endpoint": "true"}

Let me check again with the token approach.

djouallah commented 3 weeks ago

Same error:

GenericError                              Traceback (most recent call last)
Cell In[7], line 2
      1 stream = obs.list(store)
----> 2 for list_result in stream:
      3     print(list_result[0])
      4     break

GenericError: Generic MicrosoftAzure error: Error performing list request: Client error with status 400 Bad Request: <?xml version="1.0" encoding="utf-8"?>
<Error>
  <Code>BadRequest</Code>
  <Message>Either WorkspaceId or ArtifactId are missing in the request</Message>
</Error>

kylebarron commented 3 weeks ago

using delta_rs which depends on object store crate

Ok, so if it works with delta_rs, then there is some configuration that will work here, and we just have to figure out what it is and document it.

  <Message>Either WorkspaceId or ArtifactId are missing in the request</Message>

Seems like that's a clue to what's not getting set correctly.
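
To make that clue concrete, here is a minimal sketch that splits the URL from the original report into the pieces the error message names. It assumes the usual OneLake layout, where the part before the `@` is the workspace and the first path segment is the item (artifact). The helper name `split_onelake_url` is hypothetical and not part of obstore; it only uses the standard library.

```python
from urllib.parse import urlsplit


def split_onelake_url(url: str) -> dict:
    """Split an abfss:// OneLake URL into workspace, host, artifact, and path.

    Assumption: the URL follows
    abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<itemtype>/<path>
    """
    parts = urlsplit(url)
    # Drop empty segments produced by leading and trailing slashes
    segments = [s for s in parts.path.split("/") if s]
    return {
        "workspace": parts.username or "",
        "host": parts.hostname or "",
        "artifact": segments[0] if segments else "",
        "path": "/".join(segments[1:]),
    }


info = split_onelake_url(
    "abfss://taxi@onelake.dfs.fabric.microsoft.com/NYT.Lakehouse/"
)
print(info)
```

If the store keeps only the workspace and drops the URL path, the list request would go out without an artifact, which would match the error. One thing to try (untested here) is passing the artifact explicitly as the `prefix` argument to `obs.list`.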