Azure / spark-cdm-connector

MIT License

Spark CDM connector v1.19.2 throws "Manifest does not exists" message #90

Closed: asharma1992 closed this 2 years ago

asharma1992 commented 2 years ago

We are currently migrating from connector v0.19.1 to Spark CDM connector v1.19.2 (for Spark 3). When writing a DataFrame to a folder on ADLS, the connector does not create the manifest or the CDM corpus folders.

It logs the following message: "Manifest doesn't exist: Map-Entity.manifest.cdm.json"

asharma1992 commented 2 years ago

Additional info: here is the snippet of code that we use. It works correctly with the v0.19.1 connector:

(df.limit(0).write.format("com.microsoft.cdm")
            .option("storage", storage_account_name + '.dfs.core.windows.net')
            .option("manifestPath", manifest_path)
            .option("entity", entity_name)
            .option("appId", app_id)
            .option("appKey", app_key)
            .option("tenantId", tenant_id)
            .mode(write_mode)
            .option("columnHeaders", False)
            .option("format", "parquet")
            .option("compression", "gzip")
            .save())

This write runs from Databricks. Is the known credential passthrough issue causing this, and if so, what is the workaround?
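
For anyone comparing the two connector versions, the write call above can be sketched with its options collected in one dict, so the exact same settings are passed to v0.19.1 and v1.19.2. This is only a minimal sketch: the helper names `cdm_write_options` and `write_cdm` are hypothetical and not part of the connector, and the option keys and values are copied from the snippet above. It does not address the missing manifest itself.

```python
def cdm_write_options(storage_account_name, manifest_path, entity_name,
                      app_id, app_key, tenant_id):
    """Collect the connector options from the snippet above into one dict.

    Hypothetical helper: the keys are the option names used in this issue,
    nothing here is defined by the connector itself.
    """
    return {
        "storage": storage_account_name + ".dfs.core.windows.net",
        "manifestPath": manifest_path,
        "entity": entity_name,
        "appId": app_id,
        "appKey": app_key,
        "tenantId": tenant_id,
        "columnHeaders": False,
        "format": "parquet",
        "compression": "gzip",
    }


def write_cdm(df, options, write_mode):
    """Write a Spark DataFrame through the CDM connector (hypothetical wrapper)."""
    (df.write.format("com.microsoft.cdm")
       .options(**options)   # same key/value pairs as the chained .option() calls
       .mode(write_mode)
       .save())
```

Usage would look like `write_cdm(df.limit(0), cdm_write_options(storage_account_name, manifest_path, entity_name, app_id, app_key, tenant_id), write_mode)`, with the variables defined as in the original snippet. Printing the dict (minus `appKey`) before the write makes it easy to confirm which `manifestPath` value the connector actually receives.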

asharma1992 commented 2 years ago

@bissont Is there any timeline for when this can be fixed? We are stuck with the Spark 2 connector, and Databricks Spark 2 clusters are reaching end of service.

TissonMathew commented 2 years ago

Spark 2 is EOL.