Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.
MIT License

Azure Cosmos Emulator - Can i use spark3 connector for mongo db api or sql api #33317

Closed WillianMattosRibeiro closed 1 year ago

WillianMattosRibeiro commented 1 year ago

I tried to read data with the connector from a Docker container running the Cosmos DB emulator, without success. Is it possible?


ghost commented 1 year ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @kushagraThapar, @TheovanKraay

TheovanKraay commented 1 year ago

First, this connector only works with the core NoSQL (SQL) API. Also, to connect to the emulator with any Java-based connector, you need to export the emulator's certificates. Did you do this already? See https://learn.microsoft.com/en-us/azure/cosmos-db/local-emulator-export-ssl-certificates. If you have, please post a comment with more information about the error.
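For readers following along, the export step could be sketched in Python as below. This is an assumption-laden illustration, not the method from the linked docs: it uses the standard library's `ssl.get_server_certificate` to read whatever certificate the emulator presents, and the host, port 8081 (the port mentioned later in this thread) and the file name `emulatorcert.crt` are placeholders.

```python
import ssl
from pathlib import Path


def fetch_emulator_certificate(host="localhost", port=8081):
    """Return the PEM text of the certificate served at host:port.

    Without a ca_certs argument, ssl.get_server_certificate does not
    validate the certificate, which suits a self-signed emulator cert.
    """
    return ssl.get_server_certificate((host, port))


def save_certificate(pem_text, path):
    """Write the PEM text to disk and return the resulting Path."""
    target = Path(path)
    target.write_text(pem_text)
    return target
```

With the emulator running, `save_certificate(fetch_emulator_certificate(), "emulatorcert.crt")` would write the certificate to a file that can then be imported into a Java trust store.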

WillianMattosRibeiro commented 1 year ago

I’ve tried to connect via the SQL API and via the Mongo API. In both cases I got stuck on how to set up the certificates.

Do you have any docs for the spark3 connector? The only one I found was https://github.com/Azure/azure-cosmosdb-spark, but it is for spark2, which led me to this one: https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-spark_3_2-12/docs/quick-start.md

I found these instructions:

@.***

But judging by the suggested URL, it seems this config is meant for cloud Cosmos DB.

The emulator runs on my localhost, and I can already use it with the Python lib:

@.***

In this lib I connect on port 8081, and I can set the parameter “connection_verify = False” as shown below:

@.***

Then I tried to use the same logic to connect via the Spark connector:

@.***

I could not find any docs for this connector listing the possible parameters, for example one to set the certificate path the same way I do in the Python lib. This code works in Python because I can set the certificates:

@.***

How can I do that in the Spark connector?

(Sent by email in reply to Theo van Kraay's comment of Monday, 6 February 2023, 15:03.)

TheovanKraay commented 1 year ago

I mentioned the docs above. The section on using the certificate with Java apps is here: https://learn.microsoft.com/en-us/azure/cosmos-db/local-emulator-export-ssl-certificates#use-the-certificate-with-java-apps. This is required for the Spark Connector because it depends on the Java SDK. You are right that we do not have any documentation specifically on a local Docker setup with Spark and the emulator, but all of the instructions would be the same, apart from exporting the certs and, of course, installing the Spark connector in your local Spark environment.
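As a rough illustration of the "use the certificate with Java apps" step, the sketch below builds a `keytool -importcert` command that would add an exported emulator certificate to a Java trust store. The file names, the alias and the store password are hypothetical placeholders, and the command is only printed, not executed.

```python
import shlex


def keytool_import_command(cert_path, truststore_path, store_password,
                           alias="cosmos-emulator"):
    """Build the argument list for importing a certificate with keytool.

    The alias and all paths are placeholders for illustration only.
    """
    return [
        "keytool", "-importcert",
        "-alias", alias,
        "-file", cert_path,            # exported emulator certificate
        "-keystore", truststore_path,  # trust store the JVM will read
        "-storepass", store_password,  # password of the trust store itself
        "-noprompt",
    ]


if __name__ == "__main__":
    cmd = keytool_import_command(
        "emulatorcert.crt", "emulator-truststore.jks", "changeit")
    # Print a copy-pasteable shell command instead of running it.
    print(shlex.join(cmd))
```

The resulting trust store, not the raw certificate file, is what the JVM's `javax.net.ssl.trustStore` property is expected to point at.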

WillianMattosRibeiro commented 1 year ago

I’ve tried this way:

pyspark --packages com.azure.cosmos.spark:azure-cosmos-spark_3-2_2-12:4.15.0 --driver-java-options "-Djavax.net.ssl.trustStore=certificados/key.crt -Djavax.net.ssl.trustStorePassword='C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=='"

from pyspark.sql.functions import lit
cosmosEndpoint = "https://localhost:8081/"
cosmosMasterKey = "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw=="
cosmosDatabaseName = "SampleDB"
cosmosContainerName = "Persons"

cfg = {
    "spark.cosmos.accountEndpoint": cosmosEndpoint,
    "spark.cosmos.accountKey": cosmosMasterKey,
    "spark.cosmos.database": cosmosDatabaseName,
    "spark.cosmos.container": cosmosContainerName,
}

spark.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", cosmosEndpoint)
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", cosmosMasterKey)

query = "select * from cosmosCatalog.{}.{};".format(cosmosDatabaseName, cosmosContainerName)

df_people = spark.sql(query)
df_people.show()

But got this error:

@.***

Am I doing something wrong, or does the connector just not support this type of connection? Can you send me a sample?

(Sent by email in reply to Theo van Kraay's comment of Monday, 6 February 2023, 15:49.)
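No sample was posted in the thread, so the following is only a sketch of what one might look like. Two assumptions are worth flagging. First, the JVM property `javax.net.ssl.trustStore` expects a key store file and `javax.net.ssl.trustStorePassword` expects that store's password, so the command above, which passes a `.crt` file and the account key, may be failing for that reason. Second, the trust store path and password below are hypothetical placeholders. The package coordinate and the `spark.cosmos.*` keys are copied from the thread; `cosmos.oltp` is the data source name used in the connector's quick start.

```python
import shlex

# Package coordinate copied from the command earlier in this thread.
CONNECTOR_PACKAGE = "com.azure.cosmos.spark:azure-cosmos-spark_3-2_2-12:4.15.0"


def pyspark_command(truststore_path, store_password):
    """Build a pyspark launch command that points the JVM at a trust store."""
    jvm_opts = (
        f"-Djavax.net.ssl.trustStore={truststore_path} "
        f"-Djavax.net.ssl.trustStorePassword={store_password}"
    )
    return [
        "pyspark",
        "--packages", CONNECTOR_PACKAGE,
        "--driver-java-options", jvm_opts,
        # Executors are separate JVMs on a cluster; harmless in local mode.
        "--conf", f"spark.executor.extraJavaOptions={jvm_opts}",
    ]


def cosmos_config(endpoint, key, database, container):
    """Return the connector options used in the thread's cfg dictionary."""
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
    }


def read_container(spark, cfg):
    """Load the configured container as a DataFrame via the connector."""
    return spark.read.format("cosmos.oltp").options(**cfg).load()


if __name__ == "__main__":
    # Placeholder trust store path and password, for illustration only.
    print(shlex.join(pyspark_command("emulator-truststore.jks", "changeit")))
```

Inside the resulting PySpark shell, `read_container(spark, cosmos_config("https://localhost:8081/", "<emulator key>", "SampleDB", "Persons")).show()` would attempt the read, where `<emulator key>` stands in for the emulator's account key.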

TheovanKraay commented 1 year ago

No error given above and guidance already given. Closing due to age of issue. If this is critical please open a support case.