Azure / azure-cosmosdb-spark

Apache Spark Connector for Azure Cosmos DB
MIT License

Spark 3.0 builds #405

Closed alexnoox closed 3 years ago

alexnoox commented 4 years ago

Hi,

Are you still planning to release builds for Spark 3.0 and Scala 2.12?

Thank you.

jenden commented 4 years ago

+1 it would be great to have support for Spark 3.0. This is the last library holding me back from making the change.

chinwobble commented 4 years ago

+1 We would like to have Spark 3 support as well.

marycar commented 4 years ago

+1 Hi, is there any news on the release date of the Cosmos DB connector for Spark 3.0?

bigdatamoore commented 4 years ago

+1 on Spark 3 updates

mkulisic commented 3 years ago

+1 this would be great.

todddube commented 3 years ago

We are looking for this as well. Any update on this?

FabianMeiswinkel commented 3 years ago

We are planning to release a new Connector supporting Spark 3.0 before Databricks stops supporting Spark 2.* in their LTS Runtime 6.4 at the end of April. We will also release a preview in late February or early March.

PerFlodinVolvo commented 3 years ago

That is very good news, @FabianMeiswinkel. Will this new Connector be backwards compatible with the current Connector? That is, if an application is using the current (latest) Connector, will moving to the Spark 3.0 Connector require no code changes?

todddube commented 3 years ago

This is great! TY for posting the update, @FabianMeiswinkel.

laurencewells commented 3 years ago

@FabianMeiswinkel any update on the preview for Spark 3?

kandelsiva commented 3 years ago

We cloned the 3.0 branch locally and used the resulting jar to run on Spark 3.0. We are facing performance issues: ~1.5M upsert operations took about 1 hr with 2.4, but take about 5 hrs with 3.0.

Are there any known performance issues with the 3.0 branch?

mahammada commented 3 years ago

Hi Team,

Could you please tell us when we can expect a Cosmos DB Spark connector jar that supports Spark 3.x? Databricks Runtime 6.4 reaches end of support on April 1st, 2021, but there is still no Cosmos DB connector available for Spark 3.x. If it is already available, please let us know.

Note: We are currently in the process of moving our build to DBR 7.4, which supports Spark 3.x.

Thank you in anticipation!!

Best Regards, Mahammad Khan +91-9885360726

prabgemini commented 3 years ago

Hi @FabianMeiswinkel, any update on this? Thanks.

gmvg commented 3 years ago

Any update on this? We rely on this dependency too.

kk921dbg commented 3 years ago

Any update on this? Has it been released yet, in preview or GA? We rely on this as well.

FabianMeiswinkel commented 3 years ago

The new Cosmos DB Spark connector has been released. The Maven coordinates (which can be used to install the connector in Databricks) are "com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.0.0"
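As a rough illustration of what the coordinates above encode, the following Python sketch splits them into group, artifact, and version, and reads the Spark and Scala versions out of the artifact name. It assumes the `_<spark major>-<minor>_<scala major>-<minor>` suffix is a stable naming convention; `parse_coordinate` and `ConnectorCoordinate` are hypothetical helpers written for this example, not part of the connector.

```python
import re
from typing import NamedTuple


class ConnectorCoordinate(NamedTuple):
    """Hypothetical container for the parts of a Maven coordinate."""
    group_id: str
    artifact_id: str
    version: str
    spark_version: str
    scala_version: str


def parse_coordinate(coordinate: str) -> ConnectorCoordinate:
    """Split 'group:artifact:version' and decode the artifact suffix.

    Assumption: the artifact is named
    azure-cosmos-spark_<sparkMajor>-<sparkMinor>_<scalaMajor>-<scalaMinor>.
    """
    group_id, artifact_id, version = coordinate.split(":")
    match = re.fullmatch(
        r"azure-cosmos-spark_(\d+)-(\d+)_(\d+)-(\d+)", artifact_id
    )
    if match is None:
        raise ValueError(f"Unexpected artifact name: {artifact_id}")
    spark_major, spark_minor, scala_major, scala_minor = match.groups()
    return ConnectorCoordinate(
        group_id=group_id,
        artifact_id=artifact_id,
        version=version,
        spark_version=f"{spark_major}.{spark_minor}",
        scala_version=f"{scala_major}.{scala_minor}",
    )


coord = parse_coordinate(
    "com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.0.0"
)
print(coord.spark_version, coord.scala_version)  # prints "3.1 2.12"
```

Comparing these values against the cluster's Spark and Scala versions is a quick sanity check before installing the library.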

The source code for the new connector is located here: https://github.com/Azure/azure-sdk-for-java/tree/master/sdk/cosmos/azure-cosmos-spark_3-1_2-12

A migration guide to change applications which used the Spark 2.4 connector is located here: https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/migration.md

The quick start introduction: https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/quick-start.md

Config reference: https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/configuration-reference.md

End-to-end samples: https://github.com/Azure/azure-sdk-for-java/blob/master/sdk/cosmos/azure-cosmos-spark_3-1_2-12/Samples/Python/NYC-Taxi-Data/01_Batch.ipynb

balajikaadi commented 3 years ago

Hi Fabian, I am getting the error below after adding the dependency to the pom file and running my application in Azure Databricks:

Azure CosmosDB load failed ; Error: Failed to find data source: cosmos.oltp. Please find packages at http://spark.apache.org/third-party-projects.html
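For context, a minimal PySpark sketch of a read through the `cosmos.oltp` data source is shown here. Spark generally reports "Failed to find data source" when the class backing a format name is not on the classpath, so one thing worth checking (an assumption about this report, not a confirmed diagnosis) is whether the connector is installed as a library on the cluster itself rather than only declared in the application's pom. The endpoint, key, database, and container values are placeholders, and `build_cosmos_config` and `read_container` are hypothetical helpers for this example.

```python
from typing import Dict


def build_cosmos_config(
    endpoint: str, key: str, database: str, container: str
) -> Dict[str, str]:
    """Assemble connector options using the spark.cosmos.* setting names."""
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
    }


def read_container(spark, config: Dict[str, str]):
    """Load a container as a DataFrame via the cosmos.oltp format.

    `spark` is an existing SparkSession. This call only succeeds when the
    connector jar is available to the cluster.
    """
    return spark.read.format("cosmos.oltp").options(**config).load()


# Placeholder values (hypothetical); replace with real account details.
config = build_cosmos_config(
    endpoint="https://<account>.documents.azure.com:443/",
    key="<account-key>",
    database="<database>",
    container="<container>",
)
print(sorted(config))
```

In a Databricks notebook, `read_container(spark, config)` would then be called with the session that the notebook provides.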