Azure / azure-sqldb-spark

This project provides a client library that allows Azure SQL DB or SQL Server to act as an input source or output sink for Spark jobs.

Use SBT and Spark 2.4 for cross-building for Scala 2.12 #61

Closed · nightscape closed this pull request 4 years ago

nightscape commented 4 years ago

This PR replaces the Maven build with an SBT one in order to adhere to Scala standards and facilitate cross-building.
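For context, sbt cross-building normally hinges on a `crossScalaVersions` setting: prefixing a task with `+` (e.g. `sbt +package`) re-runs it once per listed Scala version. A minimal sketch of what such a build definition looks like (the version numbers here are illustrative, not necessarily what this PR pins):

```scala
// build.sbt -- minimal cross-building sketch; the PR's actual build
// definition may differ in settings and pinned versions.
name := "azure-sqldb-spark"

scalaVersion := "2.11.12"                       // version used by a plain `sbt compile`
crossScalaVersions := Seq("2.11.12", "2.12.10") // versions built by `sbt +compile`

// %% appends the Scala binary suffix (_2.11 / _2.12) to the artifact name,
// so each cross-build resolves the matching Spark artifact.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.5" % Provided
```

With that in place, `sbt +package` emits one jar per Scala version, e.g. `azure-sqldb-spark_2.11-*.jar` and `azure-sqldb-spark_2.12-*.jar`.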

Samuman93 commented 4 years ago

Hello, I'm trying to use this connector but I am unable to compile it with sbt. I need this version because I am using Scala 2.12. Would you kindly help me?

nightscape commented 4 years ago

@Samuman93 you need to be more specific about what's not working. Please describe what you were doing and what errors occurred.

Samuman93 commented 4 years ago

I cloned the repo with git, and when I run `sbt compile` or `sbt package` the console outputs:

```
$ sbt compile
[warn] No sbt.version set in project/build.properties, base directory: C:\mypath\spark\azure-sqldb-spark
[info] Loading global plugins from C:\mypath\.sbt\1.0\plugins
[info] Set current project to azure-sqldb-spark (in build file:/C:/mypath/spark/azure-sqldb-spark/)
[info] Executing in batch mode. For better performance use sbt's shell
[info] Compiling 13 Scala sources and 4 Java sources to C:\mypath\spark\azure-sqldb-spark\target\scala-2.12\classes ...
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\LoggingTrait.scala:25:12: object slf4j is not a member of package org
[error] import org.slf4j.{Logger, LoggerFactory}
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\LoggingTrait.scala:31:33: not found: type Logger
[error] @transient private var log : Logger = null // scalastyle:ignore
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\LoggingTrait.scala:40:22: not found: type Logger
[error] protected def log: Logger = {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\LoggingTrait.scala:43:14: not found: value LoggerFactory
[error] log = LoggerFactory.getLogger(logName)
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\config\Config.scala:26:12: object apache is not a member of package org
[error] import org.apache.spark.sql.SparkSession
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\config\Config.scala:27:12: object apache is not a member of package org
[error] import org.apache.spark.{SparkConf, SparkContext}
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\config\Config.scala:189:27: not found: type SparkContext
[error] def apply(sparkContext: SparkContext): Config = apply(sparkContext.getConf)
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\config\Config.scala:223:24: not found: type SparkConf
[error] def apply(sparkConf: SparkConf, options: collection.Map[String, String]): Config =
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\config\Config.scala:211:24: not found: type SparkConf
[error] def apply(sparkConf: SparkConf): Config = apply(sparkConf, Map.empty[String, String])
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\config\Config.scala:200:27: not found: type SparkSession
[error] def apply(sparkSession: SparkSession): Config = apply(sparkSession.sparkContext.getConf)
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\config\Config.scala:274:37: not found: type SparkConf
[error] def getOptionsFromConf(sparkConf: SparkConf): collection.Map[String, String] =
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:27:12: object apache is not a member of package org
[error] import org.apache.spark.sql._
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:67:76: not found: type Row
[error] implicit def toDataFrameFunctions[T](ds: Dataset[T]): DataFrameFunctions[Row] = DataFrameFunctions[Row](ds.toDF())
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:67:44: not found: type Dataset
[error] implicit def toDataFrameFunctions[T](ds: Dataset[T]): DataFrameFunctions[Row] = DataFrameFunctions[Row](ds.toDF())
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:55:51: not found: type DataFrameWriter
[error] implicit def toDataFrameWriterFunctions(writer: DataFrameWriter[_]): DataFrameWriterFunctions =
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:43:51: not found: type DataFrameReader
[error] implicit def toDataFrameReaderFunctions(reader: DataFrameReader): DataFrameReaderFunctions =
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameFunctions.scala:32:12: object apache is not a member of package org
[error] import org.apache.spark.sql.{DataFrame, Row}
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameFunctions.scala:39:71: not found: type DataFrame
[error] private[spark] case class DataFrameFunctions[T](@transient dataFrame: DataFrame) extends LoggingTrait {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameFunctions.scala:97:59: not found: type Row
[error] private def bulkCopy(config: Config, iterator: Iterator[Row], metadata: BulkCopyMetadata): Unit = {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\java\com\microsoft\azure\sqldb\spark\bulkcopy\SQLServerBulkDataFrameFileRecord.java:28:8: object apache is not a member of package org
[error] import org.apache.spark.sql.Row;
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\java\com\microsoft\azure\sqldb\spark\bulkcopy\SQLServerBulkDataFrameFileRecord.java:48:54: not found: type Row
[error] public SQLServerBulkDataFrameFileRecord(Iterator<Row> iterator, BulkCopyMetadata metadata) {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameReaderFunctions.scala:30:12: object apache is not a member of package org
[error] import org.apache.spark.sql.{DataFrame, DataFrameReader}
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameReaderFunctions.scala:35:71: not found: type DataFrameReader
[error] private[spark] case class DataFrameReaderFunctions(@transient reader: DataFrameReader) extends LoggingTrait {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameReaderFunctions.scala:43:34: not found: type DataFrame
[error] def sqlDB(readConfig: Config): DataFrame = {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameReaderFunctions.scala:59:66: not found: type DataFrame
[error] def sqlDB(url: String, table: String, properties: Properties): DataFrame = {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameReaderFunctions.scala:72:93: not found: type DataFrame
[error] def sqlDB(url: String, table: String, predicates: Array[String], properties: Properties): DataFrame = {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameReaderFunctions.scala:89:76: not found: type DataFrame
[error] upperBound: Long, numPartitions: Int, properties: Properties): DataFrame = {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameWriterFunctions.scala:31:12: object apache is not a member of package org
[error] import org.apache.spark.sql.DataFrameWriter
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\DataFrameWriterFunctions.scala:36:71: not found: type DataFrameWriter
[error] private[spark] case class DataFrameWriterFunctions(@transient writer: DataFrameWriter[_]) extends LoggingTrait {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:26:12: object apache is not a member of package org
[error] import org.apache.spark.annotation.DeveloperApi
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:42:4: not found: type DeveloperApi
[error] @DeveloperApi
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:54:4: not found: type DeveloperApi
[error] @DeveloperApi
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:66:4: not found: type DeveloperApi
[error] @DeveloperApi
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\connect\package.scala:67:102: not found: type Row
[error] implicit def toDataFrameFunctions[T](ds: Dataset[T]): DataFrameFunctions[Row] = DataFrameFunctions[Row](ds.toDF())
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\query\QueryFunctions.scala:31:12: object apache is not a member of package org
[error] import org.apache.spark.sql.{DataFrame, SQLContext}
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\query\QueryFunctions.scala:36:65: not found: type SQLContext
[error] private[spark] case class QueryFunctions(@transient sqlContext: SQLContext) extends LoggingTrait {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\query\QueryFunctions.scala:45:42: not found: type DataFrame
[error] def sqlDBQuery(config: Config): Either[DataFrame, Boolean] = {
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\query\package.scala:27:12: object apache is not a member of package org
[error] import org.apache.spark.sql.SQLContext
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\query\package.scala:43:45: not found: type SQLContext
[error] implicit def toQueryFunctions(sqlContext: SQLContext): QueryFunctions = QueryFunctions(sqlContext)
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\query\package.scala:26:12: object apache is not a member of package org
[error] import org.apache.spark.annotation.DeveloperApi
[error] ^
[error] C:\mypath\spark\azure-sqldb-spark\src\main\scala\com\microsoft\azure\sqldb\spark\query\package.scala:42:4: not found: type DeveloperApi
[error] @DeveloperApi
[error] ^
[error] 41 errors found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 4 s, completed 05-may-2020 12:01:40
```
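
Two details in this log suggest sbt never saw a build definition at all: the initial `[warn] No sbt.version set in project/build.properties`, and the fact that every third-party root package (`org.slf4j`, `org.apache`) is missing from the compile classpath, which is what happens when sbt runs in a directory without a `build.sbt` and falls back to a bare default project. That is consistent with having cloned the default (Maven-based) branch rather than this PR's branch; the PR branch can typically be fetched with `git fetch origin pull/61/head:pr-61` followed by `git checkout pr-61`. For illustration, a hedged sketch of the build pieces sbt would need (the names and versions below are assumptions, not the PR's exact content):

```scala
// project/build.properties would pin the sbt version to silence the warning
// above, e.g. a single line:  sbt.version=1.3.10
//
// build.sbt -- sketch of the dependencies whose absence produces the
// "object slf4j is not a member of package org" and
// "object apache is not a member of package org" errors in the log.
scalaVersion := "2.12.10"

libraryDependencies ++= Seq(
  // Spark is needed to compile but is supplied by the cluster at runtime
  "org.apache.spark" %% "spark-sql" % "2.4.5" % Provided,
  // LoggingTrait imports org.slf4j.{Logger, LoggerFactory}; spark-sql pulls
  // slf4j in transitively, it is listed here only for clarity
  "org.slf4j" % "slf4j-api" % "1.7.30",
  // JDBC driver the connector performs bulk copy through (coordinates assumed)
  "com.microsoft.sqlserver" % "mssql-jdbc" % "6.4.0.jre8"
)
```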

Samuman93 commented 4 years ago

Any news on this?

arvindshmicrosoft commented 4 years ago

Thank you @nightscape for your contribution. Unfortunately, as this project is not actively maintained, we will not be able to merge this in. The newer connector here already uses SBT. It would be great if you could evaluate, use, and hopefully contribute to that project. Closing this PR.