Closed zhousPole closed 5 months ago
I am using spark-greenplum-connector_2.12-3.1.jar.
I have found the problem, in the guessMaxParallelTasks method of SparkSchemaUtil.scala: sparkContext.getExecutorMemoryStatus.keys.size - 1 is always 0, so the while loop never exits. Here is my modified version of guessMaxParallelTasks:
def guessMaxParallelTasks(): Int = {
  val sparkContext = SparkContext.getOrCreate
  var guess: Int = -1
  // Treat Windows and macOS hosts as local development machines.
  val osName = System.getProperty("os.name").toLowerCase()
  val isLocal = osName.contains("windows") || osName.contains("mac")
  if (isLocal) {
    // Skip the executor polling loop and derive the value from the configuration.
    // Note: if spark.default.parallelism is not set, this evaluates to 0.
    guess = sparkContext.getConf.getInt("spark.default.parallelism", 1) - 1
  } else {
    // Original behaviour: poll until the executor count yields a positive guess.
    while ((guess <= 0) && !Thread.currentThread().isInterrupted) {
      guess = sparkContext.getExecutorMemoryStatus.keys.size - 1
      if (sparkContext.deployMode == "cluster")
        guess -= 1
    }
  }
  guess
}
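The non-local branch above can still spin forever if no executor ever registers. As a rough sketch only (not the connector's code), the polling could be bounded with an attempt limit and a fallback value. The names ParallelismGuess, guessWithTimeout, reserved, fallback, maxAttempts and sleepMillis are all hypothetical, and the executor count is passed in as a function so the snippet runs without Spark:

```scala
import scala.annotation.tailrec

object ParallelismGuess {
  // Hypothetical helper: polls executorCount until it exceeds `reserved`,
  // giving up after maxAttempts polls and returning `fallback` instead.
  def guessWithTimeout(
      executorCount: () => Int,
      reserved: Int = 1,        // entries to subtract (e.g. the driver)
      fallback: Int = 1,        // assumed safe default when polling gives up
      maxAttempts: Int = 50,
      sleepMillis: Long = 100L
  ): Int = {
    @tailrec
    def loop(attempt: Int): Int = {
      val guess = executorCount() - reserved
      if (guess > 0) guess
      else if (attempt >= maxAttempts || Thread.currentThread().isInterrupted) fallback
      else {
        // Back off briefly instead of busy-waiting on the status map.
        Thread.sleep(sleepMillis)
        loop(attempt + 1)
      }
    }
    loop(1)
  }
}
```

Inside guessMaxParallelTasks this could be called as ParallelismGuess.guessWithTimeout(() => sparkContext.getExecutorMemoryStatus.keys.size), with reserved set to 2 in cluster deploy mode to mirror the extra decrement. Whether a fallback of 1 is appropriate for the connector is an assumption.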
Usage example:
gpdf.printSchema()
The schema prints normally when run from Windows 10 PowerShell: