swoop-inc / spark-alchemy

Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
https://swoop-inc.github.io/spark-alchemy/
Apache License 2.0

Installing spark-alchemy on Spark 3.0 breaks reading DataFrames #12

Closed Fungie closed 3 years ago

Fungie commented 4 years ago

I recently tried installing spark-alchemy with Spark 3.0 using the following command:

spark-shell --repositories https://dl.bintray.com/swoop-inc/maven/ --packages com.swoop:spark-alchemy_2.12:1.0.0

However, once in the shell, I can't read in any files. The following code produces the error below:

val df = spark.read.parquet("path/to/file")

Note: when I launch a regular spark-shell (without the package) I can read the data fine.

20/09/21 13:16:17 ERROR Utils: Aborting task
java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801
       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:253)
       at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:195)
       at org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:392)
       at org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$openChannel$4(NettyRpcEnv.scala:360)
       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
       at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
       at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:359)
       at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:719)
       at org.apache.spark.util.Utils$.fetchFile(Utils.scala:535)
       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7(Executor.scala:869)
       at org.apache.spark.executor.Executor.$anonfun$updateDependencies$7$adapted(Executor.scala:860)
       at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
       at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
       at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
       at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
       at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
       at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:860)
       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:404)
       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: m298188dljg5j.symc.symantec.com/172.19.129.194:49801
Caused by: java.net.ConnectException: Operation timed out
       at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
       at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
       at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
       at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
       at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
       at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
       at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
       at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
       at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
       at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
       at java.base/java.lang.Thread.run(Thread.java:834)

20/09/21 13:16:17 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801
       (same stack trace as above)

20/09/21 13:16:17 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, m298188dljg5j.symc.symantec.com, executor driver): java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801
       (same stack trace as above)

20/09/21 13:16:17 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, m298188dljg5j.symc.symantec.com, executor driver): java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801
       (same executor-side stack trace as above)

Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2023)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:1972)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:1971)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1971)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:950)
  at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:950)
  at scala.Option.foreach(Option.scala:407)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:950)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2203)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2152)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2141)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:752)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2093)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2114)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2133)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:467)
  at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:420)
  at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:47)
  at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3625)
  at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2695)
  at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3616)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:763)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3614)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2695)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2902)
  at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.infer(CSVDataSource.scala:114)
  at org.apache.spark.sql.execution.datasources.csv.CSVDataSource.inferSchema(CSVDataSource.scala:67)
  at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:62)
  at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$11(DataSource.scala:193)
  at scala.Option.orElse(Option.scala:447)
  at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:190)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:401)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:279)
  at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:268)
  at scala.Option.getOrElse(Option.scala:189)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:268)
  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:705)
  at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:535)
  ... 47 elided

Caused by: java.io.IOException: Failed to connect to m298188dljg5j.symc.symantec.com/172.19.129.194:49801
  (same executor-side stack trace as above)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Operation timed out: m298188dljg5j.symc.symantec.com/172.19.129.194:49801
Caused by: java.net.ConnectException: Operation timed out
  (same Netty stack trace as above)
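
Reading the trace: the task fails inside Executor.updateDependencies → Utils.fetchFile, i.e. while downloading the jars added via --packages from the driver's file server at m298188dljg5j.symc.symantec.com:49801, so any --packages jar would likely trigger the same failure, not just spark-alchemy. (The pasted driver stacktrace is from a CSV read, but the executor-side failure is the same.) A minimal workaround sketch, assuming the driver is advertising a hostname that isn't reachable from the task side (common on VPN or locked-down corporate machines), is to pin the driver to loopback in local mode; spark.driver.host and spark.driver.bindAddress are standard Spark settings, but whether they resolve this particular environment is an assumption:

# hypothetical workaround, not verified on this machine: bind the driver to loopback
spark-shell \
  --conf spark.driver.host=localhost \
  --conf spark.driver.bindAddress=127.0.0.1 \
  --repositories https://dl.bintray.com/swoop-inc/maven/ \
  --packages com.swoop:spark-alchemy_2.12:1.0.0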
MrPowers commented 3 years ago

@Fungie - can you try ./bin/spark-shell --packages "com.swoop:spark-alchemy_2.12:1.0.1"? That command should let you grab the dependency directly from Maven. Let me know if that solves your issue!
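
For a project build rather than an interactive shell, the same coordinates should work as a regular dependency. A minimal sketch for sbt, assuming the 1.0.1 release is published to Maven Central:

// build.sbt — assumes com.swoop:spark-alchemy_2.12:1.0.1 is available on Maven Central
libraryDependencies += "com.swoop" %% "spark-alchemy" % "1.0.1"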