sustcoder opened this issue 6 years ago
When importing a project with sbt, the `sbt: dump project structure from sbt shell` step can be very slow and the sbt shell window shows `busy` while dependencies download. Simply waiting works, but the two methods below speed things up by pointing sbt at faster mirrors.
**Method 1: edit the configuration file inside the sbt launcher jar.** On Windows an archive tool such as 360 Compression can replace files inside the jar.

1. In `D:\ProgramFile\sbt\bin`, back up `sbt-launch.jar` as `sbt-launch.jar.bak`.
2. Open `sbt-launch.jar.bak` with the archive tool, open the `sbt.boot.properties` file, and in its `[repositories]` section add the following repositories below `local`:

```
alirepo1:https://maven.aliyun.com/repository/central
alirepo2:https://maven.aliyun.com/repository/jcenter
alirepo3:https://maven.aliyun.com/repository/public
```

3. Save the result back as `sbt-launch.jar`. In short: inside `sbt-launch.jar`, find the `sbt.boot.properties` file and replace sbt's repository configuration so that the repositories configured above are loaded first. A command-line alternative is sketched below.
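A sketch of the same replacement done from the command line with the JDK's `jar` tool, assuming the boot properties sit at `sbt/sbt.boot.properties` inside the launcher jar (the usual location; verify against your jar) and that a JDK is on `PATH`:

```
cd /d D:\ProgramFile\sbt\bin
copy sbt-launch.jar sbt-launch.jar.bak

rem pull the boot properties out of the jar, edit it, then write it back
jar xf sbt-launch.jar sbt/sbt.boot.properties
notepad sbt\sbt.boot.properties
jar uf sbt-launch.jar sbt/sbt.boot.properties
```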
**Method 2: point sbt at an external repository file.**

1. In the `D:\ProgramFile\sbt\conf` directory, create a new file named `repository.properties`.
2. Add the following content to `repository.properties`:
```
[repositories]
local
alirepo1:https://maven.aliyun.com/repository/central
alirepo2:https://maven.aliyun.com/repository/jcenter
alirepo3:https://maven.aliyun.com/repository/public
```
3. In `conf/sbtconfig.txt`, add the path to the `repository.properties` file:

```
-Dsbt.repository.config=D:/ProgramFile/sbt/conf/repository.properties
```
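If dependencies still resolve from the default repositories, sbt can additionally be told to let this repositories file override any resolvers declared in the build. A sketch of the relevant flags in `conf/sbtconfig.txt` (the config path is the one used above; `sbt.override.build.repos` is a standard sbt property):

```
-Dsbt.repository.config=D:/ProgramFile/sbt/conf/repository.properties
-Dsbt.override.build.repos=true
```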
**Script executed**

```
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client /home/hadoop/app/spark2.2.0/examples/jars/spark-examples_2.11-2.2.0.jar
```
**Error message**

```
Exception in thread "main" org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
18/09/04 17:01:43 INFO util.ShutdownHookManager: Shutdown hook called
```

**Cause**: TODO

**Solution**: TODO
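The cause and fix are still TODO above; as a first diagnostic step (not a fix), the ApplicationMaster's logs can be pulled from YARN to see why the application ended. This uses the standard YARN CLI; the application id is printed in the spark-submit output:

```
yarn logs -applicationId <applicationId>
```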
**Error when running spark-shell on Windows**

```
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
        at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:124)
        ...
```
**Solution**

Add the following to the `/bin/spark-class2.cmd` file:

```
set SPARK_DIST_CLASSPATH=%HADOOP_HOME%\etc\hadoop\*;%HADOOP_HOME%\share\hadoop\common\lib\*;%HADOOP_HOME%\share\hadoop\common\*;%HADOOP_HOME%\share\hadoop\hdfs\*;%HADOOP_HOME%\share\hadoop\hdfs\lib\*;%HADOOP_HOME%\share\hadoop\hdfs\*;%HADOOP_HOME%\share\hadoop\yarn\lib\*;%HADOOP_HOME%\share\hadoop\yarn\*;%HADOOP_HOME%\share\hadoop\mapreduce\lib\*;%HADOOP_HOME%\share\hadoop\mapreduce\*;%HADOOP_HOME%\share\hadoop\tools\lib\*
```
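Alternatively, instead of hard-coding the jar directories, the same variable can be derived from the local Hadoop installation. A sketch for `spark-class2.cmd`, assuming `hadoop.cmd` is on `PATH` (`hadoop classpath` prints the installation's full classpath):

```
rem derive SPARK_DIST_CLASSPATH from the local Hadoop installation
for /f "delims=" %%i in ('hadoop classpath') do set SPARK_DIST_CLASSPATH=%%i
```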
**Error when running from IDEA**

```
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
```

**Solution**

Set the Hadoop home directory in code, e.g. `System.setProperty("hadoop.home.dir", "E:\\data\\gitee\\hadoop-2.6.0")`. `winutils.exe` can be downloaded from https://github.com/srccodes/hadoop-common-2.2.0-bin; extract the archive and copy `winutils.exe` into Hadoop's `bin` directory.
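A minimal sketch of where the property has to be set when running from IDEA: it must happen before the SparkSession (or SparkContext) is created. The object name and local master are illustrative; the `hadoop.home.dir` path is the one used above.

```scala
import org.apache.spark.sql.SparkSession

object LocalSparkApp {
  def main(args: Array[String]): Unit = {
    // Must be set before Spark initializes so it can find %hadoop.home.dir%\bin\winutils.exe
    System.setProperty("hadoop.home.dir", "E:\\data\\gitee\\hadoop-2.6.0")

    val spark = SparkSession.builder()
      .appName("LocalSparkApp")
      .master("local[*]")
      .getOrCreate()

    println(spark.range(10).count()) // quick sanity check
    spark.stop()
  }
}
```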
**Error when submitting a jar via spark-submit or from IDEA**

```
18/09/28 09:41:52 ERROR TaskSchedulerImpl:Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
```

**Cause**

The Spark version on the server does not match the version used locally, so the serialized classes end up with different serialVersionUIDs:

```
class incompatible: stream classdesc serialVersionUID 8789839749593513237, local class serialVersionUID = -4145741279224749316
```
**Solution**

See the reference link; in short, build the local application against the same Spark and Scala versions that run on the cluster.
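A minimal build.sbt sketch for matching the cluster used in these notes (Spark 2.2.0 on Scala 2.11, inferred from the spark-examples_2.11-2.2.0.jar above); marking Spark as `provided` keeps the cluster's own Spark jars in charge at runtime:

```scala
// build.sbt -- versions are assumptions based on the cluster above; adjust to your cluster
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.2.0" % "provided"
)
```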
**Symptom**

```
java.io.IOException: Incompatible clusterIDs in E:\data\hdfsData\data\datanode:
namenode clusterID = CID-f6ae052a-9745-4710-a3c1-cfab6ef8fc91; datanode clusterID = CID-7bbfeab7-ec7b-4d82-a3f4-e56d53ff7668
```
**Cause**

After running `hdfs namenode -format`, the namenode's `current` directory is deleted and regenerated, and the `clusterID` in its `VERSION` file changes. The datanode's `VERSION` file keeps its old `clusterID`, so the two no longer match.
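For reference, the `clusterID` lives in the `VERSION` file under each storage directory's `current` folder. A sketch of such a file (all values are illustrative except the `clusterID`, which is the datanode value from the exception above):

```
# E:\data\hdfsData\data\datanode\current\VERSION
storageID=DS-example
clusterID=CID-7bbfeab7-ec7b-4d82-a3f4-e56d53ff7668
cTime=0
datanodeUuid=example-uuid
storageType=DATA_NODE
layoutVersion=-56
```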
**Solution**

To avoid this, after formatting the namenode either delete the datanode's `current` folder, or change the `clusterID` in the datanode's `VERSION` file to match the namenode's, and then restart DFS.
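A sketch of both options on the Windows layout from the exception above (paths are taken from that message; stop HDFS before touching the data directories):

```
rem stop HDFS first
%HADOOP_HOME%\sbin\stop-dfs.cmd

rem option 1: after "hdfs namenode -format", wipe the datanode's current folder
rmdir /s /q E:\data\hdfsData\data\datanode\current

rem option 2 (instead of option 1): edit the clusterID line in
rem E:\data\hdfsData\data\datanode\current\VERSION to match the namenode's value

rem restart HDFS
%HADOOP_HOME%\sbin\start-dfs.cmd
```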
These are the issues encountered while setting up the Spark environment.