Open tonykang22 opened 1 year ago
val lines = sc.textFile("/logs/*.log")
lines.filter(x => x.contains("ERROR")).count()
JavaRDD<String> lines = sc.textFile("/logs/*.log");
lines.filter(x -> x.contains("ERROR")).count();
lines = sc.textFile("/logs/*.log")
lines.filter(lambda x: "ERROR" in x).count()
http://<Driver Node>:4040
http://<History Server>:18080
http://<Standalone Master>:8080
http://<ResourceManager>:8088
2014 Daytona Gray Sort 100TB Benchmark
2016 CloudSort Benchmark (Cost to sort 100TB of data)
Spark Overview
Apache Spark
What is Apache Spark?
Features
RDD (Resilient Distributed Dataset)
RDD: Creation > Transformation > Action
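
The creation > transformation > action flow can be illustrated with a small plain-Python analogy. This is only a sketch: it does not use the Spark API, the helper names (`create`, `transform`, `action_count`) and the sample log lines are made up for illustration, and the one point it tries to show is that the filter step does no work until an action consumes it.

```python
# Plain-Python analogy of the RDD lifecycle; this is NOT the Spark API.
# All function names and the sample log lines are hypothetical.

def create(lines):
    """Creation: wrap source data in a lazy iterator."""
    return iter(lines)

def transform(records, predicate):
    """Transformation: describe a filter lazily; nothing is evaluated yet."""
    return (r for r in records if predicate(r))

def action_count(records):
    """Action: consume the pipeline and return a concrete result."""
    return sum(1 for _ in records)

sample_logs = [
    "INFO service started",
    "ERROR disk full",
    "WARN slow response",
    "ERROR connection reset",
]

evaluated = []  # records which lines the predicate has actually inspected

def is_error(line):
    evaluated.append(line)
    return "ERROR" in line

pipeline = transform(create(sample_logs), is_error)
touched_before_action = len(evaluated)  # the filter has not run at this point
error_count = action_count(pipeline)    # the action triggers evaluation
print(touched_before_action, error_count)
```

In this sketch the predicate is first called inside `action_count`, which loosely mirrors how the `filter(...)` calls in the snippets above only describe work and `count()` is what actually runs it.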