Dear mahmoudparsian,
Sorry to bother you.
As I understand it, there are two methods for saving output when a Scala Spark program finishes. In "NaiveBayesClassifierBuilder.scala", you save the pt table as part-* files in HDFS, and my issue is related to this. The RDD method saveAsObjectFile returns NULL first and a SequenceFile output second. As a result, the second Spark program (NaiveBayesClassifier.scala) throws a NullPointerException. On the other hand, if I use saveAsTextFile, the second Spark program throws an exception saying "A sequenceFile is required". I'm not sure how to deal with this in your Scala program. Could you give me any tips?
Best Wishes, WeiWei HE