Hi,
When I ran myspark to collect the log:
$ myspark --spec=cond.spec --script=RecordFinder --records-output=records.json
I encountered this error:
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopFile. : org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://analytix/cms/wmarchive/avro/2017/06/30
The input path is missing the fwjr directory. The correct path should be:
hdfs://analytix/cms/wmarchive/avro/fwjr/2017/06/30
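For illustration, here is a minimal sketch of how the daily input path could be built with the fwjr directory included. The names HDFS_BASE and avro_path are hypothetical and are not taken from the myspark source; only the path layout comes from the paths above.

```python
from datetime import date

# Assumed base directory, copied from the corrected path in this report.
# HDFS_BASE is a hypothetical name, not an identifier from the myspark source.
HDFS_BASE = "hdfs://analytix/cms/wmarchive/avro/fwjr"


def avro_path(day: date, base: str = HDFS_BASE) -> str:
    """Return the daily input directory in the form <base>/YYYY/MM/DD."""
    return f"{base}/{day:%Y/%m/%d}"


print(avro_path(date(2017, 6, 30)))
# prints: hdfs://analytix/cms/wmarchive/avro/fwjr/2017/06/30
```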
Could you please update the path in the source code? Thanks!