1086-Maria-Big-Data / JobAdAnalytics

Implemented writeParquet in IndexUtil #83

Closed vinceecws closed 3 years ago

vinceecws commented 3 years ago

Assuming the data was saved with path="/base/path" and partition_cols=Seq("year", "month"), the Parquet data for September 2020 can be read with:

// basePath keeps the partition columns (year, month) in the resulting DataFrame
val df = spark.sqlContext
    .read
    .option("basePath", "/base/path")
    .parquet("/base/path/year=2020/month=9")

Similarly, to read the entire Parquet dataset:

val df = spark.sqlContext
    .read
    .option("basePath", "/base/path")
    .parquet("/base/path/")
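
The partition directory in the first read follows Spark's `column=value` layout, with one path segment per partition column in the order given by partition_cols. As a minimal sketch of how such a path could be assembled (the `PartitionPaths` object and `partitionPath` helper are hypothetical names for illustration, not part of IndexUtil or Spark):

```scala
// Hypothetical helper, not part of IndexUtil or Spark.
object PartitionPaths {

  // Builds the directory for a single partition from a base path and
  // ordered (column, value) pairs, e.g. year=2020/month=9.
  def partitionPath(basePath: String, partitions: Seq[(String, Any)]): String = {
    val segments = partitions.map { case (col, value) => s"$col=$value" }
    // Drop a trailing slash so the joined path has no empty segment.
    (basePath.stripSuffix("/") +: segments).mkString("/")
  }

  def main(args: Array[String]): Unit = {
    // prints /base/path/year=2020/month=9
    println(partitionPath("/base/path", Seq("year" -> 2020, "month" -> 9)))
  }
}
```

The resulting string can be passed to `.parquet(...)` together with the `basePath` option, as in the first read above. The pairs must be listed in the same order as the partition columns used when writing.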