Open xuewei4d opened 5 years ago
Want to just open a pull request if you're seeing something wrong?
@xuewei4d

===> I ran dfWithDate.printSchema(), and it reports Quantity as an IntegerType:

    root
     |-- InvoiceNo: string (nullable = true)
     |-- StockCode: string (nullable = true)
     |-- Description: string (nullable = true)
     |-- Quantity: integer (nullable = true)
     |-- InvoiceDate: string (nullable = true)
     |-- UnitPrice: double (nullable = true)
     |-- CustomerID: integer (nullable = true)
     |-- Country: string (nullable = true)
     |-- date: date (nullable = true)

===> orderBy("CustomerId") and orderBy($"Quantity".desc) gave the same result.
https://github.com/databricks/Spark-The-Definitive-Guide/blob/38e881406cd424991a624dddb7e68718747b626b/code/Structured_APIs-Chapter_7_Aggregations.scala#L171
First, the column `Quantity` is parsed as `String`. It should be `IntegerType`.

Second, `orderBy(CustomerId)` should be `orderBy(desc(Quantity))`.
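
For reference, here is a minimal sketch of what the second suggestion might look like. It is not the book's code: the rows are made up, the object name `WindowOrderSketch` is hypothetical, and the `partitionBy` and `rowsBetween` clauses are assumptions about how the window is defined at the linked line. It builds a tiny in-memory DataFrame with `Quantity` as an integer so it does not depend on the retail CSV files.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, desc, max, rank}

// Hypothetical object name, used only for this illustration.
object WindowOrderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("window-order-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up rows standing in for dfWithDate; Quantity is an Int here,
    // so Spark types the column as integer without any CSV inference.
    val dfWithDate = Seq(
      (12346, "2011-01-18", 5),
      (12346, "2011-01-18", 12),
      (12346, "2011-01-18", 7),
      (12347, "2010-12-07", 3)
    ).toDF("CustomerId", "date", "Quantity")

    // Check the inferred column types before relying on the ordering.
    dfWithDate.printSchema()

    // Assumed window definition: rows within each customer/date partition
    // are ordered by Quantity, largest first.
    val windowSpec = Window
      .partitionBy("CustomerId", "date")
      .orderBy(desc("Quantity"))
      .rowsBetween(Window.unboundedPreceding, Window.currentRow)

    dfWithDate
      .select(
        col("CustomerId"),
        col("date"),
        col("Quantity"),
        rank().over(windowSpec).alias("quantityRank"),
        max(col("Quantity")).over(windowSpec).alias("maxQuantitySoFar"))
      .show()

    spark.stop()
  }
}
```

Swapping `desc("Quantity")` for `col("CustomerId")` in the `orderBy` call and comparing the `quantityRank` column is one way to check whether the two orderings really give the same result on a given dataset.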