databricks / Spark-The-Definitive-Guide

Spark: The Definitive Guide's Code Repository
http://shop.oreilly.com/product/0636920034957.do

Chapter 7 Wrong code in the window function example #39

Open xuewei4d opened 5 years ago

xuewei4d commented 5 years ago

https://github.com/databricks/Spark-The-Definitive-Guide/blob/38e881406cd424991a624dddb7e68718747b626b/code/Structured_APIs-Chapter_7_Aggregations.scala#L171

First, the column Quantity is parsed as String. It should be IntegerType.

Second, orderBy(CustomerId) should be orderBy(desc(Quantity)).
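The two changes proposed here can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: the three-row DataFrame is a hypothetical stand-in for dfWithDate, Quantity is deliberately left as a string so the cast has something to fix, and the column aliases (quantityRank, quantityDenseRank, maxPurchaseQuantity) are chosen for this example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, dense_rank, max, rank}

// Local session for illustration; in spark-shell or a notebook `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("window-sketch").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for dfWithDate, with Quantity left as a string on purpose.
val dfWithDate = Seq(
  (12346, "2011-01-18", "10"),
  (12346, "2011-01-18", "9"),
  (12346, "2011-01-18", "100")
).toDF("CustomerId", "date", "Quantity")

// Change 1: cast Quantity to an integer so sorting and max are numeric.
// As strings, "9" would sort ahead of "100".
val typed = dfWithDate.withColumn("Quantity", col("Quantity").cast("int"))

// Change 2: order each (CustomerId, date) partition by Quantity, largest first.
val windowSpec = Window
  .partitionBy("CustomerId", "date")
  .orderBy(col("Quantity").desc)
  .rowsBetween(Window.unboundedPreceding, Window.currentRow)

val ranked = typed.select(
  col("CustomerId"), col("date"), col("Quantity"),
  rank().over(windowSpec).alias("quantityRank"),
  dense_rank().over(windowSpec).alias("quantityDenseRank"),
  max(col("Quantity")).over(windowSpec).alias("maxPurchaseQuantity"))

ranked.show()
```

With the descending order, the largest quantity in each partition gets rank 1, and the running maximum over the frame equals that largest quantity on every row.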

bllchmbrs commented 5 years ago

Want to just open a pull request if you're seeing something wrong?

shanmugavel04 commented 3 years ago

@xuewei4d I ran the commands below.

dfWithDate.printSchema() showed Quantity as IntegerType:

root
 |-- InvoiceNo: string (nullable = true)
 |-- StockCode: string (nullable = true)
 |-- Description: string (nullable = true)
 |-- Quantity: integer (nullable = true)
 |-- InvoiceDate: string (nullable = true)
 |-- UnitPrice: double (nullable = true)
 |-- CustomerID: integer (nullable = true)
 |-- Country: string (nullable = true)
 |-- date: date (nullable = true)

orderBy("CustomerId") and orderBy($"Quantity".desc) produced the same result.
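One way to check whether the two orderings agree is to put both ranks side by side on a small case. The sketch below uses hypothetical sample rows (one customer, one date, three distinct quantities) rather than the book's retail data, and the names sample, byCustomer, byQuantity, rankByCustomer, and rankByQuantity are made up for this example. Whether the outputs match on the real dataset depends on which columns and rows are being compared.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, rank}

// Local session for illustration; in spark-shell or a notebook `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("order-compare").getOrCreate()
import spark.implicits._

// Hypothetical sample rows: one customer, one date, three distinct quantities.
val sample = Seq(
  (12346, "2011-01-18", 3),
  (12346, "2011-01-18", 12),
  (12346, "2011-01-18", 7)
).toDF("CustomerId", "date", "Quantity")

// Confirm the column types before comparing anything.
sample.printSchema()

// Same partitioning, two different orderings.
val byCustomer = Window.partitionBy("CustomerId", "date").orderBy("CustomerId")
val byQuantity = Window.partitionBy("CustomerId", "date").orderBy(col("Quantity").desc)

val compared = sample
  .withColumn("rankByCustomer", rank().over(byCustomer))
  .withColumn("rankByQuantity", rank().over(byQuantity))

compared.orderBy("rankByQuantity").show()
```

In this small case, every row in a partition ties when the ordering column is also a partition column, so rankByCustomer is 1 throughout, while rankByQuantity distinguishes the rows. Columns that do not depend on the ordering can still look identical between the two variants.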