PacktPublishing / Distributed-Data-Systems-with-Azure-Databricks

Distributed Data Systems with Azure Databricks, published by Packt
MIT License

Chapter 8: Page 252 & 253 #3

Open tanthiamhuat opened 2 years ago

tanthiamhuat commented 2 years ago

It is actually quite difficult to follow the code in the book when there are so many mistakes.

un = trades.filter(col("symbol") == "K").select('event_ts', 'price', 'symbol', 'bid', 'offer', 'ind_cd').union(quotes.filter(col("symbol") == "K").select('event_ts', 'price', 'symbol', 'bid', 'offer', 'ind_cd'))

I am sure there is no row with symbol = 'K', and there are no columns named 'bid', 'offer', or 'ind_cd'.

Was the book actually proofread?

DataSpacon commented 1 year ago

There are some rows with symbol "K". However, this code is completely wrong: it looks like it is filtering the df, and later in the chapter it groups by this same column. Obviously you cannot do a union with the samples in the book, because the column names are different, unless you first match the structure of the two DataFrames.
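
For anyone hitting the same problem, here is a minimal sketch of what "matching the structure" could look like. It is not the book's code. The helper names `merged_columns` and `align` and the two column lists are made up for illustration, and the real `trades` and `quotes` schemas may differ. The idea is to build one shared column list, add any missing column as NULL on each side, and only then union by name.

```python
def merged_columns(left_cols, right_cols):
    """Return left_cols followed by any right_cols not already present, keeping order."""
    seen = set(left_cols)
    return list(left_cols) + [c for c in right_cols if c not in seen]


def align(df, columns):
    """Select `columns` from a Spark DataFrame, filling columns it lacks with NULL."""
    # Imported here so merged_columns() can be tried without Spark installed.
    from pyspark.sql.functions import col, lit
    return df.select([col(c) if c in df.columns else lit(None).alias(c) for c in columns])


# Hypothetical column lists, only for illustration.
trade_cols = ["event_ts", "symbol", "price"]
quote_cols = ["event_ts", "symbol", "bid", "offer"]
all_cols = merged_columns(trade_cols, quote_cols)
print(all_cols)  # ['event_ts', 'symbol', 'price', 'bid', 'offer']

# With Spark DataFrames named trades and quotes, the union could then be written as:
# from pyspark.sql.functions import col
# un = align(trades.filter(col("symbol") == "K"), all_cols).unionByName(
#     align(quotes.filter(col("symbol") == "K"), all_cols))
```

On Spark 3.1 or later, `unionByName(other, allowMissingColumns=True)` does the NULL filling for you, so the `align` step may not be needed there.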

tanthiamhuat commented 1 year ago

Yes, please correct it; otherwise readers will have difficulty following along. I hope you understand. And someone needs to proofread the book to reduce such mistakes, please.