sryza / aas

Code to accompany Advanced Analytics with Spark from O'Reilly Media

refactoring the longform function #130

Closed reynoldsm88 closed 5 years ago

reynoldsm88 commented 5 years ago

Hey, what do you think of this? I am working through the book now, and even with the description I found the transformation tough to follow. I reimplemented it myself as an exercise to understand it better.

I've not seen the (1 until row.size).map( ... ) style expression before, so in mine I used pattern matching on the list, which I see a lot in other codebases.
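For illustration, here is a minimal plain-Scala sketch of the two styles being compared: an index-range `map` and pattern matching on the list. It is not the code from the book or from this PR. The object name `LongFormSketch`, the `fields` list, and the idea of a row as a `List[String]` with the metric name in the first cell are all assumptions made up for the example.

```scala
// Hypothetical stand-in for a wide summary row: the first cell holds the
// metric name and each remaining cell holds one value per field.
object LongFormSketch {
  // Assumed column names; index 0 labels the metric column itself.
  val fields: List[String] = List("metric", "id_1", "id_2")

  // Index-range style: walk positions 1 until row.size and look up the
  // field name and the value at the same index.
  // Throws on an empty row, or on a row longer than `fields`.
  def longFormByIndex(row: List[String]): List[(String, String, Double)] = {
    val metric = row.head
    (1 until row.size).map(i => (metric, fields(i), row(i).toDouble)).toList
  }

  // Pattern-matching style: split the row into head and tail, then pair
  // each value with its field name.
  // zip silently truncates if the lengths differ, unlike the index version.
  def longFormByMatch(row: List[String]): List[(String, String, Double)] =
    row match {
      case metric :: values =>
        fields.tail.zip(values).map { case (field, v) => (metric, field, v.toDouble) }
      case Nil => Nil
    }
}
```

For a well-formed row such as `List("mean", "1.5", "2.0")`, both functions return the same list of `(metric, field, value)` triples. They differ only on malformed input, as noted in the comments.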

There's no difference in performance, and the output I extracted from the result is identical. I've attached both. To reproduce these documents, just run with spark-submit and then find and replace on the regex [0-9]+-[0-9]+-[0-9]+.*\n. Attachments: Original Output, Updated Output
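A rough sketch of that cleanup step in Scala, assuming the intent is to delete every match, that is, to replace it with an empty string so that date-stamped log lines drop out. The comment does not say what the replacement text was, and `OutputCleanup` and `stripTimestampedLines` are hypothetical names.

```scala
object OutputCleanup {
  // The pattern quoted in the comment above: a date-like token, the rest of
  // that line, and the trailing newline.
  private val timestampedLine = "[0-9]+-[0-9]+-[0-9]+.*\n"

  // Assumption: matches are replaced with the empty string.
  // The pattern is unanchored, so a date-like token in the middle of a line
  // also removes the remainder of that line.
  def stripTimestampedLines(raw: String): String =
    raw.replaceAll(timestampedLine, "")
}
```

Applied to captured spark-submit output, this should leave only the lines that contain no date-like token, which makes the two runs easier to diff.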

reynoldsm88 commented 5 years ago

@srowen it's totally up to you if you want to leave the code the same as in the book. I just found rewriting it this way extremely helpful in understanding that part. I don't mind either way; I was just running it by you as an idea.

reynoldsm88 commented 5 years ago

Whoa cool, glad you approved of the contribution :)

Great book by the way!!!