sorenmacbeth / flambo

A Clojure DSL for Apache Spark
Eclipse Public License 1.0

Left Outer Join documentation is inaccurate #107

Closed: philipnee closed this issue 7 years ago

philipnee commented 7 years ago

When performing a left outer join on (K, V) and (K, W), the W side of each result is wrapped in an Optional, per the JavaPairRDD documentation.

I believe you have to call .orNull on that Optional to get (K, (V, nil)) when W is missing.

See the Optional documentation here: https://google.github.io/guava/releases/19.0/api/docs/com/google/common/base/Optional.html

sorenmacbeth commented 7 years ago

You are correct that you need to call .orNull. Which documentation in flambo are you referring to that needs updating? Would you be willing to submit a PR?

sorenmacbeth commented 7 years ago

I guess you are probably referring to the docstring for flambo.api/left-outer-join?

I don't think the docstring needs to call out .orNull. We do call it in the test, as you can see here:

https://github.com/yieldbot/flambo/tree/develop/test/flambo/api_test.clj#L147

However, the API call itself doesn't require you to call .orNull on the result. In practice you will.

I would definitely accept a PR noting that an Optional is returned and that calling .orNull is required to retrieve the value.

philipnee commented 7 years ago

I'm referring to the README at the root of this repo.

And yes, you don't have to call .orNull, but without it the result is not nil when W is missing; you get an absent Optional instead.
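
For readers following the thread, here is a minimal sketch of the behaviour under discussion. It is not flambo or Spark code. It uses plain Java collections, and `java.util.Optional` stands in for the Guava `Optional` linked above, so `orElse(null)` plays the role of `.orNull`. The class `LeftOuterJoinSketch`, the record type `Joined`, and the map-based `leftOuterJoin` helper are all hypothetical names made up for illustration. Unlike an RDD, a map cannot hold duplicate keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class LeftOuterJoinSketch {

    /** One joined record, shaped like (K, (V, Optional<W>)). Hypothetical type for this sketch. */
    static final class Joined {
        final String key;
        final int left;
        final Optional<String> right;

        Joined(String key, int left, Optional<String> right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    /**
     * Hypothetical stand-in for a left outer join: every key of the left map
     * is kept, and the right value is wrapped in an Optional that is empty
     * when the right map has no entry for that key.
     */
    static List<Joined> leftOuterJoin(Map<String, Integer> left, Map<String, String> right) {
        List<Joined> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : left.entrySet()) {
            Optional<String> w = Optional.ofNullable(right.get(e.getKey()));
            out.add(new Joined(e.getKey(), e.getValue(), w));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new LinkedHashMap<>();
        left.put("a", 1);
        left.put("b", 2);
        Map<String, String> right = new LinkedHashMap<>();
        right.put("a", "x"); // no entry for "b"

        for (Joined j : leftOuterJoin(left, right)) {
            // Without unwrapping, the right side is an Optional, never null.
            // orElse(null) is the java.util.Optional counterpart of Guava's orNull().
            String unwrapped = j.right.orElse(null);
            System.out.println(j.key + " -> (" + j.left + ", " + j.right + "), unwrapped: " + unwrapped);
        }
    }
}
```

In this sketch the record for "b" carries an empty Optional until it is explicitly unwrapped, which matches the point above: a missing W only becomes nil once the caller asks for it.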

sorenmacbeth commented 7 years ago

Closed by 8989f76115f60596ec1857a96ace225cc2970259