holgerbrandl / krangl

krangl is a {K}otlin DSL for data w{rangl}ing
MIT License

R alternative: Dataframe.drop_na() #105

Closed TheMrCodes closed 3 years ago

TheMrCodes commented 3 years ago

A DataFrame function equivalent to R's drop_na, for dropping rows that contain NA values:

https://www.rdocumentation.org/packages/tidyr/versions/0.8.3/topics/drop_na

holgerbrandl commented 3 years ago

Great suggestion. The immediate solution would be

df.filterByRow { !it.values.contains(null) }

but to allow providing a column selector I've just added filterNotNull (also see referenced commit for example in tests).

df.filterNotNull() 
df.filterNotNull({ startsWith("user") })
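For readers who want to see the intended semantics without the library, here is a minimal plain-Kotlin sketch. It models a table as a list of maps and is not krangl's implementation; `Row` and `dropRowsWithNulls` are hypothetical names introduced only for this illustration.

```kotlin
// Plain-Kotlin model of a table: each row maps column name -> nullable value.
typealias Row = Map<String, Any?>

// Hypothetical helper (not part of krangl): keep rows that have no null
// in the columns whose names match `select`. By default all columns are checked.
fun dropRowsWithNulls(
    rows: List<Row>,
    select: (String) -> Boolean = { true }
): List<Row> =
    rows.filter { row -> row.none { (name, value) -> select(name) && value == null } }

fun main() {
    val rows: List<Row> = listOf(
        mapOf("user_id" to 1, "user_name" to "ann", "score" to 3.5),
        mapOf("user_id" to 2, "user_name" to null, "score" to 1.0),
        mapOf("user_id" to 3, "user_name" to "cy", "score" to null)
    )

    // Check every column: rows 2 and 3 are dropped.
    println(dropRowsWithNulls(rows).map { it["user_id"] })  // prints [1]

    // Check only columns starting with "user": only row 2 is dropped.
    println(dropRowsWithNulls(rows) { it.startsWith("user") }.map { it["user_id"] })  // prints [1, 3]
}
```

The column predicate plays the role of the selector in the second call above: a row is dropped only if one of the selected columns holds a null.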

I'm still uncertain about the correct naming here, see https://kotlinlang.slack.com/archives/C4W52CFEZ/p1611263648007500

holgerbrandl commented 3 years ago

I guess this was my crappiest commit to this repo in a long time. Functionally it was fine, as it contained the bits described above, but somehow a rebuilt API documentation and other unrelated changes slipped in as well. Sorry for the confusion.

TheMrCodes commented 3 years ago

Personally I would find filterNa more intuitive for someone coming from R, but my vote is for filterNotNull because it's more Kotlin-like.

holgerbrandl commented 3 years ago

On the Kotlin Slack it was argued that Double.POSITIVE_INFINITY is usually considered NA as well (also in R, afaik), but would/should not be covered by filterNotNull() (which is also still my preferred name here).
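To make the distinction concrete, here is a small sketch using only the Kotlin standard library, not krangl. It contrasts a null-only check with a broader notion of "missing"; `isMissing` is a hypothetical helper introduced for illustration, and treating NaN and infinity as missing is an assumption about what a stricter filter might do.

```kotlin
// Hypothetical helper (not part of krangl): a broader notion of "missing"
// that treats null, NaN and +/- infinity alike.
fun isMissing(value: Any?): Boolean = when (value) {
    null -> true
    is Double -> value.isNaN() || value.isInfinite()
    else -> false
}

fun main() {
    val values: List<Double?> = listOf(1.0, null, Double.NaN, Double.POSITIVE_INFINITY)

    // A null-only check drops just the null entry.
    println(values.filterNotNull())          // prints [1.0, NaN, Infinity]

    // The broader check also drops NaN and infinity.
    println(values.filterNot(::isMissing))   // prints [1.0]
}
```

A name like filterNotNull promises only the first behavior, which is why infinity would fall outside it.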

TheMrCodes commented 3 years ago

Good, does that mean that filterNotNull only filters out null values?

This would be no problem for my use case. In my opinion krangl doesn't have to be an exact replica of R and Python functionality.

holgerbrandl commented 3 years ago

With its current implementation, yes.
