Open sleak-lbl opened 5 years ago
Having just hit "submit", I realized that using dropna() in this case is better than messing around with notnull() and making a mask, but I still think a strong signal in method names for when the return type differs from the input type is valuable.
A pattern that has repeatedly caught me out is stringing operations together like:
df['myfield'].notnull().unique()
The error here is that notnull() returns a mask rather than a slice of the dataframe or series. Having most operations return a dataframe/series with the same signature, and making the ones that don't more obvious, would probably ease the learning curve and help avoid some user code errors. For example, the names could work like this:
df['myfield'].notnull() # returns a series of same dtype as df['myfield'], with the N/A rows dropped
df['myfield'].is_notnull() # returns a series of dtype boolean to use as a mask
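
For reference, here is a minimal sketch of how this plays out with the current pandas API, where notnull() returns the boolean mask and dropna() returns the filtered values. The sample dataframe and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one missing value and one repeated value.
df = pd.DataFrame({"myfield": [1.0, None, 2.0, 2.0]})

# notnull() returns a boolean Series (a mask), not the non-null values.
mask = df["myfield"].notnull()

# The chained call from the issue: this is the unique values of the mask,
# i.e. only True/False, not the unique values of the field.
wrong = mask.unique()

# dropna() keeps the original dtype and drops the N/A rows, so chaining
# unique() onto it gives what was intended.
right = df["myfield"].dropna().unique()

# Using the mask explicitly to select rows gives the same values.
via_mask = df.loc[mask, "myfield"].unique()

print(mask.dtype)
print(wrong)
print(right)
```

The chained notnull().unique() call raises no error, which is why the mistake is easy to miss: it quietly returns the distinct values of the mask.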
"monad-ish" because operations generally return an object of the same type.
Unfortunately, this would change the API in ways that are hard to find in user code.