strongio / foundry

MIT License

Add `case_when` function/method #9

Open andywong36 opened 2 years ago

andywong36 commented 2 years ago

https://www.rdocumentation.org/packages/dplyr/versions/1.0.10/topics/case_when

This would be an alternative to a row-wise apply such as:

import pandas as pd

# With axis=1, apply passes each row to the function as a pd.Series.
def get_bound_status(row: pd.Series) -> str:
    """Returns one of ['upper', 'lower', 'none']."""
    ...

model_matrix = (
    model_matrix
    .assign(bounded_by=lambda df: df.apply(get_bound_status, axis=1))
)

m-clark commented 1 year ago

@jwdink Since we're meeting later, I thought I'd throw this out there.

For an initial approach, I'd simply write a wrapper around np.select, as follows.

import numpy as np

# np.nan is a float constant, not a function, so it is used without parentheses.
def case_when(condlist, choicelist, default=np.nan):
    # length/other checks
    return np.select(condlist, choicelist, default)
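
The "length/other checks" placeholder could be filled in along these lines. This is only a sketch: the name `case_when_checked` and the particular checks are assumptions made for illustration, not something settled in this thread.

```python
import numpy as np


def case_when_checked(condlist, choicelist, default=np.nan):
    """Thin wrapper around np.select with basic argument validation (illustrative)."""
    condlist = list(condlist)
    choicelist = list(choicelist)
    if not condlist:
        raise ValueError("at least one condition is required")
    if len(condlist) != len(choicelist):
        raise ValueError(
            f"got {len(condlist)} conditions but {len(choicelist)} choices"
        )
    # np.select needs boolean conditions; check up front for a clearer error.
    conds = [np.asarray(c) for c in condlist]
    for i, c in enumerate(conds):
        if c.dtype != np.bool_:
            raise TypeError(f"condition {i} has dtype {c.dtype}, expected bool")
    return np.select(conds, choicelist, default)


x = np.array([1, 2, 3])
y = np.array([1, 0, 0])
# The first matching condition wins; rows matching nothing get the default.
result = case_when_checked([x == 2, y == 1], [10.0, 20.0])
```

Here the last element matches neither condition, so it falls through to the NaN default.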

But it would be nice to make it more dplyr-esque, with something like the following. If the syntax is kept simple (an arbitrary symbol separating the condition from the value within each case, and commas separating the cases), this could maybe be done.

x == 2 -> 'A', 
y == 1 -> 'B'
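
One way such a string syntax might be handled is to split the spec on commas and on `->`, evaluate each condition with `DataFrame.eval`, and read each value as a Python literal. The sketch below assumes the conditions refer to DataFrame columns and that no condition or value contains a comma or `->`; the function name `case_when_expr` is hypothetical.

```python
import ast

import numpy as np
import pandas as pd


def case_when_expr(df: pd.DataFrame, spec: str, default=None) -> np.ndarray:
    """Parse "cond -> value, cond -> value" and evaluate it against df (illustrative)."""
    condlist, choicelist = [], []
    # Naive split: assumes no commas or "->" inside a condition or value.
    for clause in spec.split(","):
        cond_src, value_src = clause.split("->")
        # DataFrame.eval resolves column names such as x and y.
        condlist.append(df.eval(cond_src.strip()).to_numpy(dtype=bool))
        # literal_eval accepts only Python literals, e.g. 'A' or 1.
        choicelist.append(ast.literal_eval(value_src.strip()))
    # Object dtype lets string values coexist with a None default.
    choices = [np.full(len(df), c, dtype=object) for c in choicelist]
    return np.select(condlist, choices, default)


df = pd.DataFrame({"x": [2, 0, 0], "y": [1, 1, 0]})
labels = case_when_expr(df, "x == 2 -> 'A', y == 1 -> 'B'")
```

The naive splitting is the weak point: a condition such as `x.isin([1, 2])` contains a comma and would break it, which is one argument for the tuple-based interface instead.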

jwdink commented 1 year ago

I'll admit I forgot about the existence of np.select when this issue was created.

I was thinking the argument structure would be more like dplyr's, though:

import pandas as pd

example = pd.Series(range(20))

case_when(
    (example == 0, '0'),
    (example == 1, '1'),
    (example <= 7, '<=7'),
    (True, '>7')  # catch-all: everything not matched above is greater than 7
)

An implementation using np.select would look like:

import numpy as np

def case_when(*args, default=np.nan) -> np.ndarray:
    """
    :param args: Tuples of (condition, value); the value is used where the condition is satisfied.
     Conditions are checked in order, and the first one satisfied wins.
    :param default: The default value when no conditions are met.
    :return: An ndarray
    """
    # Split the (condition, value) pairs into the two parallel sequences np.select expects.
    condlist, choicelist = zip(*args)
    return np.select(condlist, choicelist, default)
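
A short usage sketch of the tuple-based signature follows. One caveat, stated as an observation rather than a settled design point: np.select needs a common dtype for the choices and the default, so depending on the NumPy version, string choices combined with the float `np.nan` default can either turn NaN into the string 'nan' or raise a dtype error. The example therefore passes a string default. Wrapping the result in a `pd.Series` is a hypothetical convenience, not part of the proposal above.

```python
import numpy as np
import pandas as pd


def case_when(*args, default=np.nan) -> np.ndarray:
    # Same tuple-based signature as proposed above.
    condlist, choicelist = zip(*args)
    return np.select(condlist, choicelist, default)


example = pd.Series(range(20))

labels = case_when(
    (example == 0, "0"),
    (example == 1, "1"),
    (example <= 7, "<=7"),
    default=">7",  # string default, so choices and default share a dtype
)

# Hypothetical convenience: keep the input's index by wrapping the ndarray.
labelled = pd.Series(labels, index=example.index)
```

Because the first satisfied condition wins, the values 0 and 1 get their own labels even though they also satisfy `example <= 7`.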