joelgrus / data-science-from-scratch

code for Data Science From Scratch book
MIT License
8.63k stars 4.5k forks source link

Definition and implementation of the quantile #71

Open Florimond opened 5 years ago

Florimond commented 5 years ago

A generalization of the median is the quantile, which represents the value under which a certain percentile of the data lies (the median represents the value under which 50% of the data lies):

def quantile(xs: List[float], p: float) -> float:
    """Returns the pth-percentile value in x"""
    p_index = int(p * len(xs))
    return sorted(xs)[p_index]

I think this means the median should be equal to quantile(xs, 0.5), which is not the case.

FDV79 commented 4 years ago

A generalization of the median is the quantile, which represents the value under which a certain percentile of the data lies (the median represents the value under which 50% of the data lies):

def quantile(xs: List[float], p: float) -> float:
    """Returns the pth-percentile value in x"""
    p_index = int(p * len(xs))
    return sorted(xs)[p_index]

I think this means the median should be equal to quantile(xs, 0.5), which is not the case.

You are right, as for the median, we have to write different functions for the even and odd cases and combine them (or to use a condition in our function).