ETA444 / datasafari

DataSafari simplifies complex data science tasks into straightforward, powerful one-liners.
https://datasafari.dev
GNU General Public License v3.0

Write NumPy docstring for calculate_mahalanobis() #15

Closed ETA444 closed 5 months ago

ETA444 commented 5 months ago

The docstring is written and accessible via:

help(calculate_mahalanobis)

This solution addresses the issue "Write NumPy docstring for calculate_mahalanobis()" by providing a detailed NumPy-style docstring for the calculate_mahalanobis() function.

Summary:

The function calculate_mahalanobis() calculates the Mahalanobis distance of a single observation from the mean of a distribution. The updated docstring follows the NumPy format and documents the parameters, return value, exceptions, an example, and notes.

Docstring Sections Preview:

Description

"""
Calculate the Mahalanobis distance for an observation from a distribution.

The Mahalanobis distance is a measure of the distance between a point and a distribution.
It is an effective way to determine how many standard deviations an observation is from
the mean of a distribution, considering the covariance among variables. This function
computes the Mahalanobis distance of a single observation from the mean of a distribution,
given the inverse of the covariance matrix of the distribution.
"""

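For reference, the quantity described above is d(x) = sqrt((x - mean)^T · inv_cov_matrix · (x - mean)). A minimal NumPy sketch of that formula follows; `mahalanobis_sketch` is a hypothetical stand-in written for illustration and is not claimed to be DataSafari's actual implementation of calculate_mahalanobis().

```python
import numpy as np


def mahalanobis_sketch(x, mean, inv_cov_matrix):
    """Hypothetical stand-in: sqrt((x - mean)^T @ inv_cov_matrix @ (x - mean))."""
    # Accept array-likes (including a pandas Series) and work in float.
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    squared = diff @ np.asarray(inv_cov_matrix, dtype=float) @ diff
    return float(np.sqrt(squared))


# With an identity covariance the result reduces to the Euclidean distance.
identity_distance = mahalanobis_sketch([3.0, 4.0], [0.0, 0.0], np.eye(2))
print(identity_distance)  # 5.0
```
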
Parameters

"""
Parameters
----------
x : numpy.ndarray or pandas.Series
    A 1D array of the observation or a single row from a DataFrame.
mean : numpy.ndarray
    The mean vector of the distribution from which distances are calculated.
    Must be 1D and of the same length as `x`.
inv_cov_matrix : numpy.ndarray
    The inverse of the covariance matrix of the distribution. This matrix
    must be square and its size should match the number of elements in `x`.
"""

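As a rough illustration of how the three arguments relate, the sketch below derives `mean` and `inv_cov_matrix` from a small sample with plain NumPy and checks the shapes the parameter descriptions call for. The sample values and variable names are made up for this example.

```python
import numpy as np

# Illustrative sample: 5 observations (rows) of 2 variables (columns).
data = np.array([
    [2.0, 1.0],
    [3.0, 4.0],
    [5.0, 3.0],
    [6.0, 7.0],
    [9.0, 8.0],
])

mean = data.mean(axis=0)                 # 1D, same length as one observation
cov_matrix = np.cov(data, rowvar=False)  # columns are variables -> shape (2, 2)
inv_cov_matrix = np.linalg.inv(cov_matrix)

x = data[0]  # a single observation (one row)

# Shape checks mirroring the parameter descriptions.
assert x.ndim == 1 and mean.shape == x.shape
assert inv_cov_matrix.shape == (x.size, x.size)
```
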
Returns

"""
Returns
-------
float
    The Mahalanobis distance of the observation `x` from the distribution
    defined by `mean` and `inv_cov_matrix`.
"""

Raises

"""
Raises
------
ValueError
    If `x` and `mean` do not have the same length.
LinAlgError
    If the inverse covariance matrix is singular and cannot be used for
    distance calculation.
"""

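One place a `LinAlgError` commonly surfaces is on the caller's side, when the covariance matrix is inverted before the distance is computed. The sketch below shows one way a caller might guard against that. Falling back to the pseudo-inverse is an assumption made for this example, not something the docstring prescribes.

```python
import numpy as np

# Perfectly correlated variables give a singular covariance matrix.
singular_cov = np.array([[1.0, 1.0], [1.0, 1.0]])

try:
    inv_cov_matrix = np.linalg.inv(singular_cov)
    used_fallback = False
except np.linalg.LinAlgError:
    # Assumption: the Moore-Penrose pseudo-inverse is an acceptable substitute here.
    inv_cov_matrix = np.linalg.pinv(singular_cov)
    used_fallback = True

print(used_fallback)  # True
```

Nearly singular matrices may invert without raising, so checking `np.linalg.cond(cov_matrix)` beforehand is another option.
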
Examples

"""
Examples
--------
>>> import numpy as np
>>> mean_vector = np.array([0, 0])
>>> observation = np.array([1, 1])
>>> cov_matrix = np.array([[1, 0.5], [0.5, 1]])
>>> inv_cov_matrix = np.linalg.inv(cov_matrix)
>>> distance = calculate_mahalanobis(observation, mean_vector, inv_cov_matrix)
>>> print(f"{distance:.4f}")
1.1547
"""

Notes

"""
Notes
-----
The Mahalanobis distance is widely used in outlier detection and cluster analysis.
It is scale-invariant and takes into account the correlations of the data set.
"""
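
The scale-invariance claim in the notes can be checked numerically. In the sketch below, `mahalanobis_sketch` and `distance_to_sample` are hypothetical helpers written for this example (not DataSafari functions), and the data values are made up. Rescaling one variable changes the mean and covariance but should leave the distance unchanged.

```python
import numpy as np


def mahalanobis_sketch(x, mean, inv_cov_matrix):
    # Hypothetical stand-in for the documented function.
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(np.sqrt(diff @ inv_cov_matrix @ diff))


def distance_to_sample(x, data):
    # Estimate mean and inverse covariance from the sample, then measure x.
    mean = data.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(data, rowvar=False))
    return mahalanobis_sketch(x, mean, inv_cov)


data = np.array([[2.0, 1.0], [3.0, 4.0], [5.0, 3.0], [6.0, 7.0], [9.0, 8.0]])
x = np.array([4.0, 9.0])

# Rescale the first variable (e.g. metres -> millimetres).
scale = np.array([1000.0, 1.0])

original = distance_to_sample(x, data)
rescaled = distance_to_sample(x * scale, data * scale)
print(np.isclose(original, rescaled))  # True
```
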