fortran-lang / stdlib

Fortran Standard Library
https://stdlib.fortran-lang.org
MIT License

Give stats functions optional mean arguments? #374

Open Beliavsky opened 3 years ago

Beliavsky commented 3 years ago

The corr, cov, and var functions in the descriptive statistics section of stdlib all compute means as an intermediate step. Often a user will already have calculated the mean(s) of the data, so it would be more efficient if corr, cov, and var accepted a precomputed mean as an optional argument. Another reason for an optional mean argument is that the user may have a view of what the mean is on theoretical grounds. For example, options traders commonly compute historical volatilities (standard deviations) and correlations of asset returns using the last 20 days. Since 20 observations are too few to estimate the mean reliably, and since the expected daily stock return of, say, 0.10/252 (10% annual return with 252 trading days per year) is small, the mean is commonly set to zero when calculating standard deviations and correlations.
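To make the request concrete, here is a minimal sketch in Python (not Fortran, and not stdlib's actual interface) of a variance routine with an optional mean. The function name `var`, the keyword `mean`, and the `corrected` flag are hypothetical names chosen for illustration, and the return values are made-up numbers.

```python
from math import sqrt
from typing import Optional, Sequence


def var(x: Sequence[float], mean: Optional[float] = None,
        corrected: bool = True) -> float:
    """Variance of x. Hypothetical interface: if `mean` is supplied,
    it is used as the center instead of being recomputed from x."""
    n = len(x)
    center = sum(x) / n if mean is None else mean
    sum_sq = sum((xi - center) ** 2 for xi in x)
    # corrected=True divides by n - 1, corrected=False divides by n.
    return sum_sq / (n - 1 if corrected else n)


# Made-up daily returns, for illustration only.
returns = [0.01, -0.02, 0.03, 0.0]

# Usual sample variance: the mean is estimated from the data.
sample_var = var(returns)

# Same result, but reusing a mean that was computed earlier.
precomputed_mean = sum(returns) / len(returns)
reused_mean_var = var(returns, mean=precomputed_mean)

# Mean fixed at zero on theoretical grounds; dividing by n is the
# natural choice here because no mean was estimated from the data.
zero_mean_var = var(returns, mean=0.0, corrected=False)
zero_mean_vol = sqrt(zero_mean_var)
```

When the mean is fixed in advance rather than estimated, no degree of freedom is used up, which is why the zero-mean call above passes `corrected=False`.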

Similarly, if I want to calculate the correlation of two vectors x(:) and y(:), I may already have computed both the means and the standard deviations or variances of x and y, so perhaps there should also be optional INTENT(IN) arguments for standard deviations or variances.
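A rough Python sketch of what such a correlation routine might look like follows. All argument names (`mean_x`, `mean_y`, `std_x`, `std_y`) are hypothetical and are not part of stdlib. The sketch assumes any supplied standard deviations use the same n - 1 normalization and the same centers as the covariance; otherwise the result is not guaranteed to lie in [-1, 1].

```python
from math import sqrt
from typing import Optional, Sequence


def corr(x: Sequence[float], y: Sequence[float],
         mean_x: Optional[float] = None, mean_y: Optional[float] = None,
         std_x: Optional[float] = None, std_y: Optional[float] = None) -> float:
    """Pearson correlation of x and y. Hypothetical interface: any
    supplied mean or standard deviation is reused, not recomputed."""
    n = len(x)
    mx = sum(x) / n if mean_x is None else mean_x
    my = sum(y) / n if mean_y is None else mean_y
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Fall back to computing the standard deviations only when needed.
    sx = sqrt(sum((a - mx) ** 2 for a in x) / (n - 1)) if std_x is None else std_x
    sy = sqrt(sum((b - my) ** 2 for b in y) / (n - 1)) if std_y is None else std_y
    return cov_xy / (sx * sy)


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]

# Everything computed inside the call.
r_full = corr(x, y)

# Statistics the caller already has on hand.
mx, my = sum(x) / len(x), sum(y) / len(y)
sx = sqrt(sum((a - mx) ** 2 for a in x) / (len(x) - 1))
sy = sqrt(sum((b - my) ** 2 for b in y) / (len(y) - 1))
r_reused = corr(x, y, mean_x=mx, mean_y=my, std_x=sx, std_y=sy)
```

Both calls give the same correlation; the second skips two passes over the data for the means and two more for the standard deviations.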

jvdp1 commented 3 years ago

Some of your proposals are already possible with the moment function available in stdlib. For example, var with a precomputed mean is equivalent to the second-order moment about a given center. This could probably be extended to corr and cov easily.
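The relationship between a second-order moment about a center and the variance can be sketched in Python as below. The `moment` function here is a stand-in written for this example, not stdlib's routine. It assumes the moment is normalized by n, so matching a variance that divides by n - 1 requires rescaling; whether that applies to stdlib depends on its documented normalization.

```python
from typing import Optional, Sequence


def moment(x: Sequence[float], order: int,
           center: Optional[float] = None) -> float:
    """Moment of the given order about `center` (illustrative stand-in).
    Assumption: normalized by n; center defaults to the mean of x."""
    n = len(x)
    c = sum(x) / n if center is None else center
    return sum((xi - c) ** order for xi in x) / n


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
n = len(x)
precomputed_mean = 2.5

# Second-order moment about the precomputed mean: population variance.
m2 = moment(x, 2, center=precomputed_mean)

# Rescale to obtain the variance that divides by n - 1.
corrected_var = m2 * n / (n - 1)

# Second-order moment about zero, e.g. a mean fixed on theoretical grounds.
m2_about_zero = moment(x, 2, center=0.0)
```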