GAA-UAM / scikit-fda

Functional Data Analysis Python package
https://fda.readthedocs.io
BSD 3-Clause "New" or "Revised" License

multivariate anova #258

Open alejandro-ariza opened 4 years ago

alejandro-ariza commented 4 years ago

Hi there, me again, XD.

Is there any alternative way to perform analysis of variance on multivariate functions until the feature below is implemented? The problem is illustrated with an example dataset:

from skfda.datasets import fetch_aemet
from skfda.inference.anova import oneway_anova

# get the multivariate functional data (temperature, precipitation, and
# wind) from 73 meteorological stations, as an FDataGrid
fdgrid = fetch_aemet()['data']

# take 2 arbitrary sets of samples, with 10 meteorological stations each,
# and perform a one-way ANOVA to see if they are significantly different
fdgrid1 = fdgrid[10:20]
fdgrid2 = fdgrid[60:70]
results = oneway_anova(fdgrid1, fdgrid2)

# raises: NotImplementedError: Covariance only implemented for univariate function

Thanks!

vnmabus commented 4 years ago

Unfortunately, I do not know of any alternatives for multivariate functions (as I said, I think they are not as well explored as univariate functions, not only in scikit-fda but in FDA in general).

For this to work we would need, at the very least, to implement the covariance function for vector-valued functions, and make_gaussian_process should be able to generate vector-valued Gaussian processes (assuming that this makes sense).

For now maybe you can try analyzing each coordinate function separately (you can use fdgrid.coordinates[0] to get the temperature, fdgrid.coordinates[1] to get the precipitation, etc).

alejandro-ariza commented 4 years ago

OK, I was looking too far ahead, XD.

The solution you proposed might help in the meantime.

Thanks Carlos!