import pandas as pd
from skfolio.optimization import HierarchicalRiskParity
from skfolio.moments import DenoiseCovariance
from skfolio.distance import CovarianceDistance
from skfolio import RiskMeasure
df = pd.read_parquet("X_before.parquet")
model = HierarchicalRiskParity(
    risk_measure=RiskMeasure.MEAN_ABSOLUTE_DEVIATION,
    portfolio_params={"name": "HRP-MAD-Ward-DenoisedPearson"},
    distance_estimator=CovarianceDistance(
        covariance_estimator=DenoiseCovariance()
    ),
)
model.fit(df)
Describe the bug
An IndexError is raised while computing the maximum number of clusters when the returns DataFrame has only a few columns.
To Reproduce
X_before.parquet.zip
Expected behavior
I believe this comes from the code here, which sets the maximum number of clusters to at least 8, whereas the number of columns could be fewer than 8.
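To illustrate the suspected failure mode, here is a minimal sketch. It does not reproduce skfolio's code: the function names and the forward column indexing are assumptions made only for illustration. It shows how a loop bound that never drops below 8 can run past the end of an array that has fewer than 8 columns.

```python
import math

import numpy as np


def max_clusters_with_floor(n_columns: int) -> int:
    # Hypothetical bound: the candidate count is never smaller than 8.
    return max(8, round(math.sqrt(n_columns)))


def scan_levels(n_columns: int) -> int:
    """Read one column per candidate cluster count and return how many were read."""
    levels = np.zeros((n_columns, n_columns))
    read = 0
    for k in range(max_clusters_with_floor(n_columns)):
        _ = levels[:, k]  # raises IndexError once k reaches n_columns
        read += 1
    return read
```

Under these assumptions, `scan_levels(8)` completes, while `scan_levels(3)` raises an IndexError when the loop asks for a fourth column.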
Additional context
If HRP shouldn't be used with fewer than 8 return columns, this limit could be exposed so that consumers can check it up front instead of catching the IndexError. Otherwise, I would suggest removing the fixed 8 and changing it to the length of the columns array.
Versions
0.2.1
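The second suggestion in this issue could look roughly like the sketch below. The name `clamped_max_clusters` and the constant `MIN_CANDIDATES` are hypothetical, and the square-root heuristic is an assumption about how the bound is derived. The only point is that the bound is capped at the number of columns.

```python
import math

MIN_CANDIDATES = 8  # the fixed value of 8 described in this issue


def clamped_max_clusters(n_columns: int) -> int:
    """Hypothetical bound: keep a floor of 8 but never exceed the column count."""
    heuristic = max(MIN_CANDIDATES, round(math.sqrt(n_columns)))
    return min(heuristic, n_columns)
```

With this cap, 3 columns give a bound of 3 rather than 8. If the first suggestion is preferred instead, a consumer could compare `len(df.columns)` against the exposed minimum before calling `fit`.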