blue-yonder / tsfresh

Automatic extraction of relevant features from time series:
http://tsfresh.readthedocs.io
MIT License

number_of_maxima only returns maxima of last key in the dictionary. #1072

Open siddij3 opened 6 months ago

siddij3 commented 6 months ago

I passed kind_to_fc_parameters to the extract_features() function to extract a custom set of features, and the settings included the following:

                "mean_n_absolute_max": [
                    {
                        "number_of_maxima": 7,
                        "number_of_maxima": 9,
                        "number_of_maxima": 11,
                        "number_of_maxima": 13,
                        "number_of_maxima": 15,
                        "number_of_maxima": 17,
                        "number_of_maxima": 19,
                        "number_of_maxima": 21,
                        "number_of_maxima": 23,
                    }
                ],

When looking at the returned columns, it only returned number_of_maxima_23 for all the timeseries columns.

Initial timeseries columns:

datetime  C_DIOXIDE  AMMONIA  H_SULFIDE  OXYGEN  TEMPERATURE

Extracted:

C_DIOXIDE-mean_n_absolute_max-number_of_maxima_23  AMMONIA-mean_n_absolute_max-number_of_maxima_23  OXYGEN-mean_n_absolute_max-number_of_maxima_23 etc. 

Environment:

nils-braun commented 5 months ago

Hi @siddij3 (sorry for the super late response!). I do not know where you got this specific list from (maybe you compiled it yourself?), but it has the wrong format for use in kind_to_fc_parameters. As you basically have a single dictionary for all the number_of_maxima settings, Python collapses them into one key and keeps only the last value. You can check this by printing the settings: even before you pass them to tsfresh, there is only a single entry (the last one) in it. The format expected by tsfresh is a list of dictionaries, one per parameter value. So for your use case:

settings = {
    "mean_n_absolute_max": [
        {
            "number_of_maxima": 7,
        },
        {
            "number_of_maxima": 9,
        },
        {
            "number_of_maxima": 11,
        },
        {
            "number_of_maxima": 13,
        },
        {
            "number_of_maxima": 15,
        },
        {
            "number_of_maxima": 17,
        },
        {
            "number_of_maxima": 19,
        },
        {
            "number_of_maxima": 21,
        },
        {
            "number_of_maxima": 23,
        }
    ],
}

With this, it will work as expected.
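
If it helps, here is a minimal sketch of both points in plain Python (no tsfresh needed). The variable names `collapsed` and `settings` are only illustrative, and the list comprehension is just one possible way to generate the same list of dictionaries as above without typing each entry by hand:

```python
# Repeating a key inside one dict literal is legal Python,
# but only the last value assigned to that key survives.
collapsed = {
    "number_of_maxima": 7,
    "number_of_maxima": 9,
    "number_of_maxima": 23,
}
print(collapsed)

# Build one dictionary per parameter value instead:
# range(7, 24, 2) yields the odd numbers 7, 9, ..., 23.
settings = {
    "mean_n_absolute_max": [
        {"number_of_maxima": n} for n in range(7, 24, 2)
    ],
}
print(settings["mean_n_absolute_max"])
```

The first print shows a single entry, which is why only number_of_maxima_23 showed up in your columns. The second shows nine separate dictionaries, which is the shape the settings above have.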

siddij3 commented 5 months ago

Thanks for replying! I compiled the list myself after filtering through the features based on their results and the physical implications of the features. I completely overlooked the dictionary key overlapping. Thanks for pointing that out!