huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Time Series Transformer - Dynamic Categorical Features #24695

Open guyko81 opened 1 year ago

guyko81 commented 1 year ago

Feature request

I would like to have a dynamic categorical feature embedding option in TimeSeriesTransformerConfig.

Motivation

I didn't see any option in TimeSeriesTransformerConfig to define an embedding for a dynamic categorical feature. I work with sales data, where holidays are an important driver of sales, so all of my models handle holidays with a dynamic embedding. Does the Time Series Transformer support this too and I'm just missing it?

Your contribution

Happy to help, but would need some guidance on how it's handled currently.

ydshieh commented 1 year ago

cc @kashif

kashif commented 1 year ago

@guyko81 yes, sure! I would be happy to help you get this done. I have never found a good example of dynamic categorical features, so if you have a sample dataset or example, that would be really helpful.

We can assume that the dataset has a key, e.g.

dynamic_categorical = [[0, 2, 555], [23, 5, 66], ..., [33, 4, 54]]

where we have a list of category indices for each time point, and the length of this array equals the length of the target values array in the time dimension.
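
For concreteness, here is a hypothetical entry in that format (the field names and the holiday/weekday/promo interpretation are made up for illustration, not an existing convention):

entry = {
    "target": [12.0, 15.0, 9.0, 30.0, 11.0],   # 5 time steps
    "dynamic_categorical": [                    # one row per time step
        [0, 2, 555],   # e.g. [holiday_id, weekday_id, promo_id]
        [0, 3, 555],
        [1, 4, 66],
        [0, 5, 66],
        [0, 6, 54],
    ],
}

# the time dimension of the dynamic categorical feature matches the target
assert len(entry["dynamic_categorical"]) == len(entry["target"])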

Next we will need to specify the number of dynamic categorical features (3 in the example above), together with the cardinalities and embedding dimensions of the corresponding features:

dynamic_cat_card = [50, 10, 1000]
dynamic_cat_dims = [12, 16, 32]
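
As a rough sketch, the config side could then look something like this (none of these dynamic-feature arguments exist in the current TimeSeriesTransformerConfig; the names are placeholders mirroring the values above):

from transformers import TimeSeriesTransformerConfig

# hypothetical arguments for illustration only -- not part of the current API
config = TimeSeriesTransformerConfig(
    prediction_length=24,
    num_dynamic_categorical_features=3,   # number of dynamic cat. features
    dynamic_cat_card=[50, 10, 1000],      # cardinality of each feature
    dynamic_cat_dims=[12, 16, 32],        # embedding dim of each feature
)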

Once we have that done on the config side, we can just add a corresponding nn.Embedding for each dynamic categorical feature and concatenate the outputs to the input vector. If you open a PR, please CC me and I can help out!
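
A minimal standalone PyTorch sketch of that idea (not the actual model code; the module name and tensor shapes are assumptions):

import torch
import torch.nn as nn

class DynamicCategoricalEmbedder(nn.Module):
    """Embeds each dynamic categorical feature and concatenates the results."""

    def __init__(self, cardinalities, embedding_dims):
        super().__init__()
        self.embedders = nn.ModuleList(
            nn.Embedding(card, dim) for card, dim in zip(cardinalities, embedding_dims)
        )

    def forward(self, dynamic_cat):
        # dynamic_cat: (batch, time, num_features) integer tensor
        embedded = [
            embedder(dynamic_cat[..., i]) for i, embedder in enumerate(self.embedders)
        ]
        # (batch, time, sum(embedding_dims)); this would then be concatenated
        # with the other time-varying inputs before feeding the transformer
        return torch.cat(embedded, dim=-1)

embedder = DynamicCategoricalEmbedder([50, 10, 1000], [12, 16, 32])
dynamic_cat = torch.randint(0, 10, (4, 20, 3))  # batch=4, time=20, 3 features
out = embedder(dynamic_cat)                      # shape: (4, 20, 60)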

Thank you!

guyko81 commented 1 year ago

@kashif I have created a pull request: https://github.com/huggingface/transformers/pull/24712. I still need to test it, but I wanted you to have a look first.