Simple comments on model/moirai in source code

SalesforceAIResearch / uni2ts

[ICML2024] Unified Training of Universal Time Series Forecasting Transformers

Apache License 2.0


Simple comments on model/moirai in source code #69

Closed. HALF111 closed this 3 months ago

HALF111 commented 3 months ago

Adds simple comments to model/moirai in the source code. This is a start; we will continue commenting the rest of the source code and submit further pull requests later.

salesforce-cla[bot] commented 3 months ago

Thanks for the contribution! Before we can merge this, we need @HALF111 to sign the Salesforce Inc. Contributor License Agreement.

liu-jc commented 3 months ago

Hi @gorold, I am curious why we need the indexer (https://github.com/SalesforceAIResearch/uni2ts/tree/main/src/uni2ts/data/indexer). Could you help add some comments to https://github.com/SalesforceAIResearch/uni2ts/blob/main/src/uni2ts/data/indexer/hf_dataset_indexer.py? Is it mainly about assigning different sampling probabilities to different time series?

gorold commented 3 months ago

The Indexer class helps to get the data from the underlying file format. The Dataset classes are responsible for deciding how to sample the time series but the Indexer class provides the probabilities.
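To make the split of responsibilities above concrete, here is a minimal sketch. It is not the uni2ts implementation: `ToyIndexer`, `ToyDataset`, the method names, and the `sample` options are all hypothetical names chosen for illustration. The indexer wraps the storage and can report per-series probabilities; the dataset decides whether to use them when an item is requested.

```python
from __future__ import annotations

import random
from typing import Optional, Sequence


class ToyIndexer:
    """Hypothetical indexer: hides the storage format and returns series by index."""

    def __init__(self, series: Sequence[Sequence[float]]):
        self._series = [list(s) for s in series]

    def __len__(self) -> int:
        return len(self._series)

    def __getitem__(self, idx: int) -> list[float]:
        return self._series[idx]

    def uniform_probabilities(self) -> list[float]:
        # Every time series is equally likely.
        n = len(self)
        return [1.0 / n] * n

    def proportional_probabilities(self) -> list[float]:
        # Longer time series are proportionally more likely.
        total = sum(len(s) for s in self._series)
        return [len(s) / total for s in self._series]


class ToyDataset:
    """Hypothetical dataset: decides how to sample, using the indexer's probabilities."""

    def __init__(self, indexer: ToyIndexer, sample: str = "none",
                 seed: Optional[int] = None):
        self.indexer = indexer
        self.rng = random.Random(seed)
        if sample == "none":
            self.probabilities = None  # use the requested index as-is
        elif sample == "uniform":
            self.probabilities = indexer.uniform_probabilities()
        elif sample == "proportional":
            self.probabilities = indexer.proportional_probabilities()
        else:
            raise ValueError(f"unknown sampling mode: {sample}")

    def __len__(self) -> int:
        return len(self.indexer)

    def __getitem__(self, idx: int) -> list[float]:
        if self.probabilities is not None:
            # Ignore the requested index and draw one from the probabilities.
            idx = self.rng.choices(range(len(self.indexer)),
                                   weights=self.probabilities, k=1)[0]
        return self.indexer[idx]
```

With `sample="none"` the dataset behaves like a plain indexable collection; with the other modes the index passed in only triggers a draw, so the probabilities supplied by the indexer determine which series is returned.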

liu-jc commented 3 months ago

Hi @gorold, I would like to confirm: the indexer class provides the probabilities for different time series within a dataset, while the dataset_weight in https://github.com/SalesforceAIResearch/uni2ts/blob/main/src/uni2ts/data/dataset.py controls the sampling probability of each dataset? Also, a follow-up question: why do we change __len__ with dataset_weight (https://github.com/SalesforceAIResearch/uni2ts/blob/adf72061666813456200f3bf083c8389eaca4bfe/src/uni2ts/data/dataset.py#L80)? I guess we modify the length of a dataset so that the sampler can handle the sampling probabilities directly? Could you please add a few comments or some documentation there?

gorold commented 3 months ago

Yup, that's correct.
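The idea confirmed above, scaling `__len__` by `dataset_weight` so that a sampler drawing uniformly over indices picks each dataset in proportion to its weighted length, can be sketched as follows. This is an illustration under assumptions, not the code in `dataset.py`: the class name `WeightedLengthDataset`, the helper `dataset_shares`, the ceiling rounding, and the modulo wrap-around in `__getitem__` are all hypothetical.

```python
import math
from typing import Sequence


class WeightedLengthDataset:
    """Hypothetical dataset whose reported length is scaled by dataset_weight."""

    def __init__(self, items: Sequence, dataset_weight: float = 1.0):
        self.items = list(items)
        self.dataset_weight = dataset_weight

    def __len__(self) -> int:
        # Reported length = true number of items times the weight, rounded up.
        return math.ceil(len(self.items) * self.dataset_weight)

    def __getitem__(self, idx: int):
        if idx < 0 or idx >= len(self):
            raise IndexError(f"index {idx} out of range for length {len(self)}")
        # Assumption of this sketch: indices beyond the true size wrap around.
        return self.items[idx % len(self.items)]


def dataset_shares(datasets: Sequence[WeightedLengthDataset]) -> list:
    """Fraction of indices each dataset owns when the datasets are concatenated.

    A sampler that draws uniformly over the concatenated index range picks
    each dataset with exactly this probability.
    """
    total = sum(len(d) for d in datasets)
    return [len(d) / total for d in datasets]
```

For example, two datasets of 10 items each with weights 2.5 and 0.5 report lengths 25 and 5, so a uniform sampler over the concatenation would draw from the first dataset 25/30 of the time, without the sampler knowing anything about weights.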