Closed: HALF111 closed this 3 months ago
Thanks for the contribution! Before we can merge this, we need @HALF111 to sign the Salesforce Inc. Contributor License Agreement.
Hi @gorold, I am curious why we need the indexer (https://github.com/SalesforceAIResearch/uni2ts/tree/main/src/uni2ts/data/indexer). Could you help add some comments to https://github.com/SalesforceAIResearch/uni2ts/blob/main/src/uni2ts/data/indexer/hf_dataset_indexer.py? Is it mainly about assigning different sampling probabilities to different time series?
The Indexer class helps to get the data from the underlying file format. The Dataset classes are responsible for deciding how to sample the time series but the Indexer class provides the probabilities.
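To make the split of responsibilities concrete, here is a minimal sketch, not the uni2ts implementation. The names ToyIndexer, ToyDataset, uniform_probabilities, and proportional_probabilities are hypothetical, and an in-memory list stands in for the underlying file format. The indexer reads the data and exposes candidate probabilities; the dataset decides which ones to use when sampling.

```python
import random
from typing import List, Sequence


class ToyIndexer:
    """Hypothetical indexer: wraps the storage and exposes per-series probabilities."""

    def __init__(self, series: Sequence[Sequence[float]]):
        # Assumption: a plain list replaces the real on-disk format.
        self._series = series

    def __len__(self) -> int:
        return len(self._series)

    def __getitem__(self, idx: int) -> Sequence[float]:
        return self._series[idx]

    def uniform_probabilities(self) -> List[float]:
        # Every time series is equally likely.
        n = len(self._series)
        return [1.0 / n] * n

    def proportional_probabilities(self) -> List[float]:
        # Longer time series are proportionally more likely.
        lengths = [len(s) for s in self._series]
        total = sum(lengths)
        return [length / total for length in lengths]


class ToyDataset:
    """Hypothetical dataset: chooses the sampling scheme, then draws series."""

    def __init__(self, indexer: ToyIndexer, mode: str = "uniform", seed: int = 0):
        self.indexer = indexer
        if mode == "uniform":
            self.probabilities = indexer.uniform_probabilities()
        elif mode == "proportional":
            self.probabilities = indexer.proportional_probabilities()
        else:
            raise ValueError(f"unknown mode: {mode}")
        self._rng = random.Random(seed)

    def sample(self) -> Sequence[float]:
        # Draw one series index according to the chosen probabilities.
        idx = self._rng.choices(
            range(len(self.indexer)), weights=self.probabilities, k=1
        )[0]
        return self.indexer[idx]
```

With two series of lengths 3 and 1, the uniform scheme gives each a probability of 0.5, while the proportional scheme gives 0.75 and 0.25.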
Hi @gorold, I would like to confirm: the indexer class provides the probabilities for different time series within a dataset, while dataset_weight in https://github.com/SalesforceAIResearch/uni2ts/blob/main/src/uni2ts/data/dataset.py controls the sampling probability of each dataset?
Also, a follow-up question: why do we change __len__ with dataset_weight? See https://github.com/SalesforceAIResearch/uni2ts/blob/adf72061666813456200f3bf083c8389eaca4bfe/src/uni2ts/data/dataset.py#L80
I guess we modify the length of a dataset so that the sampler handles the sampling probabilities directly? Could you please add a few comments or some documentation there?
Yup, that's correct.
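The idea confirmed above can be sketched as follows. This is an illustration under stated assumptions, not the code at the linked line: WeightedToyDataset and sampling_shares are hypothetical names, the reported length is assumed to be the number of series scaled by dataset_weight and rounded up, and out-of-range indices are assumed to wrap around with a modulo. A sampler that draws indices uniformly over the concatenation of several such datasets then picks each dataset in proportion to its inflated length.

```python
import math
from typing import Iterable, List


class WeightedToyDataset:
    """Hypothetical dataset whose reported length is scaled by dataset_weight."""

    def __init__(self, items: Iterable, dataset_weight: float = 1.0):
        self.items = list(items)
        self.dataset_weight = dataset_weight

    @property
    def num_ts(self) -> int:
        # True number of time series stored in this dataset.
        return len(self.items)

    def __len__(self) -> int:
        # Assumption: inflate (or shrink) the apparent size by the weight.
        return math.ceil(self.num_ts * self.dataset_weight)

    def __getitem__(self, idx: int):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Assumption: indices beyond num_ts wrap around to real items.
        return self.items[idx % self.num_ts]


def sampling_shares(datasets: List[WeightedToyDataset]) -> List[float]:
    """Share of draws per dataset under uniform sampling over the concatenation."""
    total = sum(len(d) for d in datasets)
    return [len(d) / total for d in datasets]


a = WeightedToyDataset(range(10), dataset_weight=1.0)
b = WeightedToyDataset(range(10), dataset_weight=3.0)
shares = sampling_shares([a, b])  # a gets 10/40 of the draws, b gets 30/40
```

Both datasets hold 10 series, but the second reports a length of 30, so a uniform sampler over the combined index range draws from it three times as often without needing any explicit per-dataset probabilities.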
Add simple comments to the model/moirai source code. This is a start; we will continue commenting the rest of the source code and submit further pull requests later.