What is the meaning of split

Felix1014 commented 3 years ago

Dear authors,

What does "split" mean? There are many .bundle files in /data/split. Could you explain what they are?

Thank you!

yabufarha commented 3 years ago

Hi @Felix1014 ,

Split refers to how the videos in a dataset are divided into training and testing sets. For a relatively small dataset, more than one split is usually used to evaluate the model, and the final result is the average of the performance over all splits.
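To make the idea concrete, here is a minimal sketch of evaluating over several splits and averaging the results. It is not code from this repository: make_splits and average_over_splits are hypothetical helper names, the video names are placeholders, and the fold assignment is just one simple way to partition a list. Published datasets usually ship fixed split lists, so this only illustrates the concept.

```python
from statistics import mean

def make_splits(videos, k):
    """Partition videos into k folds; fold i serves as the test set of split i+1."""
    folds = [videos[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Every fold except the current one goes into the training set.
        train = [v for j, fold in enumerate(folds) if j != i for v in fold]
        splits.append((train, test))
    return splits

def average_over_splits(per_split_scores):
    """Final reported result: the mean of the per-split scores."""
    return mean(per_split_scores)

# Placeholder video names, for illustration only.
videos = ["v1", "v2", "v3", "v4", "v5", "v6"]
for number, (train, test) in enumerate(make_splits(videos, 3), start=1):
    print(f"split {number}: train={train} test={test}")
```

Each video appears in the test set of exactly one split, so the averaged score covers every video once.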

I hope this helps.

Felix1014 commented 3 years ago

Thank you for your reply, but I am still confused. Specifically, when I run main.py --action=train --dataset=DS --split=SP with SP=1 or 2, how is the dataset divided into the training set and test set, and what do these .bundle files mean? What is the relationship between SP and the .bundle files in data/50salads/splits? I really do not understand this.

Besides, how can I generate these .bundle files if I want to use another dataset?

Finally, if a dataset has standard training and test sets and does not need cross-validation, how should I revise the code?

yabufarha commented 3 years ago

The .bundle files contain the list of examples for each split. We do not generate those files; they are the standard training and testing sets for the datasets used. If you want to try the code on split 1 of 50salads, for example, you need to run python main.py --action=train --dataset=50salads --split=1. The code uses these parameters to access the corresponding training and testing examples as listed in the .bundle files.
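As a rough illustration of how the dataset and split arguments could map to the list files, here is a minimal sketch. It assumes the lists are named train.split{N}.bundle and test.split{N}.bundle inside data/<dataset>/splits and that each file holds one example name per line. Please check these assumptions against your checkout. split_list_paths and parse_bundle are hypothetical helpers, not functions from main.py.

```python
from pathlib import Path

def split_list_paths(dataset, split, root="./data"):
    """Build the (assumed) paths of the train and test lists for one split."""
    splits_dir = Path(root) / dataset / "splits"
    return (
        splits_dir / f"train.split{split}.bundle",
        splits_dir / f"test.split{split}.bundle",
    )

def parse_bundle(text):
    """Return the example names in a list file, assuming one name per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]

train_path, test_path = split_list_paths("50salads", 1)
print(train_path.as_posix())
print(test_path.as_posix())

# Reading a list only works if the file exists at the assumed location.
if train_path.exists():
    print(parse_bundle(train_path.read_text())[:5])
```

Under these assumptions, changing --split only changes which pair of list files is read; the videos themselves are not re-divided at run time.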

If you want to use another dataset, you only need to pass the corresponding lists of training and testing examples (and maybe update the feature dimension if you are using different features).
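For a new dataset with one fixed train/test division, the lists could be written out as a single split, roughly as sketched below. This assumes the same train.split{N}.bundle and test.split{N}.bundle naming and the one-name-per-line layout, which you should verify against the existing files. bundle_text and write_split_lists are hypothetical helpers, and the directory and example names are placeholders.

```python
from pathlib import Path

def bundle_text(names):
    """Format example names one per line, each followed by a newline."""
    return "".join(f"{name}\n" for name in names)

def write_split_lists(train_names, test_names, splits_dir, split=1):
    """Write the train and test lists for one split and return both paths."""
    splits_dir = Path(splits_dir)
    splits_dir.mkdir(parents=True, exist_ok=True)
    train_path = splits_dir / f"train.split{split}.bundle"
    test_path = splits_dir / f"test.split{split}.bundle"
    train_path.write_text(bundle_text(train_names))
    test_path.write_text(bundle_text(test_names))
    return train_path, test_path

if __name__ == "__main__":
    # Placeholder dataset directory and example names, for illustration only.
    write_split_lists(
        ["video_a.txt", "video_b.txt"],
        ["video_c.txt"],
        "./data/my_dataset/splits",
    )
```

With the lists in place, running with --split=1 would be enough for a dataset that has only one standard division, so no cross-validation is involved.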

Felix1014 commented 3 years ago

Got it. Thank you so much, @yabufarha.

yabufarha / ms-tcn

What is the meaning of split #32