about TACoS data split - Githubissues

sangminwoo commented 3 years ago

Hi @Sy-Zhang! Thanks for providing a nicely organized codebase.

I am confused about the data split for the TACoS dataset. Your paper says it follows the data split of TALL (Gao et al. 2017), but I found that the two are not the same: the split in TALL is 50:25:25 (proportion), while your code uses 75/27/25 (actual number), which is clearly different. It would be clearer if you could specify which one practitioners should follow.

Many thanks.

Sy-Zhang commented 3 years ago

If you check the annotation files in TALL, there should be 10146, 4589, and 4083 samples in the training, validation, and testing sets, respectively.
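
As a quick check of the arithmetic, here is a minimal Python sketch that turns the counts quoted in this thread into percentages. The helper name `split_proportions` and the dictionary names are hypothetical and not part of the VideoX or TALL code; the numbers are copied from the comments above. The sample counts come out to roughly 54:24:22, which is close to the nominal 50:25:25.

```python
def split_proportions(counts):
    """Return each split's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Sample counts quoted from the TALL annotation files in this thread.
tall_samples = {"train": 10146, "val": 4589, "test": 4083}

# The 75/27/25 "actual number" split quoted in the original question.
code_split = {"train": 75, "val": 27, "test": 25}

if __name__ == "__main__":
    print("samples:", sum(tall_samples.values()), split_proportions(tall_samples))
    print("code split:", sum(code_split.values()), split_proportions(code_split))
```

The two sets of counts are in different units (75/27/25 is far too small to be a number of samples), so their proportions need not match exactly.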

sangminwoo commented 3 years ago

Oh, I see... I got it, thanks!

microsoft / VideoX

about TACoS data split #41