jlabhishek opened this issue 5 years ago
Yes, putting the chunk duration explicitly in the state space should help the training. This part of the evaluation is synthetic; we were just adding multiplicative noise to the chunk size. We didn't find a dataset for the distribution of chunk durations. Let us know if you find one!
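In case it helps, here is a minimal sketch of what exposing the chunk duration to the agent could look like, assuming a Pensieve-style (S_INFO, S_LEN) history matrix; the extra row index, the 10 s normalization constant, and the function names are assumptions for illustration, not the repo's actual layout:

```python
import numpy as np

S_INFO = 7   # assumed: the usual per-chunk state rows plus one extra row for chunk duration
S_LEN = 8    # length of the past-history window kept in the state

rng = np.random.default_rng(42)

def sample_chunk_duration(base_duration=4.0):
    """Synthetic chunk duration: a nominal duration scaled by Gaussian noise ~ N(1, 0.1)."""
    return base_duration * rng.normal(loc=1.0, scale=0.1)

def push_observation(state, features, chunk_duration, max_duration=10.0):
    """Shift the history window left and append the newest observation.

    `features` is assumed to hold the usual per-chunk inputs (throughput,
    download time, buffer level, ...); the last state row is the extra
    chunk-duration feature, normalized by an assumed 10 s maximum.
    """
    state = np.roll(state, -1, axis=1)              # drop the oldest column
    state[:-1, -1] = features                       # regular per-chunk features
    state[-1, -1] = chunk_duration / max_duration   # chunk duration as a state variable
    return state

# Usage: one simulated step.
state = np.zeros((S_INFO, S_LEN))
features = np.zeros(S_INFO - 1)                     # placeholder feature values
state = push_observation(state, features, sample_chunk_duration())
```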
I trained on synthetic data and noticed that there is no single model which outperforms the others for all video chunk durations: one model performs better (mean reward on the test set) for a 4 s chunk duration, while another performs better for a 7 s chunk duration. Can you suggest a way to get a single model, or is this kind of behavior to be expected?
Hi, under Section 5.3 of the paper, in the "Multiple videos" paragraph, it is stated: chunk sizes were computed by multiplying the standard 4-second chunk size with Gaussian noise ∼ N(1, 0.1). Could you kindly share a reference for these standard sizes, or explain how you obtained them? Also, how would you recommend training if chunk durations differ across videos, e.g., varying from 4 seconds to 10 seconds over the whole training set? Do we need to feed the chunk duration as a state variable in that case?
Thank you.
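For reference, the multiplicative-noise step quoted above from Section 5.3 could be sketched roughly as follows; the function name, the 2 MB base size, and the 48-chunk count are placeholders rather than values from the paper or repo:

```python
import numpy as np

def synthetic_chunk_sizes(base_chunk_size, num_chunks, rng=None):
    """Multiply a nominal per-chunk size (the size of a standard 4-second chunk
    at a given bitrate) by Gaussian noise ~ N(1, 0.1), one draw per chunk."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.normal(loc=1.0, scale=0.1, size=num_chunks)
    return base_chunk_size * noise

# Example with placeholder numbers: a 48-chunk video whose nominal
# 4-second chunk size at some bitrate is ~2 MB.
sizes = synthetic_chunk_sizes(base_chunk_size=2_000_000, num_chunks=48)
```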