andrewowens / multisensory

Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
http://andrewowens.com/multisensory/
Apache License 2.0

Questions about the entrance of the training function #14

Open yxixi opened 5 years ago

yxixi commented 5 years ago

Great job! I tried to train this model myself, but I ran into a few problems and would love to know the solutions:

a. Could you point out a detailed method of calling the training function?
b. How do I feed the "Kinetics-Sounds" dataset into the model for training?
c. I noticed you mentioned rewriting the read_data(pr, gpus) function. What does the variable "pr" stand for?

Looking forward to your reply! Thanks! @andrewowens

andrewowens commented 5 years ago

Sorry for the slow reply!

a) I usually run it like this:

python -c "import sep_params, sourcesep; sourcesep.train(sep_params.full(num_gpus=3), [0, 1, 2], restore = False)"

This uses the "full" parameter set defined in sep_params.py.

b) I kept only these categories from the Kinetics dataset: blowing nose, bowling, chopping wood, ripping paper, shuffling cards, singing, tapping pen, using computer, blowing out candles, dribbling basketball, laughing, mowing lawn, shoveling snow, stomping grapes, tap dancing, tapping guitar, tickling, strumming guitar, playing accordion, playing bagpipes, playing bass guitar, playing clarinet, playing drums, playing guitar, playing harmonica, playing keyboard, playing organ, playing piano, playing saxophone, playing trombone, playing trumpet, playing violin, playing xylophone,

following this paper by Arandjelovic and Zisserman (note that the list of categories in their paper is slightly out of date, since it used a pre-release version of the Kinetics dataset).
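If it helps, here is a minimal sketch of how you might filter a Kinetics-style annotation list down to those categories. The file/column layout here is hypothetical (not the actual Kinetics release format); only the category names come from the list above.

```python
# Sketch: keep only the sound-centric Kinetics categories listed above.
# The (video_id, label) row format is an assumption for illustration.
KEPT = {
    "blowing nose", "bowling", "chopping wood", "ripping paper",
    "shuffling cards", "singing", "tapping pen", "using computer",
    "blowing out candles", "dribbling basketball", "laughing",
    "mowing lawn", "shoveling snow", "stomping grapes", "tap dancing",
    "tapping guitar", "tickling", "strumming guitar", "playing accordion",
    "playing bagpipes", "playing bass guitar", "playing clarinet",
    "playing drums", "playing guitar", "playing harmonica",
    "playing keyboard", "playing organ", "playing piano",
    "playing saxophone", "playing trombone", "playing trumpet",
    "playing violin", "playing xylophone",
}

def filter_annotations(rows):
    """Keep only (video_id, label) pairs whose label is a kept category."""
    return [(vid, label) for vid, label in rows if label in KEPT]

rows = [("abc123", "playing drums"),
        ("def456", "swimming"),
        ("ghi789", "laughing")]
print(filter_annotations(rows))  # [('abc123', 'playing drums'), ('ghi789', 'laughing')]
```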

c) The "pr" variable is the parameter set. You can find examples of these in sep_params.py, such as "full" (the full audio-visual model) and "unet_pit" (sound only).
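To illustrate the pattern (not the repo's actual code): "pr" is just an object bundling all hyperparameters for one training configuration, and constructors like sep_params.full() return a populated instance. The field names below are invented examples, not the real fields in sep_params.py.

```python
from dataclasses import dataclass

# Hypothetical sketch of the "pr" (parameter set) pattern.
# Field names and defaults are illustrative only.
@dataclass
class Params:
    model: str = "full"    # e.g. "full" (audio-visual) or "unet_pit" (sound only)
    num_gpus: int = 1
    batch_size: int = 24   # example value, not taken from the repo

def full(num_gpus=1):
    """Analogue of sep_params.full(): return the full-model parameter set."""
    return Params(model="full", num_gpus=num_gpus)

pr = full(num_gpus=3)
print(pr.model, pr.num_gpus)  # full 3
```

A read_data(pr, gpus) function would then pull whatever it needs (batch size, etc.) from pr, which is why the whole set is passed around as a single argument.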

Hope that helps!