andrewowens / multisensory

Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
http://andrewowens.com/multisensory/
Apache License 2.0
220 stars 60 forks

Question about training #9

Open Lugangz opened 6 years ago

Lugangz commented 6 years ago

It's really amazing work. It seems you didn't share the training code, e.g. for getting CAMs, action recognition, and audio-visual source separation. I don't know how to train the models; could you add the training code?

Askdeep commented 5 years ago

Following this issue!

andrewowens commented 5 years ago

In case it helps, the training code is all there (see the train() function in sourcesep.py and shift_net.py). You'll just have to rewrite the I/O code. This involves rewriting the read_data function to read a batch of data from your dataset. My own I/O code uses TFRecord files, and I've provided it here as well (albeit without documentation). It'd probably be easier to just rewrite it, though.
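For concreteness, here is a rough sketch of what a rewritten read_data might look like using tf.data instead of TFRecords. This is only a sketch: the function signature, the pr attributes, and the tensor shapes are assumptions to adapt to the real code, not the repo's actual interface.

```python
# Hypothetical replacement for read_data using tf.data.
# The signature, pr attributes, and tensor shapes below are guesses --
# adapt them to what sourcesep.py / shift_net.py actually expect.
import tensorflow as tf

def read_data(pr, gpus):
  video_files = tf.constant(['/path/to/clip1.mp4'])  # your own dataset listing

  def load_example(path):
    # Replace with real decoding (e.g. frames via ffmpeg, audio via
    # scipy.io.wavfile); dummy tensors keep the sketch self-contained.
    ims = tf.zeros([63, 224, 224, 3], tf.float32)  # frames x H x W x C (placeholder shape)
    samples = tf.zeros([44144, 2], tf.float32)     # audio samples x channels (placeholder shape)
    return ims, samples

  dset = (tf.data.Dataset.from_tensor_slices(video_files)
          .map(load_example)
          .repeat()
          .batch(pr.batch_size))
  ims, samples = dset.make_one_shot_iterator().get_next()
  return ims, samples
```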

ASHA-KOTERU commented 5 years ago

In the source separation model it seems like you are using *.tf files as input (rec_files_from_path in sep_dset.py). Can you please provide the format for creating those TFRecord files?
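(For reference, the general mechanics of writing TFRecord files look like the sketch below; the feature keys and filename are made up and would have to match whatever the parser behind rec_files_from_path in sep_dset.py actually expects.)

```python
# Generic TFRecord writer. The feature keys ('im_0', 'sound') and the
# output filename are hypothetical -- match them to what sep_dset.py parses.
import tensorflow as tf

def bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

with tf.python_io.TFRecordWriter('train-00000.tf') as writer:
  video_bytes = b'...'  # your encoded frames
  audio_bytes = b'...'  # your raw audio samples
  ex = tf.train.Example(features=tf.train.Features(feature={
      'im_0': bytes_feature(video_bytes),
      'sound': bytes_feature(audio_bytes),
  }))
  writer.write(ex.SerializeToString())
```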

yxixi commented 5 years ago

> In the source separation model it seems like you are using *.tf files as input (rec_files_from_path in sep_dset.py). Can you please provide the format for creating those TFRecord files?

Hi! Do you know how to train the model now? I really need some help, thanks!

yxixi commented 5 years ago

> It's really amazing work. It seems you didn't share the training code, e.g. for getting CAMs, action recognition, and audio-visual source separation. I don't know how to train the models; could you add the training code?

Sorry to bother you. Do you know the right way to train the models successfully now?

reddyanilkumar commented 5 years ago

Can you please provide a link to the dataset you used for training? Could you also provide the steps to retrain on a new dataset?

ASHA-KOTERU commented 5 years ago

> In the source separation model it seems like you are using *.tf files as input (rec_files_from_path in sep_dset.py). Can you please provide the format for creating those TFRecord files?

> Hi! Do you know how to train the model now? I really need some help, thanks!

```python
import sourcesep, sep_params

clip_dur = 2.135
fn = getattr(sep_params, 'full')
pr = fn(vid_dur=clip_dur)
sourcesep.train(pr, 0, False, False, False)
```
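For context: getattr(sep_params, 'full') fetches the full source-separation parameter set, pr = fn(vid_dur=clip_dur) instantiates it for a 2.135 s clip, and the call hands it to train(). The remaining positional arguments (0, False, False, False) presumably select the GPU and restore/debug flags, but check the train() signature in sourcesep.py to be sure.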

yxixi commented 5 years ago

> In the source separation model it seems like you are using *.tf files as input (rec_files_from_path in sep_dset.py). Can you please provide the format for creating those TFRecord files?

> Hi! Do you know how to train the model now? I really need some help, thanks!

```python
import sourcesep, sep_params

clip_dur = 2.135
fn = getattr(sep_params, 'full')
pr = fn(vid_dur=clip_dur)
sourcesep.train(pr, 0, False, False, False)
```

Thank you for your help! I noticed TFRecords were used to train this model. Do you know how to create the TFRecord files? Looking forward to your reply!

xuanhanyu commented 5 years ago

After reading the comments above, I noticed the author said we need to rewrite the I/O code. If I rewrite it, should I read the video and audio data separately and then feed them to the two branch networks?

xuanhanyu commented 5 years ago

When I rewrite the I/O code, are there any details I need to pay attention to?