hyf015 / egocentric-gaze-prediction

Code for the paper "Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition"

Pretrained models #12

Closed Tushar-N closed 5 years ago

Tushar-N commented 5 years ago

Would it be possible to provide pretrained models from your experiments which can be used directly for evaluation? A script to use the pretrained models to generate saliency maps for a video would also be very helpful.

Thanks for putting up your code! :)

hyf015 commented 5 years ago

Thank you for your interest. The modules of our model are trained separately, so I think it would be confusing to release every pre-trained model along with its intermediate outputs. I'll first upload a pre-trained model for saliency prediction only and it would be easy to use.

kazucmpt commented 5 years ago

Good! It will help many researchers.

Tushar-N commented 5 years ago

I'll first upload a pre-trained model for saliency prediction only and it would be easy to use.

Great, thank you!

Tushar-N commented 5 years ago

Just wanted to check in on this issue. Has there been any progress?

hyf015 commented 5 years ago

The model is now linked in the README. I haven't prepared a script for using it yet, but that should be simple.

Tushar-N commented 5 years ago

Thanks again! I really appreciate the help. I am having a bit of trouble with inference though. I can load the model just fine, but some details about preprocessing and inputs aren't obvious to me from the scripts. It would be helpful to have a single script that provides a skeleton with an example.

import torch
from SP import SP

# Build the saliency prediction model, then load the released checkpoint.
sp = SP(pretrained_spatial='pretrained_sp.pth.tar', pretrained_temporal='save/pretrained_sp.pth.tar')
sp.model.load_state_dict(torch.load('pretrained_sp.pth.tar')['state_dict'])

This is as far as I got before I realized I don't know what the expected inputs are:

# Is rgb a (B, 3, 224, 224) tensor?
# Is flow a (B, 2, 224, 224) tensor?
output = sp.model(rgb, flow)

Thanks!

hyf015 commented 5 years ago

Flow is a (B, 20, 224, 224) tensor.
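
For illustration only, here is a minimal sketch of how arrays with the shapes discussed above might be assembled. It assumes the 20 flow channels come from 10 consecutive frames with an x and a y component each, interleaved as x, y, x, y, and so on. That layout is a common two-stream convention, but this thread does not confirm it, and the thread does not confirm the RGB shape either. The helper name stack_flow and the zero-filled dummy data are hypothetical.

```python
import numpy as np

def stack_flow(flow_x, flow_y):
    """Interleave per-frame x/y flow fields into one (2*T, H, W) array.

    Assumption (not confirmed in this thread): the channels are ordered
    x_1, y_1, x_2, y_2, ..., x_T, y_T, with T = 10 giving 20 channels.
    """
    if len(flow_x) != len(flow_y):
        raise ValueError("need the same number of x and y flow frames")
    channels = []
    for fx, fy in zip(flow_x, flow_y):
        channels.append(fx)
        channels.append(fy)
    return np.stack(channels, axis=0).astype(np.float32)

# Dummy data standing in for real, already resized frames.
B, T, H, W = 2, 10, 224, 224
rgb = np.zeros((B, 3, H, W), dtype=np.float32)
flow = np.stack([
    stack_flow([np.zeros((H, W))] * T, [np.zeros((H, W))] * T)
    for _ in range(B)
])

print(rgb.shape, flow.shape)
```

With PyTorch installed, the arrays could then be converted with torch.from_numpy(rgb) and torch.from_numpy(flow) and passed to sp.model(rgb, flow) as in the snippet above. Any normalization or scaling the model expects is not stated in this thread and would need to be taken from the training scripts.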
