ndrplz / dreyeve

[TPAMI 2018] Predicting the Driver’s Focus of Attention: the DR(eye)VE Project. A deep neural network learnt to reproduce the human driver focus of attention (FoA) in a variety of real-world driving scenarios.
https://arxiv.org/pdf/1705.03854.pdf
MIT License

can't find computer_vision_utils file #5

Closed NingMingHao closed 4 years ago

NingMingHao commented 5 years ago

I'm trying to reproduce your experiment, but I can't find the computer_vision_utils file, so I defined my own read_image function as follows:

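import cv2
import numpy as np

def read_image(file_path, resize_dim, channels_first=True, color=True):
    # cv2.imread returns None (it does not raise) when the file cannot be read.
    flag = cv2.IMREAD_COLOR if color else cv2.IMREAD_GRAYSCALE
    raw_image = cv2.imread(file_path, flag)
    if raw_image is None:
        raise IOError('Could not read image: {}'.format(file_path))
    # Note: OpenCV loads color images in BGR order, and cv2.resize
    # expects the target size as (width, height).
    resized_image = cv2.resize(raw_image.astype(np.float32), resize_dim)
    # Grayscale images are 2-D, so only transpose when a channel axis exists.
    if channels_first and resized_image.ndim == 3:
        return resized_image.transpose(2, 0, 1)
    return resized_image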

But the prediction performance is really bad, so I'm wondering whether I have missed some important operation. By the way, is 'dreyeve_mean_frame.png' generated as the mean of the first 37 runs? The semseg_branch also performs badly, so I think there must be something wrong with my code. I attach my code here; please help me. Here is my models.py: models.py.txt

ndrplz commented 5 years ago

Hi @NingMingHao ,

I can't find the computer_vision_utils file, so I self-defined the read_image

read_image refers to this implementation. We should probably add this kind of utility function to the dreyeve repo; thanks for pointing it out.

is the 'dreyeve_mean_frame.png' generated as the mean of the first 37 runs?

It is indeed.

semseg_branch also performs badly

I don't understand if the problem lies in the semantic segmentation network or in the dreyeve branch that predicts the gaze from the segmentation. Anyway, two suggestions:

Best, A

NingMingHao commented 5 years ago

Thank you very much for your quick reply. I will try your suggestion to remove the segmentation branch. The reason I uploaded my models.py is that I'm using TensorFlow as the Keras backend. Unfortunately, TensorFlow doesn't support resizing a tensor by a scale ratio, so you will find that I have commented out some of your code and used:

coarse_h = Lambda(lambda x: tf.transpose(tf.image.resize_bilinear(tf.transpose(x,perm=[0,2,3,1]),[_w*4,_w*4],name='{}_4x_upsampling'.format(branch)),perm=[0,3,1,2]))(coarse_h)

Maybe there is something wrong here; I will check it. Thanks again!

NingMingHao commented 5 years ago

I have tested this code:

coarse_h = Lambda(lambda x: tf.transpose(tf.image.resize_bilinear(tf.transpose(x,perm=[0,2,3,1]),[_w*4,_w*4],name='{}_4x_upsampling'.format(branch)),perm=[0,3,1,2]))(coarse_h)

I'm sure it works properly:

  1. 8x8 resize in la_in

  2. 8x8 resize out la_out

  3. 4x4 resize in c_out

  4. 4x4 resize out c_out_la
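The layout handling in the Lambda above (channels-first to channels-last, resize, then back) can be sanity-checked independently of TensorFlow. The following is a minimal sketch, not the thread's code: `upsample_nhwc` and `upsample_nchw_via_nhwc` are hypothetical helpers, and nearest-neighbour repetition stands in for `tf.image.resize_bilinear`, so it only checks that the two transposes preserve the axis order, not the interpolation itself. A non-square input is used on purpose so that swapped height and width would be detected.

```python
import numpy as np


def upsample_nhwc(x, factor):
    """Nearest-neighbour upsampling of a channels-last batch (N, H, W, C)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def upsample_nchw_via_nhwc(x, factor):
    """Upsample a channels-first batch by round-tripping through channels-last."""
    x_last = np.transpose(x, (0, 2, 3, 1))   # NCHW -> NHWC
    up = upsample_nhwc(x_last, factor)       # resize on the spatial axes
    return np.transpose(up, (0, 3, 1, 2))    # NHWC -> NCHW


# Synthetic batch: N=2, C=3, H=4, W=5 (deliberately non-square).
x = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)
y = upsample_nchw_via_nhwc(x, 4)

# Reference: the same upsampling applied directly in channels-first layout.
reference = x.repeat(4, axis=2).repeat(4, axis=3)
```

Note that the snippet above passes `[_w*4, _w*4]` as the target size, which corresponds to a 4x upsampling only when the feature map is square.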

Finally, I had a look at your model weights and found that all the convolutional bias weights of SaliencyBranch are 0. Could this be the reason? (Attached: screenshot from 2019-03-04 14-48-00)

This is the output that im_net gives: (attached image 000019)
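An observation like the all-zero biases can also be checked programmatically rather than by eye. This is a minimal sketch under stated assumptions: `zero_bias_report` is a hypothetical helper, the weights are assumed to be already available as a dict of NumPy arrays (with Keras they could be collected from `layer.get_weights()`), and treating every 1-D array as a bias is a simplification, since other per-channel parameters are 1-D as well. It only reports the zeros; it says nothing about whether they explain the poor predictions.

```python
import numpy as np


def zero_bias_report(named_weights):
    """Map each 1-D weight array's name to True if it is entirely zero."""
    return {name: bool(np.all(w == 0))
            for name, w in named_weights.items()
            if np.ndim(w) == 1}


# Synthetic stand-in for real weights (names are illustrative only).
weights = {
    'conv1/kernel': np.ones((3, 3, 3, 8)),   # 4-D, ignored by the report
    'conv1/bias': np.zeros(8),               # all zeros
    'conv2/bias': np.array([0.0, 0.1]),      # not all zeros
}
report = zero_bias_report(weights)
```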

varunjammula commented 4 years ago

@ndrplz Hi, I am trying to run the predict_dreyeve_sequence.py file. It requires the dreyeve_mean_frame.png file. How can I generate this file?

ndrplz commented 4 years ago

Hi @varunjammula, you can generate it as the average of all frames of all sequences of the training set.
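The averaging described above can be sketched as a running sum, so that the frames never need to be held in memory at once. This is a minimal sketch, not the script the authors used: `mean_frame` is a hypothetical helper, and it assumes the frames are supplied as equally sized arrays (for instance read with `cv2.imread` and resized to whatever resolution the prediction script expects, which is not stated in this thread).

```python
import numpy as np


def mean_frame(frames):
    """Average an iterable of equally sized frames using a running sum."""
    total = None
    count = 0
    for frame in frames:
        # Accumulate in float64 to avoid uint8 overflow.
        frame = np.asarray(frame, dtype=np.float64)
        if total is None:
            total = np.zeros_like(frame)
        elif frame.shape != total.shape:
            raise ValueError('frame shape {} differs from {}'.format(
                frame.shape, total.shape))
        total += frame
        count += 1
    if count == 0:
        raise ValueError('no frames given')
    return total / count


# Tiny synthetic example: two 2x2 single-channel "frames".
frames = [np.full((2, 2), 10, dtype=np.uint8),
          np.full((2, 2), 20, dtype=np.uint8)]
mean = mean_frame(frames)

# Round and convert back to 8-bit so the result can be written as a PNG.
mean_uint8 = np.clip(np.rint(mean), 0, 255).astype(np.uint8)
```

In practice `frames` would be a generator that yields every frame of every training sequence, and the final array could be saved with `cv2.imwrite`.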

varunjammula commented 4 years ago

Thanks for clarifying the issue above. I have a new issue now: where can I get the dreyevenet_model_central_crop.h5 model?