chenxy99 / AttFDNet

Some files cannot be found #3

Open yangt1013 opened 3 years ago

yangt1013 commented 3 years ago

Mistakes:

- from dataloader import salicon (the file exists, but the import fails)
- from evaluation import cal_cc_score, cal_sim_score, cal_kld_score, cal_auc_score, cal_nss_score, add_center_bias (the file does not exist)
- from unet import standard_unet (does not exist)
- from loss import NSS, CC, KLD, cross_entropy (does not exist)
- from sam import SAM (does not exist)

Running the code raises several "No module named ..." errors. The paths look correct, but where are the sam and loss files defined? I cannot find them in the repository. Please fix these problems. Thanks for sharing your code; I am waiting for your reply.

chenxy99 commented 3 years ago

Hello, thank you for pointing this out.

Actually, I put main.py into this project by mistake; it was written for another project, so I will remove this file.

You can train the base model with train_RFB.py and test it with test_RFB.py, while the novel-stage model is trained with train_RFB_target.py and tested with test_RFB_target.py. For more details, please refer to the "Start training" section of the README.md.

Thanks.

yangt1013 commented 3 years ago

Thanks for your reply! I got it. I am very interested in your model design, but I have some questions about the paper. The saliency model is the source of bottom-up attention, yet the bottom-up attention framework is not explained clearly. How is the bottom-up attention designed, and how is it inserted into the VGG model: into the conv channels or somewhere else, and at which channel width (64/128/256/512)? How is the SAM trained and how is the bottom-up attention obtained? How these are reflected in the code is not explained in the README.

I am waiting for your reply!

chenxy99 commented 3 years ago

We follow the architecture design of the paper Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model, trained on the SALICON dataset. In our object detection model, we pretrain SAM on SALICON, then freeze its parameters and directly use its saliency predictions on the input images for few-shot object detection. In this repository, we use the more time-efficient Boolean Map Saliency (BMS) algorithm, which is not a deep learning approach and needs no training procedure. In our paper, BMS achieves performance similar to SAM.

Also, as mentioned in the paper, the saliency predictions are fed directly into the conv4_3 layer of VGG.
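
To make that concrete, here is a minimal PyTorch sketch of the frozen-SAM idea; the wrapper and its names are illustrative assumptions, not code from this repository:

```python
import torch

def make_frozen_saliency_predictor(sam_model):
    # Freeze the pretrained SAM (trained on SALICON) so it is used purely
    # for inference inside the detection pipeline.
    sam_model.eval()
    for p in sam_model.parameters():
        p.requires_grad = False

    @torch.no_grad()
    def predict(images):             # images: [N, 3, H, W]
        return sam_model(images)     # saliency maps, e.g. [N, 1, H', W']

    return predict
```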

Thanks.

yangt1013 commented 3 years ago

Are both SAM and BMS used? What parameters does SAM pass to BMS? BMS can also generate saliency detection images, so what does SAM pass to it? Are both the channel and spatial attention mechanisms used here? Which file in the code reflects this? I am waiting for your reply!

chenxy99 commented 3 years ago

SAM and BMS serve a similar function: given an original image [3, H, W], both generate a saliency prediction [1, H', W']. So we use only one of SAM or BMS to provide information to conv4_3 of the VGG backbone (see the corresponding code). We provide the bms file code from the Boolean Map Saliency algorithm.
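
In other words, SAM and BMS are interchangeable behind the same input/output contract. A small illustrative sketch (a hypothetical helper, not repository code):

```python
import torch

def get_saliency(image, predictor):
    # image:     [3, H, W] RGB input
    # predictor: either the frozen SAM network or the training-free BMS routine;
    #            both are assumed to expose the same contract here.
    saliency = predictor(image.unsqueeze(0))   # -> [1, 1, H', W']
    assert saliency.shape[1] == 1, "expected a single-channel saliency map"
    return saliency
```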

yangt1013 commented 3 years ago

Thanks! In your code, the /data/dataloder file defines SAM on the SALICON dataset, but the BMS file is not imported into the RFB models. I am really eager to know which one you use, SAM or BMS. Maybe you could explain the details of these files in the README.

chenxy99 commented 3 years ago

Hello, yes, ./data/dataloder.py defines SAM on the SALICON dataset and is actually not used in this repository. I will also remove it so that only the files related to this project remain. We do not import the BMS file into the RFB models; we only use it in ./data/voc0712.py to generate the corresponding training data. Sure, I will add some new information to the README.md file.
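
Roughly, the idea is that the dataset class itself produces the saliency map alongside each image and target. A sketch of that pattern with hypothetical names (VOCWithSaliency, bms_fn), not the actual voc0712.py code:

```python
import cv2
import torch
from torch.utils.data import Dataset

class VOCWithSaliency(Dataset):
    # Hypothetical illustration: the dataset computes the BMS saliency map for
    # each image itself, so the detection model never imports the BMS module.
    def __init__(self, image_paths, targets, bms_fn):
        self.image_paths = image_paths
        self.targets = targets
        self.bms_fn = bms_fn          # callable: HxWx3 image -> HxW saliency map

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.image_paths[idx])                # HWC, uint8
        bms_img = self.bms_fn(img)                             # HxW saliency map
        img_t = torch.from_numpy(img).permute(2, 0, 1).float()
        bms_t = torch.from_numpy(bms_img).unsqueeze(0).float()
        return img_t, self.targets[idx], bms_t
```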

Thanks.

yangt1013 commented 3 years ago

I read the voc0712.py code and have some questions. First, what is the meaning of the first class definition? Second, where is bms_img passed after it is defined? How are img and bms_img fused? In your paper, the bms_image is mixed into conv4_3; how is that reflected in the code? Here I only see the initialization definition.

yangt1013 commented 3 years ago

I am waiting for your reply!

chenxy99 commented 3 years ago

Hello, for the first question, I guess you mean the difference between class VOCDetection and class VOCDetection_fewshot_new. The first is used for the base training stage and the second is used in the novel training stage, which uses a few images from the base and novel categories. Second, taking the base training stage as an example: we obtain the images, targets, and bms_imgs from the dataloader with images, targets, bms_images = next(batch_iterator), then send images and bms_imgs to the model with out = net(images, feed_conv_4_3), and they are merged via x = x * torch.log(feed_conv_4_3 + 2.71828).
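
For reference, a minimal sketch of that merge step; only the final formula is quoted from the answer above, while the resizing step and the function wrapper are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def merge_saliency_into_conv4_3(x, feed_conv_4_3):
    # x:             [N, 512, H, W] conv4_3 feature map of the VGG backbone
    # feed_conv_4_3: [N, 1, H', W'] saliency map coming from the dataloader
    # Resize the saliency map to the conv4_3 spatial size (an assumption here;
    # the repository may align the sizes elsewhere).
    sal = F.interpolate(feed_conv_4_3, size=x.shape[-2:],
                        mode='bilinear', align_corners=False)
    # Quoted merge rule: log(sal + e) stays >= 1 for sal in [0, 1], so salient
    # regions are boosted while non-salient regions are left roughly unchanged.
    return x * torch.log(sal + 2.71828)
```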

Thanks.

yangt1013 commented 3 years ago

Hello, I cannot find the dataloader files. voc0712.py defines img, target, and bms_img, but the file does not show where the iteration part is defined and used after initialization. I cannot see how the outputs are passed to the net or how the network is fed. Please complete the relevant files. The dataloader files are missing! I am waiting for your reply! Thanks

chenxy99 commented 3 years ago

Actually, I made a small mistake in my previous answer. The class VOCDetection and class VOCDetection_fewshot_new are the dataset classes. In the training file, we use

import torch.utils.data as data  # PyTorch DataLoader utilities

# Wrap the dataset in a DataLoader and grab an iterator over mini-batches.
batch_iterator = iter(data.DataLoader(dataset, batch_size,
                                      shuffle=True, num_workers=args.num_workers,
                                      collate_fn=detection_collate))

to create a batch iterator. For the outputs passed to the net and how the network is fed, please refer to Lines 226-268 in CODE.

yangt1013 commented 3 years ago

Hello, I have some questions about your code. The parameter "multiply_embedding = 20" in your RFB_Net_vgg file: why is this parameter defined, and what does it mean? How does "embedding_source = conf.view(-1, self.num_classes * multiply_embedding)" work in your code? And how is "embedding_source_output = embedding_source_norm.view(loc.size(0), -1, self.num_classes * multiply_embedding)" used in the forward output?