@Ugness Sorry, the threshold I tested is 0.8. I have not tested your option yet; I need to wait for an available GPU in my lab. I also have a question: can you test the MAE result on all of DUTS-TE without modifying the dataset? The difference in MAE results confuses me.
What do you mean by without modifying the dataset? I am going to upload all the results (MAE, F-measure, threshold) to Google Drive. I will also upload the list of image file names in my DUTS-TE dataset with it.
@Ugness I mean there should be 5019 images in DUTS-TE; without deleting mismatching files, you should test on all 5019 images.
But DUTS-TE-Mask has 2 more images than DUTS-TE-Image. My DUTS-TE-Image folder has 5019 images; I deleted 2 images from DUTS-TE-Mask because there were 5021 images.
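As a side note, here is a minimal sketch (the folder names are my assumption from this thread) to find which mask files have no matching image, instead of guessing which two to delete:

```python
import os

# Assumed layout from this thread: DUTS-TE/DUTS-TE-Image and DUTS-TE/DUTS-TE-Mask
image_dir = 'DUTS-TE/DUTS-TE-Image'
mask_dir = 'DUTS-TE/DUTS-TE-Mask'

# Compare file stems, since extensions differ (DUTS images are .jpg, masks are .png)
image_stems = {os.path.splitext(f)[0] for f in os.listdir(image_dir)}
mask_stems = {os.path.splitext(f)[0] for f in os.listdir(mask_dir)}

print('masks without an image:', sorted(mask_stems - image_stems))
print('images without a mask:', sorted(image_stems - mask_stems))
```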
Hi @Ugness, I integrated your measure.py and train.py, but I did not change network.py. I set batch_size to 2. At the first learning-rate drop my train loss still falls, but after that, although the learning rate keeps falling, the train loss never falls again. Also, when I test my model on PASCAL-S, the best MAE is 0.1243. Could you help me solve this problem?
@RaoHaobo Can you give me some captures of your loss graph? You may find it on Tensorboard. Also, I think it is better to open a new issue. Thanks.
I changed the learning rate to decay every 15000 steps, but the train loss still stops falling, the same as with your 7000. My train loss and learning rate results are as follows. Thanks!
I think that graph looks fine.
But if you think the loss should be lower, I recommend increasing the lr decay rate and lr decay step. For the hyperparameters in my code, I just followed the implementation in the PiCANet paper with the DUTS dataset. As for MAE, it may be related to batch size: when I changed the batch size from 1 to 10 (maybe 4, I do not remember exactly), the performance incrementally increased.
I'll let you know the specific scores when I find the past results. I'll also upload the loss graph of my experiment as well. Thanks.
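For concreteness, a minimal sketch of what those two knobs mean in PyTorch (illustrative values and a stand-in model, not the repo's actual training loop):

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1)  # stand-in for the actual network
optimizer = optim.SGD(model.parameters(), lr=0.1)

# step_size is the "lr decay step" (in optimizer steps); gamma is the "lr decay rate"
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7000, gamma=0.1)

for step in range(21000):
    loss = model(torch.randn(2, 10)).mean()  # dummy forward pass
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # lr is multiplied by gamma every step_size steps
```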
I changed the lr decay rate from 0.1 to 0.5; the lr decay step is 7000. My loss is as follows
Why does the train loss never fall after one epoch? Did you meet this problem?
Nice work and nice code! When I run 'python train.py --dataset ./DUTS-TR', an error occurs (it seems something is wrong with tensorboardX, but I have no idea what to do). Thanks for your reply~
@Dylanqyuan The version of your tensorboardX is too high.
It works! Thank you buddy!
@RaoHaobo https://github.com/Ugness/PiCANet-Implementation/issues/16#issuecomment-479510008 I've uploaded my graph at that link. I also suggest you follow that link's 3 steps to check whether the model is trained or not.
My graph fluctuates just like yours, and it does not look like it is decreasing either. As for your graph, I am concerned about the learning rate: I think it became too small to train the model effectively after 1 epoch. But I have not run an experiment on that, so it's just my personal opinion.
If you want to check your model's performance, I suggest you follow the steps at the link. If you are worried about the non-decreasing training loss, I suggest that you (and I as well) run more experiments with the learning rate and the other hyperparameters. In detail:
p.s. Please comment at #17 if you want to discuss this issue further, to make it easy to find!
@Ugness I tested your '36epo_383000step.ckpt' on PASCAL-S, and my result is [screenshot], but your result is [screenshot]. Why?
Another problem: I added some of my own ideas to your code, and my model trains well: [screenshot]
But when I test my model with your measure_test.py, the test result is [screenshot]
@Ugness The second problem has been solved; the first is not solved yet.
Sorry, I forgot to mention that all of my experiment results are on the DUTS dataset only; I have updated my README file. If you got my numbers from the README.md, note that I trained and tested the model ONLY on the DUTS dataset, so results on the PASCAL-S dataset may differ.
@Ugness OK. Using your trained model to test on PASCAL-S and SOD, the max_F is 0.8379. Could you test your model on other datasets?
@Ugness This is the code in your measure_test.py: [screenshot]. But the code in github.com/AceCoooool/DSS-pytorch solver.py is [screenshot]; I think they differ significantly.
I've made that .sum(dim=-1) because my code evaluates several images in parallel. github.com/AceCoooool/DSS-pytorch solver.py calculates the prec/recall on a single image at a time, while my code calculates all images at once. The whole dimension of y_temp and mask is (batch, threshold_values, H, W). If I executed a plain .sum(), it would sum all values in y_temp, although we should sum only over the H and W axes. As for the 1e-10, I added it to avoid division-by-zero problems. If you think my explanation is wrong, please give me your advice. Thanks.
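A toy sketch of the shape logic described above (tensor names follow the comment; the values are random):

```python
import torch

# Toy shapes: (batch, thresholds, H, W)
y_temp = (torch.rand(4, 256, 8, 8) > 0.5).float()  # binarized predictions per threshold
mask = (torch.rand(4, 1, 8, 8) > 0.5).float()      # ground truth, broadcast over thresholds

# Sum over H and W only -> shape (batch, thresholds)
tp = (y_temp * mask).sum(dim=-1).sum(dim=-1)

# A plain .sum() would collapse everything into a single scalar,
# mixing all images and thresholds together:
assert (y_temp * mask).sum().shape == torch.Size([])

prec = (tp + 1e-10) / (y_temp.sum(dim=-1).sum(dim=-1) + 1e-10)  # eps avoids 0/0
recall = (tp + 1e-10) / (mask.sum(dim=-1).sum(dim=-1) + 1e-10)
```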
@Ugness I mean the 1e-10 in tp + 1e-10 could perhaps be removed; I tried removing it, but max_F dropped a lot. I also used the DSS code to test your model on DUTS-TE, and the result is bad.
How much difference does that error cause? Is the difference significant? Let me know. Thanks.
When the threshold equals 1, no pixel is predicted positive, so the precision should be 0, but your result equals 1.
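A quick numeric check of this edge case, using the epsilon formula from the code above:

```python
# At threshold 1, nothing is predicted positive: tp = 0 and the prediction sum is 0.
tp = 0.0
pred_sum = 0.0
prec = (tp + 1e-10) / (pred_sum + 1e-10)
print(prec)  # 1.0: the epsilons cancel out, while the true precision is 0/0 (undefined)
```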
@Ugness The writer.add_pr_curve() function in measure.py doesn't work; it never shows up in Tensorboard. I think it is caused by the tensorboard version.
https://github.com/tensorflow/tensorboard/releases Would you try it with tensorboard 1.8.0?
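If the version is the problem, pinning it with pip should do it (assuming a pip-managed environment):

```
pip install tensorboard==1.8.0
```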
Hi @Ugness, I hit a RAM memory leak when running network.py and train.py; this issue has confused me for a few days. I have run other PyTorch repos, which are OK. I run the code on Ubuntu 14.04, PyTorch 0.4.1, CUDA 8.0, cudnn 6.0.
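Not necessarily the cause here, but a classic source of steadily growing host RAM in PyTorch 0.4.x training loops is accumulating the loss tensor itself (which keeps each step's graph alive) rather than its Python value. A minimal sketch with stand-in modules:

```python
import torch
from torch import nn, optim

model = nn.Linear(8, 1)  # stand-in network
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(100):
    x, y = torch.randn(4, 8), torch.randn(4, 1)  # stand-in batch
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # running_loss += loss      # holds the graph of every step -> RAM keeps growing
    running_loss += loss.item() # a plain float; the graph can be freed
```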