lfz / DSB2017

The solution of team 'grt123' in DSB2017
MIT License

What score should we expect on the leaderboard? #21

Open mileyan opened 7 years ago

mileyan commented 7 years ago

I got the results and submitted them to the Kaggle leaderboard, but the private score is 0.43, much larger than 0.39.

hflyzju commented 7 years ago

Have you finished all the steps in run_training.sh?

m-wei commented 7 years ago

I want to know whether your test results look like mine, for example:

id cancer
00cba091fa4ad62cc3200a657aeb957e 0.337091803551

and whether a file such as 00cba091fa4ad62cc3200a657aeb957e_pbb.npy looks like this:

np.array([[[6.22880459e-03, 8.09559159e+01, 1.34717508e+01, 2.13114373e+02, 9.21659704e+00], [5.31812131e-01, 1.34316428e+02, 5.45498898e+01, 4.98600769e+01, 9.00401013e+00]]])

Can you give me some help? Thank you very much! @mileyan @hflyzju @lfz

hflyzju commented 7 years ago

Same as you! I am now trying to get the stage 2 result and submit it to the Kaggle leaderboard to check the score. @m-wei

m-wei commented 7 years ago

Do you know what the result means? I think the detector's result should be a coordinate like (x, y, z), so I want to know what np.array([[[6.22880459e-03, 8.09559159e+01, 1.34717508e+01, 2.13114373e+02, 9.21659704e+00]]]) means, or how to get the coordinate from it.

For the classifier's result, do you know how to get the top 5 scores rather than only the final cancer score? In other words, can you help me understand how a final score such as 0.371 is computed? I think the final score on its own is not useful. Thank you very much! If you know the answer, can you give me your WeChat (weixin) or QQ? @hflyzju @mileyan
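For readers with the same question about the `_pbb.npy` rows, here is a minimal sketch of how one might unpack them. It rests on an assumption the thread does not confirm: that each row is `[confidence, z, y, x, diameter]`, with the confidence stored as a raw (pre-sigmoid) value and the coordinates expressed in voxels of the preprocessed volume. The function name `describe_candidates` is hypothetical.

```python
import math


def describe_candidates(pbb_rows, top_k=5):
    """Rank candidate rows by confidence and label each column.

    ASSUMPTION (not confirmed in this thread): every row is
    [confidence, z, y, x, diameter], and confidence is a raw logit.
    """
    ranked = sorted(pbb_rows, key=lambda row: row[0], reverse=True)
    described = []
    for conf, z, y, x, diameter in ranked[:top_k]:
        described.append({
            "confidence_raw": conf,
            # Sigmoid of the raw value; only meaningful if it is a logit.
            "probability": 1.0 / (1.0 + math.exp(-conf)),
            "center_zyx": (z, y, x),
            "diameter": diameter,
        })
    return described


# The two rows quoted in the comment above.
rows = [
    [6.22880459e-03, 8.09559159e+01, 1.34717508e+01, 2.13114373e+02, 9.21659704e+00],
    [5.31812131e-01, 1.34316428e+02, 5.45498898e+01, 4.98600769e+01, 9.00401013e+00],
]
candidates = describe_candidates(rows)
for candidate in candidates:
    print(candidate)
```

With a real file, the rows would come from something like `numpy.load(...)` reshaped to five columns; whether the coordinates need to be mapped back to the original scan is a separate step not covered here.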

lfz commented 7 years ago

Hi, we have noticed this problem too. We are really sorry that the code is not finalized yet.

The reason is that the score on the leaderboard is the outcome of a long series of manual tuning. We tried numerous (>10) hyperparameter settings sequentially, each starting from the epoch with the lowest validation loss in the previous session. So, in fact, the effective number of epochs is over 1000. The configurations here (net_classifier_3 and net_classifier_4) are the initial and final hyperparameter settings. We thought they might suffice to reproduce the result, but clearly they do not.

We tried using net_classifier_4 directly with gradient clipping enabled (line 63 in /training/classifier/trainval_classifier.py and line 49 in training/classifier/trainval_detector.py), and got a result of 0.41. Of course, this is not satisfactory either.

We are now actively solving this problem and will release a new version later.
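As a rough illustration of what gradient clipping does, here is a minimal sketch of clipping by global L2 norm in plain Python. This is an assumption about the general technique, not a copy of the lines referenced in the repository, which may clip differently (for example by value, or through PyTorch's own utilities). The name `clip_by_global_norm` is hypothetical.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the limit (this also covers an all-zero gradient).
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


# A gradient of norm 5 clipped to norm 1 keeps its direction.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
print(clipped, norm_before)
```

Clipping only limits the step size on unusually large gradients, so it tends to stabilize training rather than change what the model converges to.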

shu-hai commented 7 years ago

@lfz I just ran the code on the stage 2 test data with the same setup as yours in config_submit.py (except for the number of GPUs and n_workers), without training or tuning the network. The score I got from the Kaggle leaderboard is 0.62144. Do I need to train your network on the training data first?

@m-wei @hflyzju @mileyan How about you?

@lfz Would it be possible for you to put the trained model with the tuned parameters on GitHub?

hflyzju commented 7 years ago

@shu-hai Without training the network, the test model's score is 0.62144.

lfz commented 7 years ago

@shu-hai @hflyzju

That's strange, because the test model has been confirmed by us and several other people. Could you provide more info, such as the framework you use and some intermediate values like the detection results?

lfz commented 7 years ago

@shu-hai @hflyzju @m-wei @mileyan

Oh, I see. 0.62144 is the correct output of this model. What you see is the public leaderboard score, which is highly noisy. You should check the score on the private leaderboard. For example:

image

If you cannot see this, you may need to download the stage 2 solution file and calculate the cross-entropy yourself:

https://kaggle2.blob.core.windows.net/forum-message-attachments/181470/6466/stage2_solution.csv
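Computing the cross-entropy by hand can be sketched as follows, using only the standard library. The assumptions here are that both CSV files have an `id` column and a `cancer` column (the submission format quoted earlier in the thread suggests this, but the solution file's layout is not shown), and that the score is the mean binary log loss. The helper names and the file names in the usage note are hypothetical.

```python
import csv
import math


def read_id_to_value(path, value_column="cancer"):
    """Read a CSV with an 'id' column into {id: float(value_column)}.

    ASSUMPTION: the file has header columns named 'id' and 'cancer'.
    """
    with open(path, newline="") as f:
        return {row["id"]: float(row[value_column]) for row in csv.DictReader(f)}


def log_loss(labels, predictions, eps=1e-15):
    """Mean binary cross-entropy over every id present in labels."""
    total = 0.0
    for case_id, y in labels.items():
        # Clip predictions away from 0 and 1 so log() stays finite.
        p = min(max(predictions[case_id], eps), 1.0 - eps)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(labels)


# Tiny in-memory example instead of real files.
example_labels = {"a": 1.0, "b": 0.0}
example_predictions = {"a": 0.8, "b": 0.1}
example_loss = log_loss(example_labels, example_predictions)
print(example_loss)
```

With real files this would be called as `log_loss(read_id_to_value("stage2_solution.csv"), read_id_to_value("submission.csv"))`. If the solution file marks rows as public or private, filtering to the private rows first would be needed to match the private leaderboard.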

lfz commented 7 years ago

@m-wei to get the probability of every nodule, refer to the variable "out" in Casenet
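For the related question of how per-nodule probabilities could turn into one case-level score, here is a minimal sketch of a leaky noisy-OR combination. This is an assumption about the model's final step, not something stated in this thread: the case is predicted positive unless every nodule, and a baseline "leak" term, are all negative. The function name and the `baseline_prob` parameter are hypothetical.

```python
def leaky_noisy_or(nodule_probs, baseline_prob=0.0):
    """Combine per-nodule probabilities into a single case probability.

    ASSUMPTION: P(case) = 1 - (1 - baseline) * prod(1 - p_i).
    """
    no_cancer = 1.0 - baseline_prob
    for p in nodule_probs:
        no_cancer *= 1.0 - p
    return 1.0 - no_cancer


# Two nodules at 0.5 each, no baseline term.
case_probability = leaky_noisy_or([0.5, 0.5], baseline_prob=0.0)
print(case_probability)
```

Under this form, one high-probability nodule is enough to push the case score up, while several low-probability nodules add up only slowly.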

shu-hai commented 7 years ago

@lfz finally got a score = 0.40164555994871559

hflyzju commented 7 years ago

I can get a score of 0.38992. The detector model was retrained by me, while the classifier model was not.

lfz commented 7 years ago

That is cool. Could you share the training script?


-- Liao Fangzhou, School of Medicine, Tsinghua University, Beijing 100084, China

hflyzju commented 7 years ago

I will first check whether anything is wrong; if not, I will share it.

FernandoTN commented 7 years ago

@hflyzju Do you mind sharing the script? It would be greatly appreciated.

llj098 commented 6 years ago

With the same config/code/training set, I get 0.39814.

lfz commented 6 years ago

@llj098 Thank you for the confirmation~

Luan-zb commented 3 years ago

Quoting @llj098: "with the same config/code/trainingset, I can get a 0.39814"

Hello, did you run the scripts directly after finding the public dataset online and placing it in the corresponding directory?