xmed-lab / CSS-SemiVideo

IEEE TMI 2022: Cyclical Self-Supervision for Semi-Supervised Ejection Fraction Prediction from Echocardiogram Videos

Reproducibility of results #2


JACOBWHY commented 6 months ago
Hi, I downloaded the checkpoints and put them in the output directory, then used the commands to train the three models, but I can't get the same results as in the table. Is there anything else that needs to be changed?

Your results:

| Experiments | MAE | RMSE | R^2 |
| --- | --- | --- | --- |
| Multi-Modal | 5.13±0.05 | 6.90±0.07 | 67.6%±0.5 |
| Teacher-student Distillation | 4.90±0.04 | 6.57±0.06 | 71.1%±0.4 |

My results:

| Experiments | MAE | RMSE | R^2 |
| --- | --- | --- | --- |
| Multi-Modal | 5.07 | 6.90 | 68.1% |
| Teacher-student Distillation | 4.99 | 6.75 | 69.6% |

My steps:

  1. Download the weights from the link, create a folder named "output" with subfolders "css_seg", "end2end_model", and "teacher_model", and put the weights in the corresponding subfolders.
  2. Run the following commands in the terminal, one by one.
echonet seg_cycle --batch_size=20 --output=output/css_seg --loss_cyc_w=0.01 --num_epochs=25 --rd_label=920 --rd_unlabel=6440 --run_test --reduced_set 

echonet seg_cycle --batch_size=20 --output=output/css_seg --loss_cyc_w=0.01 --num_epochs=25 --rd_label=920 --rd_unlabel=6440 --skip_test --reduced_set --run_inference=train

echonet seg_cycle --batch_size=20 --output=output/css_seg --loss_cyc_w=0.01 --num_epochs=25 --rd_label=920 --rd_unlabel=6440 --skip_test --reduced_set --run_inference=val

echonet seg_cycle --batch_size=20 --output=output/css_seg --loss_cyc_w=0.01 --num_epochs=25 --rd_label=920 --rd_unlabel=6440 --skip_test --reduced_set --run_inference=test

mkdir ../infer_buffers/css_seg
mv output/css_seg/*_infer_cmpct ../infer_buffers/css_seg/

echonet video_segin --frames=32 --model_name=r2plus1d_18 --period=2 --batch_size=20 --output=output/teacher_model --num_epochs=25 --rd_label=920 --rd_unlabel=6440 --run_test --segsource=css_seg

echonet vidsegin_teachstd_kd --frames=32 --model_name=r2plus1d_18 --period=2 --batch_size=20 --output=output/end2end_model --num_epochs=25 --rd_label=920 --rd_unlabel=6440 --run_test --reduced_set --max_block=20 --segsource=css_seg --w_unlb=5 --batch_size_unlb=10 --weights_0=output/teacher_model/best.pt 
Furthermore, when I train the models from scratch using the commands given, I get the results below. Is there any way to get better results other than loading the weights that you've provided?

My results from scratch:

| Experiments | MAE | RMSE | R^2 |
| --- | --- | --- | --- |
| Multi-Modal | 5.31 | 7.26 | 64.8% |
| Teacher-student Distillation | 5.59 | 7.55 | 61.9% |

Since I only have a 24 GB 3090 GPU, when running vidsegin_teachstd_kd from scratch I set --batch_size=8 --batch_size_unlb=4.

Besides, how is the std calculated? Is it in log.csv?

ackbar03 commented 6 months ago

Hi, thanks for your interest in our work!

> Hi, I downloaded the checkpoints and put them in the output directory, then used the commands to train the three models, but I can't get the same results as in the table. Is there anything else that needs to be changed?

I think the results are mostly in line with what I have. The reported results are the average over 5 runs of the training process, and the standard deviation (±) is calculated from those same 5 runs. The standard deviation can unfortunately be quite large; if you run the training process a few more times and take the average, it should be closer to the numbers I saw.
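If it helps, here is a minimal sketch of how the repeated runs could be scripted. It assumes that only the --output directory needs to change between runs (the other flags are copied from the commands above) and that the spread comes from run-to-run randomness in initialization and data ordering:

```python
# Hypothetical sketch: repeat the end-to-end training into separate output
# directories so the mean and std can be computed over the runs afterwards.
import subprocess

base_cmd = [
    "echonet", "vidsegin_teachstd_kd", "--frames=32", "--model_name=r2plus1d_18",
    "--period=2", "--batch_size=20", "--num_epochs=25", "--rd_label=920",
    "--rd_unlabel=6440", "--run_test", "--reduced_set", "--max_block=20",
    "--segsource=css_seg", "--w_unlb=5", "--batch_size_unlb=10",
    "--weights_0=output/teacher_model/best.pt",
]

for run in range(5):
    # Only the output directory changes between runs.
    subprocess.run(base_cmd + [f"--output=output/end2end_model_run{run}"], check=True)
```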

> Since I only have a 24 GB 3090 GPU, when running vidsegin_teachstd_kd from scratch I set --batch_size=8 --batch_size_unlb=4.

For this task the batch size makes a big difference. You can verify this by training a standard regression model on the video dataset with different batch sizes.
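If GPU memory is the constraint, one common workaround is gradient accumulation, which approximates a larger effective batch size. Below is a generic PyTorch sketch, not code from this repo, with placeholder model and data; note that accumulation matches the large-batch gradient of the loss but not batch-norm statistics, so it may not fully recover the large-batch numbers:

```python
# Generic gradient-accumulation sketch with toy stand-ins for the model and
# loader; in practice these would be the EF regression model and video loader.
import torch
from torch import nn

model = nn.Linear(10, 1)                        # placeholder model
loader = [(torch.randn(5, 10), torch.randn(5))  # placeholder micro-batches of 5
          for _ in range(8)]
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
accum_steps = 4                                 # 5 * 4 ~ effective batch of 20

optimizer.zero_grad()
for step, (clips, targets) in enumerate(loader):
    loss = nn.functional.mse_loss(model(clips).squeeze(1), targets)
    (loss / accum_steps).backward()  # scale so gradients average across steps
    if (step + 1) % accum_steps == 0:
        optimizer.step()             # update once per effective batch
        optimizer.zero_grad()
```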

> Besides, how is the std calculated? Is it in log.csv?

The std is calculated by running the training process 5 separate times. Calculating it from larger sample sizes would have been better, but it was too computationally expensive to do so.
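For example, once the test MAE from the 5 runs is collected, the reported numbers follow directly (illustrative values below, not the actual run results; this assumes the sample standard deviation is what is reported):

```python
import statistics

# Illustrative MAE values from 5 hypothetical runs, not the real ones.
maes = [4.86, 4.93, 4.88, 4.95, 4.90]
mean = statistics.mean(maes)   # 4.90
std = statistics.stdev(maes)   # sample std over the 5 runs, ~0.04
print(f"MAE: {mean:.2f} +/- {std:.2f}")
```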

Hope this helps

JACOBWHY commented 6 months ago

Thank you very much for your reply! I'll try that. I also wonder whether there is any relationship between the settings of batch_size and batch_size_unlb, for example, that batch_size should be twice batch_size_unlb?

ackbar03 commented 6 months ago

Hi,

I did not specifically investigate the relationship between the labeled and unlabeled batch sizes; if I recall correctly, I set the ratio that way mostly due to GPU memory constraints :(