Open SISTMrL opened 2 years ago
Hi SISTMrL,
The second one is the dense pre-trained version of ASSIGN, used as a starting point for the sparse version (the first one). The difference in the model name string is the 'pt-[True/False]' field; I manually appended '-z' to the final model's name so that it would not be overwritten by other training configurations.
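For anyone comparing the two names by eye, a minimal sketch that splits both strings on underscores and lists the tokens unique to each may help. It assumes the underscore-separated fields can be treated as independent tokens; `diff_tokens` is a hypothetical helper written for this illustration, not something from the repository. Besides the 'pt' field, it shows that a few other fields also differ between the two released names.

```python
# Model names copied from this thread: dense pre-trained model vs. final sparse model.
dense = (
    "hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_"
    "m-v2-v1-att-v3-False-True_sd-0.5-False_os-ind_dn-1-gs_pf-e00_c0_sp-1_"
    "ihs-False_ios-False_bl-False-1.0-1.0_sl-False-False-0.0-1.0_mt-False_"
    "pt-False_gc0.0_ds3_Subject1"
)
sparse = (
    "hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_"
    "m-v2-v1-att-v3-False-True_sd-0.1-True_os-ind_dn-1-gs_pf-e0s0_c0_sp-0_"
    "ihs-False_ios-False_bl-False-1.0-1.0_sl-True-False-4.0-1.0_fl0-0.0_"
    "mt-False_pt-True-z_gc0.0_ds3_Subject1"
)


def diff_tokens(a, b):
    """Return the '_'-separated tokens that appear in only one of the two names.

    Hypothetical helper for illustration; order of first appearance is kept.
    """
    a_tokens, b_tokens = a.split("_"), b.split("_")
    only_a = [t for t in a_tokens if t not in b_tokens]
    only_b = [t for t in b_tokens if t not in a_tokens]
    return only_a, only_b


only_dense, only_sparse = diff_tokens(dense, sparse)
print("only in dense: ", only_dense)
print("only in sparse:", only_sparse)
```

The 'pt-False' and 'pt-True-z' tokens show up in the respective lists, which is the quickest way to tell the dense and sparse checkpoints apart.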
Hello, that is to say, I should first reproduce
hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.5-False_os-ind_dn-1-gs_pf-e00_c0_sp-1_ihs-False_ios-False_bl-False-1.0-1.0_sl-False-False-0.0-1.0_mt-False_pt-False_gc0.0_ds3_Subject1
and then, on the basis of it, I should reproduce
hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.1-True_os-ind_dn-1-gs_pf-e0s0_c0_sp-0_ihs-False_ios-False_bl-False-1.0-1.0_sl-True-False-4.0-1.0_fl0-0.0_mt-False_pt-True-z_gc0.0_ds3_Subject1
Is there anything wrong with my understanding?
Hi SISTMrL,
The final ASSIGN model depends on a pre-trained dense version of it, so if you want to reproduce it from the ground up (i.e. train your own model), you first need to train this one: hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.5-False_os-ind_dn-1-gs_pf-e00_c0_sp-1_ihs-False_ios-False_bl-False-1.0-1.0_sl-False-False-0.0-1.0_mt-False_pt-False_gc0.0_ds3_Subject1
Then use it as the starting point to train this one (in the config file, set pretrained: true and pretrained_path: <path to pretrained model above>):
hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.1-True_os-ind_dn-1-gs_pf-e0s0_c0_sp-0_ihs-False_ios-False_bl-False-1.0-1.0_sl-True-False-4.0-1.0_fl0-0.0_mt-False_pt-True-z_gc0.0_ds3_Subject1
To reproduce the results in the paper, you can use this one straight away in evaluation mode (as described in README.md): hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.1-True_os-ind_dn-1-gs_pf-e0s0_c0_sp-0_ihs-False_ios-False_bl-False-1.0-1.0_sl-True-False-4.0-1.0_fl0-0.0_mt-False_pt-True-z_gc0.0_ds3_Subject1
Thanks very much!
Hi RomeroBarata, I have a question about the stage 1 model. When I reproduce it, the model name is:
hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.5-False_os-ind_dn-1-gs_pf-e0s0_c0_sp-1_ihs-False_ios-False_al-1.0_bl-False-1.0-1.0_sl-False-False-0.0-1.0_fl0-0.0_mt-False_pt-False_gc0.0_ds3_Subject1
Compared to your stage 1 model, mine has "e0s0" where yours has "e00". Could you please explain the difference? Thanks!
Also, I remember you said your experiments were conducted on a single V100. How much memory does it have? I have 32 GB and 16 GB V100 cards; which one should I use to train the model?
Looking forward to your reply, thanks very much.
Hello, in your released models I found two versions for one subject, such as:
hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.1-True_os-ind_dn-1-gs_pf-e0s0_c0_sp-0_ihs-False_ios-False_bl-False-1.0-1.0_sl-True-False-4.0-1.0_fl0-0.0_mt-False_pt-True-z_gc0.0_ds3_Subject1
and
hs512_e40_bs16_lr0.001_sc-None_h2h-False_h2o-True_o2h-True_o2o-True_m-v2-v1-att-v3-False-True_sd-0.5-False_os-ind_dn-1-gs_pf-e00_c0_sp-1_ihs-False_ios-False_bl-False-1.0-1.0_sl-False-False-0.0-1.0_mt-False_pt-False_gc0.0_ds3_Subject1
Could you please tell me the difference between these two versions? Looking forward to your reply, thanks!