Open HubHop opened 4 years ago
Sorry for the late reply.
The original code, with reproducible results, is provided in another issue: https://github.com/airsplay/R2R-EnvDrop/issues/11.
Given the results in the xlsx, the 2% drop in SPL (to 46%) is possibly caused by the drop in SR; the result is still much higher than the previous SotA (38%). The cause I have found so far is some implementation differences inside the speaker, introduced when I cleaned up the code (the original, reproducible code is provided in the other issue). I suspect the speaker because the beam-search results, which rely only on speaker inference, also changed. I haven't yet located which differences cause this issue. None of the differences look like they should affect training or inference, but the predictions do change. Please check the original code in the meantime, until I find the cause.
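For reference, here is a minimal sketch of why a drop in SR tends to pull SPL down with it. It uses the standard SPL definition (success weighted by shortest-path length over the longer of the actual and shortest path lengths). The function names and the toy numbers are made up for illustration and are not taken from this repo's evaluation code, and path lengths are assumed to be positive.

```python
def success_rate(successes):
    """Fraction of episodes that ended in success (successes are 0/1 flags)."""
    return sum(successes) / len(successes)


def spl(successes, shortest_lengths, path_lengths):
    """Success weighted by Path Length, averaged over episodes.

    Each successful episode contributes l / max(p, l), where l is the
    shortest-path length and p is the length of the path actually taken.
    Failed episodes contribute 0, so SPL can never exceed SR.
    """
    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        total += s * l / max(p, l)
    return total / len(successes)


# Toy episodes (hypothetical numbers): same paths, one fewer success in "after".
shortest = [10.0, 10.0, 10.0, 10.0]
taken = [10.0, 20.0, 15.0, 10.0]
before = [1, 1, 0, 1]
after = [1, 0, 0, 1]

print(success_rate(before), spl(before, shortest, taken))
print(success_rate(after), spl(after, shortest, taken))
```

Every failed episode contributes zero to the SPL sum, so losing successes lowers SPL even if the paths of the remaining successful episodes are unchanged.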
Best, Hao
Thanks for your reply!
Please help me! After training the model, I used the test environment to evaluate it, and the success rate is shown below. I don't understand why the result is so low. Is there something wrong with how I test?
(image)
The test script is:
name=agent
flag="--train validlistener --featdropout 0.3 --angleFeatSize 128 --feedback argmax --mlWeight 0.2 --subout max --dropout 0.5 --optim rms --lr 1e-4 --iters 80000 --submit"
CUDA_VISIBLE_DEVICES=$1 python r2r_src/train.py $flag --name $name
Hi @jingquanliang, I didn't see your result. Have you fixed it? If not, you can try this script:
name=agent_bt
flag="--attn soft --train validlistener --load snap/agent_bt/state_dict/best_val_unseen --angleFeatSize 128 --submit --featdropout 0.4 --subout max --maxAction 35"
CUDA_VISIBLE_DEVICES=$1 python r2r_src/train.py $flag --name $name
Hi,
We are trying to retrain the EnvDrop model based on this repo, but the results are not the same as those reported in the paper. We have tried different PyTorch versions; our best result, with PyTorch 0.4.1, is an SPL of 0.46 on the val unseen split, which is lower than the reported 48%. For detailed results, please refer to the attachment below.
Have we missed something important? Or could you specify your working environment?
Our retrained model: retrained_envdrop_results.xlsx
Results in paper: