sancarlim opened this issue 2 years ago
Turns out training from scratch, without pre-training on ground-truth traversals, leads to the same results and makes training simpler. That's why I decided to just drop the pre-training step. If you still want to pre-train, you can set the pre_train flag to True on line 48 of the config file.
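A minimal sketch of what that flag might look like in the config, assuming a flat YAML layout; only `pre_train` is confirmed above, and the surrounding keys and values are illustrative placeholders:

```yaml
# Illustrative only: pre_train is the flag mentioned above;
# all other keys and values are assumed placeholders.
batch_size: 32
num_epochs: 100
pre_train: True   # enable the ground-truth-traversal pre-training stage
```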
Regarding combinations of enc/agg/dec: you're right, not every decoder will work with every aggregator, since some aggregator outputs don't match the inputs some decoders expect. There are still many valid combinations, though. I'll share more configs after running some tests.
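As a rough illustration of how a modular config could select one component per stage (the module names below are hypothetical placeholders, not the repository's actual identifiers):

```yaml
# Hypothetical modular config sketch; module names are placeholders.
encoder:
  type: graph_encoder         # encodes scene and agent inputs
aggregator:
  type: attention_aggregator  # its output shape must match the decoder's input
decoder:
  type: multimodal_decoder    # check compatibility with the chosen aggregator
```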
Perfect, thanks for the quick reply! In the meantime, could you provide the config you used for the decoder ablation described in the paper? Thank you!
Thanks for the great work and the repository! I have trained the model from scratch and it yields similar results (a bit worse, but almost the same). However, shouldn't we pre-train for 100 epochs and then fine-tune for another 100, as stated in the paper? If so, I think it would be good to indicate this in the README. It would also be good to document how to combine the different enc/agg/dec modules to reproduce the runs in the benchmark, or which configurations are possible at all, since some aggregator outputs would not match some decoders - maybe by providing different .yml files? Thank you!