wouterkool / attention-learn-to-route

Attention based model for learning to solve different routing problems
MIT License
1.04k stars 337 forks

Bug in Beam Search #46

Closed TimD3 closed 2 years ago

TimD3 commented 2 years ago

There might be a bug in the beam search. If my understanding is correct, increasing the beam size should never worsen the result. When testing on very large problems (CVRP1000) with beam sizes up to 4096, I'm seeing some inconsistent results: up to a beam size of about 16 the results improve, then they start to fluctuate with a tendency to get worse again. A beam size of 4096 yields the same result as a beam size of two.

I'm running something like this:

```
srun python eval.py \
    data/vrp/vrp_uniform_1000.pkl \
    -f \
    --no_progress_bar \
    --decode_strategy bs \
    --eval_batch_size 1 \
    --width 1 2 4 8 16 32 64 128 256 512 1024 2048 4096 \
    --model pretrained/cvrp_50/epoch-99.pt \
    --max_calc_batch_size 10000 \
    --softmax_temperature 1
```

Is this known? Anything I should try?

wouterkool commented 2 years ago

Hi!

Thanks for using our code! This is an interesting finding but does not necessarily mean there is a bug, for two reasons:

1. Beam search is not monotonic in the beam size: the solutions found with a larger beam are not a superset of those found with a smaller beam. A partial solution that survives pruning under a small beam can be pushed out of a larger beam at a later expansion step, so results are not guaranteed to improve.
2. I am no expert, but I think similar results are seen when using beam search in NLP: bigger beam sizes can give worse results, and the 'search error' (from a small beam size) is actually used as a feature. See for example *If beam search is the answer, what was the question?*.

TimD3 commented 2 years ago

Ah yes, you are right. My error in thinking was in your first point: I thought a beam size of 16, for instance, must include all results from a beam size of 8 and only add more on top. I didn't see how they could be pushed out again later, because I was only thinking one expansion step ahead, so to speak.
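This pruning effect can be shown with a toy example. The sketch below is not the repository's beam search; it is a minimal beam search over a hand-built tree with made-up additive scores (think log-probabilities), constructed so that width 1 finds a better final score than width 2: the eventual winner's prefix is pushed out of the wider beam at the second step by two prefixes that look better at that point.

```python
# Hand-built search tree: each prefix maps to (token, incremental score)
# children. All names and scores are invented for illustration.
CHILDREN = {
    (): [("a", 10.0), ("b", 9.0)],
    ("a",): [("a1", 0.0), ("a2", -1.0)],
    ("b",): [("b1", 1.5), ("b2", 1.1)],
    ("a", "a1"): [("end", 10.0)],  # big reward hidden behind a flat step
    ("a", "a2"): [("end", 0.0)],
    ("b", "b1"): [("end", 0.0)],
    ("b", "b2"): [("end", 0.0)],
}

def beam_search(width, depth=3):
    """Return the best (sequence, score) found with the given beam width."""
    beams = [((), 0.0)]
    for _ in range(depth):
        # Expand every kept prefix by all its children.
        candidates = [
            (seq + (tok,), score + inc)
            for seq, score in beams
            for tok, inc in CHILDREN[seq]
        ]
        # Keep only the `width` highest-scoring partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:width]
    return max(beams, key=lambda b: b[1])

print(beam_search(1))  # (('a', 'a1', 'end'), 20.0)
print(beam_search(2))  # (('b', 'b1', 'end'), 10.5)
```

With width 1, the greedy beam keeps `("a",)` and then `("a", "a1")`, reaching the final score of 20.0. With width 2, the step-2 candidates `("b", "b1")` (10.5) and `("b", "b2")` (10.1) both outscore `("a", "a1")` (10.0), so the prefix that leads to the best complete solution is pruned and the wider beam ends at 10.5.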

Thanks for the super fast response and also thanks for the code base!