Description
Updating rl4co/tasks/eval.py for the latest version. I created this quick merge PR to write down the usage tutorial.
Motivation and Context
Fixing the first-node selection problem for the sampling method;
Adding a parser to launch evaluations efficiently.
Types of changes
[x] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds core functionality)
Tutorial for the evaluation
Step 1. Prepare your pre-trained model checkpoint and the test instances data file, and put them in your preferred place. For example, we will test the AttentionModel on TSP50.
Step 2. Run eval.py with your customized setting, e.g., let's use the sampling method with a top_p=0.95 sampling strategy.
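A single run could look like the following sketch; the checkpoint path is a placeholder, and the flags mirror the parser options used in the Step 3 scripts below:
# Example single evaluation run (checkpoint/data paths are placeholders; adjust them to your files).
python rl4co/tasks/eval.py \
    --problem tsp \
    --model AttentionModel \
    --ckpt_path checkpoints/am-tsp50.ckpt \
    --data_path data/tsp/tsp50_test_seed1234.npz \
    --method sampling \
    --temperature=1.0 \
    --top_p=0.95 \
    --top_k=0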
You can check rl4co/tasks/eval.py to see more supported parameters and their hints. Here are some notes:
We currently support 7 evaluation methods: greedy, sampling, multistart_greedy, augment_dihedral_8, augment, multistart_greedy_augment_dihedral_8, and multistart_greedy_augment.
The --model parameter is the model class name, e.g., AttentionModel, POMO, SymNCO, etc.
By default, the evaluation results are saved as a .pkl file under --save_path. This file includes actions, rewards, inference_time, and avg_reward, which you can collect for further processing (see the sketch after these notes).
Some parameters are not commonly modified and therefore are not exposed in the parser, e.g., select_best=True for sampling evaluation. In the current version, you may want to hardcode and modify them directly in the script.
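As a minimal sketch of loading a saved results file (assuming the .pkl stores a dictionary with the keys listed above; the path is only an example):
python - <<'EOF'
import pickle

# Path to a results file produced via --save_path (example path, adjust to yours).
result_path = "results/tsp50_sampling.pkl"

with open(result_path, "rb") as f:
    results = pickle.load(f)  # assumed to be a dict with the keys listed above

print("keys:", list(results.keys()))
print("avg_reward:", results["avg_reward"])
print("inference_time:", results["inference_time"])
EOF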
Step 3. If you want to launch several evaluations with various parameters, you may refer to the following examples:
Evaluate POMO on TSP50 with sampling under different top-p and temperature values:
#!/bin/bash
top_p_list=(0.5 0.6 0.7 0.8 0.9 0.95 0.98 0.99 0.995 1.0)
temp_list=(0.1 0.3 0.5 0.7 0.8 0.9 1.0 1.1 1.2 1.5 1.8 2.0 2.2 2.5 2.8 3.0)
problem=tsp
model=POMO
ckpt_path=checkpoints/pomo-tsp50.ckpt
data_path=data/tsp/tsp50_test_seed1234.npz
for top_p in "${top_p_list[@]}"; do
    for temp in "${temp_list[@]}"; do
        python rl4co/tasks/eval.py --problem ${problem} --model ${model} --ckpt_path ${ckpt_path} --data_path ${data_path} --method sampling --temperature=${temp} --top_p=${top_p} --top_k=0
    done
done
Evaluate POMO on CVRP50 with sampling under different top-k and temperature values:
#!/bin/bash
top_k_list=(5 10 15 20 25)
temp_list=(0.1 0.3 0.5 0.7 0.8 0.9 1.0 1.1 1.2 1.5 1.8 2.0 2.2 2.5 2.8 3.0)
problem=cvrp
model=POMO
ckpt_path=checkpoints/pomo-cvrp50.ckpt
data_path=data/vrp/vrp50_test_seed1234.npz
for top_k in "${top_k_list[@]}"; do
    for temp in "${temp_list[@]}"; do
        python rl4co/tasks/eval.py --problem ${problem} --model ${model} --ckpt_path ${ckpt_path} --data_path ${data_path} --method sampling --temperature=${temp} --top_p=0.0 --top_k=${top_k}
    done
done
🙌 I will add a notebook for loading the results and doing some statistics soon.