Hi, here are some possible fixes:
We recently implemented loading from Solomon instances here, could you have a look at it first?
The error
RuntimeError: index 98 is out of bounds for dimension 1 with size 21
means that there is a size mismatch somewhere, most probably in one of the variables of the td.

> the td["distances"] updates in the get_action_mask() in cvrptw.py. But in the step() in cvrptw.py, it will go to the step() in cvrp.py and will run the get_action_mask() in cvrp.py
Yes, by design the action mask in CVRPTW is a composition of the time-window masking and the base CVRP masking procedure - if you want to remove some constraints, you may need to refactor the code in your mocvrptw.

PS: I just tried running the code (for the CVRPTW we have), and updating the instances worked:
from rl4co.utils.ops import gather_by_index, get_distance

# Recompute the distances from the current location to every node
current_loc = gather_by_index(td["locs"], td["current_node"]).reshape([batch_size, 2])
dist = get_distance(current_loc, td["locs"].transpose(0, 1)).transpose(0, 1)
td["distances"] = dist
With this the policy now runs, but obtains:
AssertionError: Logits contain NaNs
which usually means that the action_mask contains all False entries for some instance (i.e. no node can be visited).
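If you want to confirm that, a quick check (my own sketch, not an rl4co helper) is to look for instances whose mask has no feasible action left before calling the policy:

import torch

def find_dead_instances(td):
    # action_mask has shape (batch, num_nodes); an instance is "dead"
    # if not a single node (not even the depot) can be selected.
    dead = ~td["action_mask"].any(dim=-1)
    return dead.nonzero(as_tuple=True)[0]

# e.g. print(find_dead_instances(td)) right before the policy call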
1. How can I use extract_from_solomon()? I tried:
batch_size = 1
instance = vrplib.read_instance(data_path, instance_format="solomon")
policy, env = test_model.policy, test_model.env
env.extract_from_solomon(instance, batch_size)
but it failed.
2. I changed the reward in cvrptw.py to try to make it a multi-objective problem; the action_mask constraints don't need to change.
batch_size = 1
instance = vrplib.read_instance(data_path, instance_format="solomon")
policy, env = test_model.policy, test_model.env
td = env.extract_from_solomon(instance, batch_size)
print(td)
works fine for me:
TensorDict(
    fields={
        action_mask: Tensor(shape=torch.Size([1, 101]), device=cpu, dtype=torch.bool, is_shared=False),
        current_loc: Tensor(shape=torch.Size([1, 2]), device=cpu, dtype=torch.float32, is_shared=False),
        current_node: Tensor(shape=torch.Size([1, 1]), device=cpu, dtype=torch.int64, is_shared=False),
        current_time: Tensor(shape=torch.Size([1, 1]), device=cpu, dtype=torch.float32, is_shared=False),
        demand: Tensor(shape=torch.Size([1, 100]), device=cpu, dtype=torch.float32, is_shared=False),
        depot: Tensor(shape=torch.Size([1, 2]), device=cpu, dtype=torch.float32, is_shared=False),
        distances: Tensor(shape=torch.Size([1, 101]), device=cpu, dtype=torch.float32, is_shared=False),
        done: Tensor(shape=torch.Size([1, 1]), device=cpu, dtype=torch.bool, is_shared=False),
        durations: Tensor(shape=torch.Size([1, 101]), device=cpu, dtype=torch.int64, is_shared=False),
        locs: Tensor(shape=torch.Size([1, 101, 2]), device=cpu, dtype=torch.float32, is_shared=False),
        terminated: Tensor(shape=torch.Size([1, 1]), device=cpu, dtype=torch.bool, is_shared=False),
        time_windows: Tensor(shape=torch.Size([1, 101, 2]), device=cpu, dtype=torch.int64, is_shared=False),
        used_capacity: Tensor(shape=torch.Size([1, 1]), device=cpu, dtype=torch.float32, is_shared=False),
        vehicle_capacity: Tensor(shape=torch.Size([1, 1]), device=cpu, dtype=torch.int64, is_shared=False),
        visited: Tensor(shape=torch.Size([1, 1, 101]), device=cpu, dtype=torch.uint8, is_shared=False)},
    batch_size=torch.Size([1]),
    device=None,
    is_shared=False)
However, when running the policy, there seems to be an infinite loop with this minimalistic code:
%load_ext autoreload
%autoreload 2

import torch
import vrplib

from rl4co.models import AttentionModelPolicy
from rl4co.envs import CVRPTWEnv

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
data_path = "./data/C101.txt"

env = CVRPTWEnv()
policy = AttentionModelPolicy(env).to(device)

# Use Solomon instance
vrplib.download_instance("C101", "data")
instance = vrplib.read_instance(data_path, instance_format="solomon")
td = env.extract_from_solomon(instance, 1).to(device)

## Use randomly generated instance
# td = env.reset(batch_size=(2,)).to(device)

# Inference on policy
with torch.inference_mode():
    out = policy(td.clone(), decode_type="greedy", num_starts=0, return_actions=True)
gets an infinite loop. If you uncomment and use the randomly generated instance instead, it works fine. I guess it is a small masking problem? We will have a look at this. In the meantime, we recommend using a similar structure to this notebook!
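If you want to see where it hangs instead of waiting, one option (a debugging sketch of mine, assuming the TorchRL-style step interface where the action is written into the tensordict and the stepped state comes back under "next") is to roll the environment manually with a step cap:

# Capped manual rollout for debugging; td and env are the ones from above.
max_steps = 10 * td["locs"].shape[-2]  # generous bound on the route length
step = 0
while not td["done"].all():
    # Pick the first feasible action just to exercise the masking logic
    td["action"] = td["action_mask"].float().argmax(dim=-1)
    td = env.step(td)["next"]
    step += 1
    if step > max_steps:
        raise RuntimeError(f"no progress after {step} steps - likely a masking bug")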
2. I see!
I ran the code without the randomly generated instance, and when it gets to the decoder, log_p is:
tensor([[0., -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf]], device='cuda:0')
It seems the model cannot generate a valid result with the input data extracted from Solomon.
Yes, that happens because there is some error in the masking procedure - I think some node is mistakenly marked as "unreachable"; we will have a look at it :)
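For context, the time-window part of the mask boils down to a reachability test like the following (a simplified sketch of the constraint, ignoring service durations, not the exact rl4co code):

def tw_reachable(current_time, distances, time_windows):
    # current_time: (batch, 1); distances: (batch, n) from the current node;
    # time_windows: (batch, n, 2) holding [start, end] per node.
    arrival = current_time + distances
    # Arriving early just means waiting for the window to open;
    # arriving after the window closes makes the node infeasible.
    return arrival <= time_windows[..., 1]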
1. I have solved the problem by modifying my initial code according to the notebook you provided. I added td["distances"], i.e. the distances between the depot and the other nodes at the first step, as the initial td["distances"]. The error (index out of bounds) no longer occurs.
td["distances"] = get_distance(td["locs"][:, 0, :], td["locs"][:, 1:, :].transpose(0, 1)).transpose(0, 1)
The complete data that needs to be initialized is as follows:
import torch
from einops import repeat
from rl4co.utils.ops import get_distance

batch_size = 1
td = env.reset(batch_size=(batch_size,)).to(device)

# Broadcast the instance data over the batch dimension
td["locs"] = repeat(coords, "n d -> b n d", b=batch_size, d=2)
# Initial distances: from the depot (node 0) to every other node
td["distances"] = get_distance(td["locs"][:, 0, :], td["locs"][:, 1:, :].transpose(0, 1)).transpose(0, 1)
td["time_windows"] = repeat(time_window, "n d -> b n d", b=batch_size, d=2)
td["durations"] = repeat(serve_time, "n -> b n", b=batch_size)
td["demand"] = repeat(demands, "n -> b n", b=batch_size) / capacity
td["visited"] = torch.zeros((batch_size, 1, n), dtype=torch.uint8)

# Everything is feasible at the start except staying at the depot
action_mask = torch.ones(batch_size, n, dtype=torch.bool)
action_mask[:, 0] = False
td["action_mask"] = action_mask
print(td)
2. The extract_from_solomon() still fails.
Regarding 2: are you using the latest version of the main branch?
> It seems the model cannot generate a valid result with the input data extracted from Solomon.
This is indeed an issue we have encountered: when verifying the optimal solutions provided by CVRPLIB using the check_solution_validity() function, our validity checks return invalid routes for many of the instances. This is due to how floating-point calculations are handled, which can lead to small deviations in the calculated distances. These small deviations can cause our validity checks to assert that a node cannot be reached in time (by a small margin, of course), even though it is part of the best route. If this is an issue for the optimal routes, it will surely be an issue when training on the same instances.
We do not have a solution implemented yet to handle these small discrepancies, but we have a few ideas - still working on it, though.
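One simple idea along those lines (my own sketch, not something rl4co ships) is to compare arrival times against the window end with a small tolerance instead of exactly:

EPS = 1e-5  # hypothetical tolerance for float round-off

def tw_reachable_with_tolerance(current_time, distances, time_windows, eps=EPS):
    # Same test as the sketch above, but allow arrival to overshoot the
    # window end by eps to absorb floating-point error in the distances.
    arrival = current_time + distances
    return arrival <= time_windows[..., 1] + eps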
> Regarding 2: are you using the latest version of the main branch?

I used 0.3.1 at first and just tested 0.4.0 (which should be the latest version?). The same problem occurs: the running procedure still gets stuck in an infinite loop.
Hi @huanxulin1, we found the bug and fixed it in 5e90b77. The reason is that in the previous cvrp environment's get_action_mask() function, the vehicle_capacity was passed in separately instead of using the variable inside the tensordict. This caused all nodes to be wrongly masked because of the capacity constraint.
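Schematically, the pattern behind the fix looks like this (an illustrative sketch, not the actual diff in 5e90b77):

def get_action_mask_buggy(td, vehicle_capacity):
    # Before: capacity arrives as a separate argument, which can disagree
    # with the demands stored in the tensordict and mask every node.
    return td["demand"] + td["used_capacity"] <= vehicle_capacity

def get_action_mask_fixed(td):
    # After: read the capacity from the tensordict itself, so it always
    # matches the instance being masked.
    return td["demand"] + td["used_capacity"] <= td["vehicle_capacity"]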
Now you can use the following minimal code to check that it runs:
import torch
import vrplib

from rl4co.envs import CVRPTWEnv
from rl4co.models import AttentionModelPolicy

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
data_path = "./data/C101.txt"

env = CVRPTWEnv()
policy = AttentionModelPolicy(env).to(device)

# Use Solomon instance
vrplib.download_instance("C101", "data")
instance = vrplib.read_instance(data_path, instance_format="solomon")
td = env.extract_from_solomon(instance, 1).to(device)

# Inference on policy
with torch.inference_mode():
    out = policy(td.clone(), decode_type="greedy", num_starts=0, return_actions=True)
print(out)
Here is the output from my side with an untrained AM just for your reference:
{'reward': tensor([-4923.0938], device='cuda:2'), 'log_likelihood': tensor([-226.6587], device='cuda:2'), 'actions': tensor([[ 75, 0, 47, 0, 1, 0, ..., 0, 87]], device='cuda:2')}
And about your other concern:
> I ran the code without the randomly generated instance, and when it gets to the decoder, log_p is:
> tensor([[0., -inf, -inf, ..., -inf]], device='cuda:0')
> It seems the model cannot generate a valid result with the input data extracted from Solomon.
Don't worry, this is correct: by design, log_p is -inf for every node that is masked at the current step. So the log_p you saw simply means that only the depot is available.
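In other words, the masked logits are pushed to -inf before the softmax, so a row like the one above just reflects a mask where only index 0 (the depot) is feasible. A tiny standalone illustration:

import torch

logits = torch.randn(1, 5)
mask = torch.tensor([[True, False, False, False, False]])  # only the depot feasible
log_p = torch.log_softmax(logits.masked_fill(~mask, float("-inf")), dim=-1)
print(log_p)  # tensor([[0., -inf, -inf, -inf, -inf]])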
Describe the bug
When I use a checkpoint to test on a Solomon dataset, I get an error (index out of bounds). It seems to happen in the step function of the env: td["distances"] has shape (1, 21), but the action index is bigger than 20.
Reason
I read decoder.py and mocvrptw.py (my own env based on cvrptw.py) and found that td["distances"] is updated in get_action_mask() in cvrptw.py. But step() in cvrptw.py goes to step() in cvrp.py, which runs get_action_mask() in cvrp.py. I think this causes the error.
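For completeness, the failure itself is just an indexing operation hitting a stale shape: once td["distances"] stops being refreshed by the CVRPTW mask, an action index from the full 101-node instance overruns the 21-entry tensor. A minimal reproduction of the failure mode (generic PyTorch, not the rl4co call site):

import torch

distances = torch.rand(1, 21)   # stale: only 21 entries
action = torch.tensor([[98]])   # index coming from the full instance
# Raises: RuntimeError: index 98 is out of bounds for dimension 1 with size 21
cost = distances.gather(1, action)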