The reason is that we changed the API: there is now a separate Generator
class that handles data generation, so you can call it like this:
env = TSPEnv(generator_params={'num_loc': 50})
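To illustrate why the old keyword stops working, here is a minimal sketch of the generator-params pattern using mock TSPGenerator/TSPEnv classes (these are simplified stand-ins, not the real rl4co implementation): the env forwards generator_params to a separate generator that owns num_loc, so a num_loc passed directly to the env is ignored with a warning.

```python
import random

class TSPGenerator:
    """Mock generator: owns instance-size parameters like num_loc."""

    def __init__(self, num_loc=20):
        self.num_loc = num_loc

    def sample(self, batch_size):
        # batch_size instances, each a list of num_loc (x, y) coordinates
        return [
            [(random.random(), random.random()) for _ in range(self.num_loc)]
            for _ in range(batch_size)
        ]

class TSPEnv:
    """Mock env: forwards generator_params to the generator, warns on the rest."""

    def __init__(self, generator_params=None, **kwargs):
        if kwargs:
            print(f"Unused keyword arguments: {', '.join(kwargs)}")
        self.generator = TSPGenerator(**(generator_params or {}))

    def dataset(self, n):
        return self.generator.sample(n)

env = TSPEnv(generator_params={"num_loc": 50})
print(len(env.dataset(10)[0]))        # 50

# Passing num_loc directly reproduces the reported behavior:
# it is ignored and the default size 20 is used instead.
env_wrong = TSPEnv(num_loc=50)
print(len(env_wrong.dataset(10)[0]))  # 20
```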
Thank you. One more, possibly naive, question: is it possible to train the models on multiple GPUs?
Yes. You can set up multi-GPU training by editing the trainer.devices parameter.
For example, if you want to try the example experiment in the README (AM on TSP) with multiple GPUs, you can launch the training with
python run.py +trainer.devices="[0, 1]"
This training will use cuda:0 and cuda:1, and you can set more devices if you want.
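Other device settings follow the same pattern. These variants mirror the command above (assuming the same Hydra override syntax; exact behavior depends on your Lightning/Hydra versions):

python run.py +trainer.devices="[0, 1, 2, 3]"   # four specific GPUs by index
python run.py +trainer.devices="[2]"            # a single specific GPU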
Do you guys plan on including MCTS in the list of decoding strategies in the near future?
@ujjwaldasari10 We are not planning to in the near future, since MCTS has not been applied much in the NCO literature (e.g., routing and scheduling). I remember some work that did use it, but other methods outperformed it without MCTS.
I think that is an interesting direction, and we gladly accept contributions! If you are interested, I invite you to give it a shot :) What kind of problem would you like to use it for?
Describe the bug
I used the quickstart example to train TSP of size 20 in a conda environment on a GPU connected via SSH, in a VS Code Jupyter notebook. But when I try to train a new TSP model of size 50 from scratch in the same folder and render the dataset, I still get a dataset of size 20 instead of 50. The model trains fine, but even the lines below generate a dataset of size 20 instead of 50:
env = TSPEnv(num_loc=50)
new_dataset = env.dataset(10000)
dataloader = model._dataloader(new_dataset, batch_size=100)
print(new_dataset[0]["locs"].shape)
Output:
Unused keyword arguments: num_loc. Please check the documentation for the correct keyword arguments
torch.Size([20, 2])
I tried running a new Jupyter notebook of size 50 in a new folder, but the problem persists. Can you please help me identify the issue?