alexfrom0815 / Online-3D-BPP-DRL

This repository contains the implementation of the paper "Online 3D Bin Packing with Constrained Deep Reinforcement Learning".

ValueError: cannot reshape array of size 1600 into shape (10,10) #4

Closed Chengjlzzz closed 2 years ago

Chengjlzzz commented 2 years ago

Traceback (most recent call last):
  File "E:/User002/Online-3D-BPP-DRL-main1/main.py", line 248, in <module>
    main(args)
  File "E:/User002/Online-3D-BPP-DRL-main1/main.py", line 40, in main
    train_model()
  File "E:/User002/Online-3D-BPP-DRL-main1/main.py", line 135, in train_model
    box_mask = get_possible_position(observation, config.container_size)
  File "E:\User002\Online-3D-BPP-DRL-main1\acktr\utils.py", line 47, in get_possible_position
    plain = box_info[0].reshape((container_size[0], container_size[1]))
ValueError: cannot reshape array of size 1600 into shape (10,10)

alexfrom0815 commented 2 years ago

Hello, have you changed any parameters in 'config.py', or did you run the program according to the instructions in 'readme.md'? The original program should run as is. If you modified the original configuration, please list the changes so that we can determine more accurately where the error occurred.
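
As a rough illustration of what the error in the traceback means: `reshape` only succeeds when the number of elements equals the product of the target shape, and 1600 is not 10 × 10 = 100. The sketch below is a hypothetical helper (it is not part of this repository) that lists how a flat size could relate to the configured container footprint. How to interpret the extra factor is an assumption; it may simply hint at which setting is out of sync.

```python
import math


def explain_reshape_mismatch(flat_size, container_size):
    """Describe how flat_size relates to a (width, length) container footprint.

    Hypothetical diagnostic helper; it is not part of the repository.
    """
    width, length = container_size[0], container_size[1]
    footprint = width * length
    if flat_size == footprint:
        return "ok: size matches the container footprint"
    hints = []
    if flat_size % footprint == 0:
        # The data may hold several planes of the configured size.
        hints.append(f"{flat_size // footprint} planes of {width}x{length}")
    side = math.isqrt(flat_size)
    if side * side == flat_size:
        # The data may come from a differently sized square container.
        hints.append(f"one plane of {side}x{side}")
    if not hints:
        hints.append("no simple relation to the configured footprint")
    return "mismatch: could be " + " or ".join(hints)


print(explain_reshape_mismatch(1600, (10, 10, 10)))
```

If the observation size and 'config.container_size' disagree like this, comparing the container size used when the environment was created with the one in 'config.py' is probably the first thing to check.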

Chengjlzzz commented 2 years ago

Sorry to bother you. I followed the instructions in 'readme.md' for training ("--mode train --load-model --use-cuda --item-seq sample") and got a different problem:

D:\Envs\miniconda3\envs\pt11\lib\site-packages\torch\nn\_reduction.py:46: UserWarning: size_average and reduce args will be deprecated, please use reduction='none' instead.
  warnings.warn(warning.format(ret))
Traceback (most recent call last):
  File "E:/User002/Online-3D-BPP-DRL-main/main.py", line 247, in <module>
    main(args)
  File "E:/User002/Online-3D-BPP-DRL-main/main.py", line 39, in main
    train_model()
  File "E:/User002/Online-3D-BPP-DRL-main/main.py", line 198, in train_model
    value_loss, action_loss, dist_entropy, prob_loss, graph_loss = agent.update(rollouts)
  File "E:\User002\Online-3D-BPP-DRL-main\acktr\algo\acktr_pipeline.py", line 50, in update
    rollouts.location_masks[:-1].view(-1, mask_size))
  File "E:\User002\Online-3D-BPP-DRL-main\acktr\model.py", line 93, in evaluate_actions
    value, actor_features, rnn_hxs, graph = self.base(inputs, rnn_hxs, masks)
  File "D:\Envs\miniconda3\envs\pt11\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "E:\User002\Online-3D-BPP-DRL-main\acktr\model.py", line 322, in forward
    hidden_critic = self.critic(share)
  File "D:\Envs\miniconda3\envs\pt11\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Envs\miniconda3\envs\pt11\lib\site-packages\torch\nn\modules\container.py", line 92, in forward
    input = module(input)
  File "D:\Envs\miniconda3\envs\pt11\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "E:\User002\Online-3D-BPP-DRL-main\acktr\algo\kfac.py", line 81, in forward
    x = self.module(input)
  File "D:\Envs\miniconda3\envs\pt11\lib\site-packages\torch\nn\modules\module.py", line 489, in __call__
    hook(self, input)
  File "E:\User002\Online-3D-BPP-DRL-main\acktr\algo\kfac.py", line 156, in _save_input
    self.fast_cnn)
  File "E:\User002\Online-3D-BPP-DRL-main\acktr\algo\kfac.py", line 45, in compute_cov_a
    return a.t() @ (a / batch_size)
RuntimeError: cublas runtime error : the GPU program failed to execute at C:/w/1/s/tmp_conda_3.6_035809/conda/conda-bld/pytorch_1556683229598/work/aten/src/THC/THCBlas.cu:259

Process finished with exit code 1

My environment:
OS: Windows 10
PyTorch version: 1.1.0
CUDA version: 10.0
cuDNN version: 7401
Python: 3.6.13
GPU: NVIDIA GeForce RTX 3070 Ti

I can run the test successfully. What should I do?

alexfrom0815 commented 2 years ago

I am sorry about the new problem. I can run this code on Ubuntu 16.04. I think there are two possible causes. One is a mismatch between your CUDA and PyTorch versions. The other is that the ACKTR implementation, which we borrowed from https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail, may have problems on Windows. I have added an a2c implementation (switch ACKTR to a2c in 'config.py') and updated some instructions in 'readme.md'. Unlike ACKTR, the a2c implementation does not call 'kfac.py', where the error occurs. Please check whether a2c runs on your system. If even the basic a2c implementation fails, it suggests that your CUDA and PyTorch versions do not match. If a2c runs, it suggests that the ACKTR implementation does not work on Windows.
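
One way to narrow down the first cause without running the full training loop is a small stand-alone check. The sketch below is not part of this repository; the function name `cuda_sanity_check` is made up for illustration. It reports the PyTorch build, the CUDA version it was built with, the GPU's compute capability, and whether a matrix product of the same kind as the failing line in 'kfac.py' runs on the GPU.

```python
def cuda_sanity_check():
    """Collect basic facts about the PyTorch/CUDA pairing and try one GPU matmul."""
    report = {}
    try:
        import torch
    except ImportError:
        report["torch"] = None  # PyTorch is not installed in this interpreter
        return report

    report["torch"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if not report["cuda_available"]:
        return report

    report["gpu"] = torch.cuda.get_device_name(0)
    report["compute_capability"] = torch.cuda.get_device_capability(0)
    try:
        # Same kind of operation as the failing line: a.t() @ (a / batch_size)
        a = torch.randn(64, 32, device="cuda")
        cov = a.t() @ (a / a.size(0))
        report["matmul_ok"] = bool(torch.isfinite(cov).all().item())
    except RuntimeError as err:
        report["matmul_ok"] = False
        report["matmul_error"] = str(err)
    return report


if __name__ == "__main__":
    for key, value in cuda_sanity_check().items():
        print(f"{key}: {value}")
```

If the matrix product already fails here, the problem is probably in the CUDA/PyTorch/GPU combination rather than in ACKTR itself. As far as I know, RTX 30 series cards report compute capability 8.6, which builds against CUDA 10.0 do not target, so a mismatch seems plausible for the environment listed above.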

Chengjlzzz commented 2 years ago

Sorry to bother you again. Can the program print the exact coordinates of every packed box (the coordinates of the front-left-bottom (FLB) corner of each box)? How can I get them? Thanks a lot for your patience.

alexfrom0815 commented 2 years ago

You can print 'lx' and 'ly' in the 'drop_box' function in 'bpp0/space.py'. These two variables are the coordinates of the packed box.

Chengjlzzz commented 2 years ago

Sorry, I want to print the (x, y, z) coordinates of every packed item, both while training a new model and while testing a trained model, but I have not managed to. I added print(lx, ly, new_h) to the 'drop_box' function in 'bpp0/space.py', but no coordinates are printed. How should I change main.py or other files so that, whenever I run main.py for training or testing, the (x, y, z) coordinates of the next packed box are printed?

alexfrom0815 commented 2 years ago

Please confirm that the 'bpp0/space.py' you modified is the one under the gym path you registered. You can also set breakpoints in the program to check whether that code is actually executed.
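
A quick way to check which copy of a module Python actually loads is to print its source path. The sketch below uses only the standard library; `module_source_path` is a made-up helper name, and the import path for 'bpp0/space.py' is an assumption that depends on how the environment was registered, so the example uses the stand-in module "json".

```python
import importlib
import inspect


def module_source_path(module_name):
    """Return the file that Python actually loads for module_name."""
    module = importlib.import_module(module_name)
    return inspect.getsourcefile(module)


if __name__ == "__main__":
    # Stand-in module so the sketch runs anywhere. Replace "json" with the
    # import path your gym registration uses for bpp0/space.py (assumption:
    # the exact path depends on how the environment was registered).
    print(module_source_path("json"))
```

If the printed path differs from the file you edited, the added print statement is in a copy that is never imported.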
