Open BazUCD opened 2 years ago
Could you share an example of json file you are using in your dataset. It looks like
p = [point[numb_point][1],point[numb_point][0]]
point looks empty of the dimensions are wrong. @mintar refactored the data format a little bit, I did not check if it was compatible with train2/train.py? But I will try to check soon.
@BazUCD Did you try to use the original training script?
Hi @TontonTremblay thanks for the quick reply. Heres an example of my .json files with the associated png as well:
I've used the original training script and generated some weights but was unable to detect anything so after your recommendation from #238 I have been trying to generate the belief maps using train2
This looks correct, but your object has a symmetry in it. https://github.com/NVlabs/Deep_Object_Pose/tree/master/scripts/nvisii_data_gen#handling-objects-with-symmetries you should look into this from Martin.
I encountered a similar issue. The training script expects the "projected_cuboid"
field to contain 9 points. The last point being the point under"projected_cuboid_centroid"
.
In your case, you can add something like projected_cuboid_keypoints.append(obj['projected_cuboid_centroid'])
right below line 228 in utils_dope.py
. I did this and it worked for me.
Hi I am attempting to run a the training script and generate the belief maps from train2/train.py in order to debug but I am getting this error:
start: 18:18:30.781464 load data: ['/home/user/Downloads/Spanner2'] load data: training data: 2000 batches load models ready to train! Traceback (most recent call last): File "train.py", line 606, in
_runnetwork(epoch,trainingdata)
File "train.py", line 422, in _runnetwork
for batch_idx, targets in enumerate(train_loader):
File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in next
data = self._next_data()
File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
return self._process_data(data)
File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
data.reraise()
File "/home/user/.local/lib/python3.6/site-packages/torch/_utils.py", line 434, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
data = fetcher.fetch(index)
File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/user/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 49, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/user/catkin_pcl_new/src/dope/scripts/train2/utils_dope.py", line 321, in getitem
save=False,
File "/home/user/catkin_pcl_new/src/dope/scripts/train2/utils_dope.py", line 593, in CreateBeliefMap
p = [point[numb_point][1],point[numb_point][0]]
IndexError: list index out of range
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 17249) of binary: /usr/bin/python3 Traceback (most recent call last): File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main "main", mod_spec) File "/usr/lib/python3.6/runpy.py", line 85, in _run_code exec(code, run_globals) File "/home/user/.local/lib/python3.6/site-packages/torch/distributed/launch.py", line 193, in
main()
File "/home/user/.local/lib/python3.6/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/home/user/.local/lib/python3.6/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/home/user/.local/lib/python3.6/site-packages/torch/distributed/run.py", line 713, in run
)(*cmd_args)
File "/home/user/.local/lib/python3.6/site-packages/torch/distributed/launcher/api.py", line 131, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/user/.local/lib/python3.6/site-packages/torch/distributed/launcher/api.py", line 261, in launch_agent
failures=result.failures,
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
train.py FAILED
Failures: