About inference error

xm-W commented 8 months ago

Thank you for your public work! Your work has greatly inspired our research. However, when running 'python generate.py ***', it displayed the following error:

ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/a40/.conda/envs/DL_bev/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/a40/.conda/envs/DL_bev/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/a40/.conda/envs/DL_bev/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 58, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/a40/SSD1/BEVGen/multi_view_generation/bev_utils/argoverse.py", line 46, in __getitem__
    raise ValueError()
ValueError

The sequence of errors we encountered is as follows:

1. On line 549 of "BEVGen/multi_view_generation/bev_utils/argoverse_multi_sensor_dataloader.py", target_timestamp_ns is 'NaT' because no 'ring_front_center' frame was matched, so "return None" on line 552 is executed.

# Grab the synchronization record.
target_timestamp_ns = src_to_target_records.loc[src_timedelta_ns, target_sensor_name]
if pd.isna(target_timestamp_ns):
    # No match was found within tolerance.
    return None

2. On line 246 of "BEVGen/multi_view_generation/bev_utils/argoverse.py", the check below fails because len(data.synchronized_imagery) == 2 while len(self.cameras) == 3, which raises the ValueError.

if len(data.synchronized_imagery) != len(self.cameras):
    raise ValueError()

Have you encountered such issues before? Is it a problem arising during the dataset processing? I'm looking forward to receiving your reply.

alexanderswerdlow commented 8 months ago

Thank you for the kind words!

Sorry you encountered this error. I don't remember running into it myself; I most likely added this check just to be safe.

If this only happens during training to a small subset of the data, you can try something like this to bypass that one element:

return self.__getitem__((index + 1) % len(self))
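For anyone who wants a bounded version of this bypass, here is a minimal sketch. It is not code from BEVGen: `ToyCameraDataset` and `SkipIncompleteSamples` are hypothetical names, and the toy dataset only imitates the `raise ValueError()` check quoted above. The idea is the same (move on to the next index), but with a cap on the number of skips so that a dataset with no complete samples fails loudly instead of recursing forever.

```python
class ToyCameraDataset:
    """Hypothetical stand-in: raises ValueError when a sample lacks camera images."""

    def __init__(self, images_per_sample, expected_cameras=3):
        self.images_per_sample = images_per_sample
        self.expected_cameras = expected_cameras

    def __len__(self):
        return len(self.images_per_sample)

    def __getitem__(self, index):
        found = self.images_per_sample[index]
        if found != self.expected_cameras:
            raise ValueError(
                f"sample {index}: {found} of {self.expected_cameras} cameras"
            )
        return {"index": index, "num_images": found}


class SkipIncompleteSamples:
    """Hypothetical wrapper: returns the next loadable sample, with a skip limit."""

    def __init__(self, dataset, max_skips=10):
        self.dataset = dataset
        self.max_skips = max_skips

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        last_error = None
        # Try the requested index first, then up to max_skips following indices.
        for offset in range(self.max_skips + 1):
            candidate = (index + offset) % len(self.dataset)
            try:
                return self.dataset[candidate]
            except ValueError as error:
                last_error = error
        raise RuntimeError(
            f"no complete sample within {self.max_skips} skips of index {index}"
        ) from last_error


dataset = SkipIncompleteSamples(ToyCameraDataset([3, 2, 3]))
sample = dataset[1]  # index 1 is incomplete, so the sample at index 2 is returned
```

Note that a skipped index is replaced by its neighbour, so some samples are seen more than once. That is harmless for a handful of elements, but it is worth counting how often it happens before relying on it.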

It is expected that a few timestamps from Argoverse don't have a close enough frame from every camera, although I can't remember exactly where that should be handled. You could also try changing the tolerance here, which could alleviate this (at the risk of getting frames that are more out of sync).
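To illustrate the tolerance trade-off in isolation, here is a small sketch using `pandas.merge_asof`. It is only an analogy, with made-up timestamps and hypothetical column names (`lidar_time`, `camera_time`); it does not reproduce how the Argoverse dataloader builds its synchronization records. A tight tolerance leaves unmatched rows as NaT (the case that `pd.isna` catches above), while a looser one matches every row but accepts larger time offsets.

```python
import pandas as pd

# Made-up timestamps: three lidar sweeps and two camera frames.
lidar = pd.DataFrame({"lidar_time": pd.to_datetime([0, 100, 200], unit="ms")})
camera = pd.DataFrame({"camera_time": pd.to_datetime([5, 160], unit="ms")})


def match_nearest(tolerance_ms):
    """Pair each lidar sweep with the nearest camera frame within the tolerance."""
    return pd.merge_asof(
        lidar,
        camera,
        left_on="lidar_time",
        right_on="camera_time",
        direction="nearest",
        tolerance=pd.Timedelta(milliseconds=tolerance_ms),
    )


strict = match_nearest(20)  # sweeps at 100 ms and 200 ms stay unmatched (NaT)
loose = match_nearest(80)   # every sweep is matched, up to 60 ms out of sync

unmatched_strict = int(strict["camera_time"].isna().sum())
unmatched_loose = int(loose["camera_time"].isna().sum())
```

With the looser setting nothing is dropped, but the sweep at 100 ms is paired with a frame taken 60 ms later, which is the out-of-sync risk mentioned above.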

xm-W commented 8 months ago

Thank you very much for your response. We did manage to solve the issue by bypassing that element. Once again, I appreciate your helpfulness.

alexanderswerdlow / BEVGen

About inference error #6