Open liamkboyle opened 4 months ago
Hi guys, I was trying to train the TID network using the train.py script with the tid_train.yaml config. I downloaded the DSEC dataset and placed it in the correct folders as per the README. During training, the run crashes after some iterations with an error in retrieval_fn.py: https://github.com/tudelft/idnet/blob/6e9ade0c1e3c4458528201eeb1445db9d4c6b2d5/idn/utils/retrieval_fn.py#L45-L48
It seems that the retrieval function expects a gt_flow_next for every element of the batch, but the dataloader here: https://github.com/tudelft/idnet/blob/6e9ade0c1e3c4458528201eeb1445db9d4c6b2d5/idn/loader/loader_dsec.py#L396-L404 will occasionally load a sample from the end of a recording, for which there is no gt_flow_next.
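For anyone wanting to confirm this before touching the loader, here is a minimal sketch of the check. It assumes each sample is a plain dict and that a missing successor flow shows up as an absent or None gt_flow_next entry; the helper name and the toy batch are hypothetical and not taken from the idnet code.

```python
def find_samples_missing_key(samples, key="gt_flow_next"):
    """Return the indices of samples (dicts) where `key` is absent or None."""
    return [i for i, sample in enumerate(samples) if sample.get(key) is None]


# Hypothetical batch: the last sample sits at the end of a recording,
# so it has a current flow map but no successor.
batch = [
    {"gt_flow": "flow_0", "gt_flow_next": "flow_1"},
    {"gt_flow": "flow_1", "gt_flow_next": "flow_2"},
    {"gt_flow": "flow_2"},
]

missing = find_samples_missing_key(batch)
print(missing)
```

Running a check like this over the samples before they are collated should show whether the crash lines up with end-of-recording samples.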
By changing the +1 to a -1 in line 524 here, the training runs without errors: https://github.com/tudelft/idnet/blob/6e9ade0c1e3c4458528201eeb1445db9d4c6b2d5/idn/loader/loader_dsec.py#L520-L534
Could it be that this was a typo in the code or am I missing something?
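To illustrate the off-by-one I suspect, here is a rough sketch of how the valid sample indices of one recording shrink when every sample must also have a next flow map. This is not the code from loader_dsec.py; the function name, its parameters, and the assumption that flow maps are indexed 0 to N-1 per recording are all mine.

```python
def valid_start_indices(num_flow_maps, require_next=True):
    """Indices i for which flow map i (and, if required, i + 1) exists."""
    # When a successor is required, the last flow map cannot start a sample.
    reserved = 1 if require_next else 0
    return list(range(max(num_flow_maps - reserved, 0)))


# Hypothetical recording with 5 ground-truth flow maps.
with_next = valid_start_indices(5)
without_next = valid_start_indices(5, require_next=False)
print(with_next)
print(without_next)
```

If the dataset length is computed with the wrong sign on that offset, the sampler can pick an index at the end of the recording that has no gt_flow_next, which would match the crash described above.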
Hi,
You are probably right! Apologies that this error made it into the release. I will take a look at it towards the end of this week.
Thanks for your understanding!