I'm currently experimenting with a variation of the code from the ffcv-imagenet repo. When I repeatedly instantiate the ImageNetTrainer class and run several trainings back to back, I encounter the following error before and during the second training:
Exception ignored in: <finalize object at 0x7f8ee0524e20; dead>
Traceback (most recent call last):
File "/home/thomas/conda/envs/ffcv/lib/python3.9/weakref.py", line 591, in __call__
return info.func(*info.args, **(info.kwargs or {}))
File "/home/thomas/conda/envs/ffcv/lib/python3.9/site-packages/numba/core/dispatcher.py", line 312, in finalizer
for cres in overloads.values():
KeyError: (array(uint8, 1d, C), array(uint8, 1d, C), uint32, uint32, uint32, uint32, Literal[int](0), Literal[int](0), Literal[int](1), Literal[int](1), Literal[bool](False), Literal[bool](False))
ep=0, iter=11, shape=(64, 3, 160, 160), lrs=['0.001', '0.001']: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:12<00:00, 1.01s/it]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:03<00:00, 6.44it/s]
The error doesn't seem to affect training, and everything still works as expected. I suspect it refers to some dead objects that couldn't be cleaned up properly.
Unfortunately, I don't have a minimal code example ready for debugging this, because producing one would require quite a bit of work.
If there's anything I can do to help you track down the source of this error, let me know.
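For reference, the usage pattern that triggers it looks roughly like the sketch below. Note that `ImageNetTrainer` here is a placeholder stub standing in for the real class from ffcv-imagenet, whose constructor actually takes training configuration and sets up FFCV loaders; only the instantiate-train-discard loop structure matches my setup.

```python
class ImageNetTrainer:  # placeholder stub, not the real ffcv-imagenet class
    def __init__(self, config):
        # the real class builds FFCV data loaders here, which compile
        # numba-jitted pipelines under the hood
        self.config = config

    def train(self):
        # the real class runs the full training loop over the FFCV loader
        return f"trained with {self.config}"

results = []
for run in range(2):  # the finalizer error appears at the start of run 2
    trainer = ImageNetTrainer(config=f"run-{run}")
    results.append(trainer.train())
    del trainer  # the old trainer (and its loader state) is collected here
```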