facebookresearch / EgoVLPv2

Code release for "EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone" [ICCV, 2023]
MIT License

OSError: handle is closed #10

Closed Dongzhikang closed 7 months ago

Dongzhikang commented 7 months ago

Dear authors,

I have a question about the training/testing process. Whenever I finish training one epoch or finish testing on a dataset, I encounter the following error. I am using a single A40 GPU, and I have already changed n_gpu to 1. Could you please help me with this? Thank you so much!

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/anaconda3/envs/egovlpv2/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/anaconda3/envs/egovlpv2/lib/python3.10/site-packages/tensorboardX/event_file_writer.py", line 202, in run
    data = self._queue.get(True, queue_wait_duration)
  File "/anaconda3/envs/egovlpv2/lib/python3.10/multiprocessing/queues.py", line 117, in get
    res = self._recv_bytes()
  File "/anaconda3/envs/egovlpv2/lib/python3.10/multiprocessing/connection.py", line 212, in recv_bytes
    self._check_closed()
  File "/anaconda3/envs/egovlpv2/lib/python3.10/multiprocessing/connection.py", line 136, in _check_closed
    raise OSError("handle is closed")
OSError: handle is closed

Dongzhikang commented 7 months ago

I solved this problem by closing the writer on the rank-0 process, adding the following right after trainer.train(gpu):

    if args.rank == 0:
        writer.close()
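For anyone hitting the same traceback, here is a minimal sketch of the pattern behind that fix. It does not use the repository's code or tensorboardX itself: StandInWriter is a hypothetical stand-in that, like the event writer in the traceback, drains a queue on a background thread, and main_worker and the training loop are placeholders for trainer.train(gpu). The point it illustrates is that the writer is closed explicitly on rank 0, so the background thread finishes before the process exits.

```python
import queue
import threading
from types import SimpleNamespace


class StandInWriter:
    """Hypothetical stand-in for a TensorBoard-style writer (not tensorboardX).

    Events are pushed onto a queue and consumed by a background thread,
    which is the arrangement visible in the traceback above.
    """

    _STOP = object()  # sentinel telling the background thread to exit

    def __init__(self):
        self._queue = queue.Queue()
        self.events = []
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            self.events.append(item)

    def add_scalar(self, tag, value, step):
        self._queue.put((tag, value, step))

    def close(self):
        # Ask the thread to stop, then wait until it has drained the queue.
        self._queue.put(self._STOP)
        self._thread.join()

    @property
    def is_alive(self):
        return self._thread.is_alive()


def main_worker(args, writer):
    """Placeholder for the training entry point (assumed name)."""
    try:
        # Stand-in for trainer.train(gpu): log a few values on rank 0.
        for step in range(3):
            if args.rank == 0:
                writer.add_scalar("loss", 1.0 / (step + 1), step)
    finally:
        # The fix from this thread: close the writer on rank 0 when done.
        if args.rank == 0:
            writer.close()


args = SimpleNamespace(rank=0)  # single-GPU run, so the only rank is 0
writer = StandInWriter()
main_worker(args, writer)
```

Putting the close call in a finally block is a choice made for this sketch, not part of the original fix; it just ensures the writer is also closed if training raises.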