allenai / allennlp

An open-source NLP research library, built on PyTorch.
http://www.allennlp.org
Apache License 2.0
11.76k stars 2.25k forks

idea of worker_error in Iter(queue.get, (None,None)) #5517

Closed TrieuLe0801 closed 2 years ago

TrieuLe0801 commented 2 years ago
for instance, worker_error in iter(queue.get, (None, None)):
    if worker_error is not None:
        e, tb = worker_error
        raise WorkerError(e, tb)
    yield instance

I always get None for worker_error, in every case. I suspect the worker_error check is not working as intended. Can you explain it?

AkshitaB commented 2 years ago

@TrieuLe0801 Can you give more information? What allennlp version are you using, and what devices are you trying to distribute over?

TrieuLe0801 commented 2 years ago

Hi, thank you for your response. I am using the latest version of AllenNLP, 2.8.0, with Python 3.8. My laptop has 16 GB of RAM, an i7-7700HQ CPU, and a GTX 1060 GPU.

AkshitaB commented 2 years ago

@TrieuLe0801 And why do you expect/want to run into the worker error?

TrieuLe0801 commented 2 years ago

@AkshitaB I was running the test cases in the test folder to understand which error cases I need to avoid.

github-actions[bot] commented 2 years ago

@AkshitaB this is just a friendly ping to make sure you haven't forgotten about this issue 😜

dirkgr commented 2 years ago

@TrieuLe0801, getting None is a good thing. It means nothing went wrong in your worker. Why would you expect anything but None there?

TrieuLe0801 commented 2 years ago

> @TrieuLe0801, getting None is a good thing. It means nothing went wrong in your worker. Why would you expect anything but None there?

I tried triggering an error to understand which errors the project is guarding against. But worker_error was always None, so I think the raise WorkerError(e, tb) check is redundant. That is only my opinion based on my experiment, and I am not sure whether it is right or wrong. Can you help explain it?

dirkgr commented 2 years ago

This queue is written to from the MultiProcessDataLoader. Look at MultiProcessDataLoader._instance_worker(). If anything goes wrong in that method, it puts the error onto that queue.

Try configuring multiprocess data loading, and then write a DatasetReader that throws an exception. The queue is how the exception from the dataset reader gets propagated from the worker process to the main process.
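
To make the mechanism concrete, here is a minimal standalone sketch of the same protocol. It does not use AllenNLP: instance_worker, consume, good_reader, broken_reader and this WorkerError class are all hypothetical stand-ins written for illustration. The worker sends (instance, None) for each instance, sends (None, error_info) if reading fails, and always finishes with the (None, None) sentinel that stops the iter(queue.get, (None, None)) loop.

```python
import multiprocessing as mp
import traceback


class WorkerError(Exception):
    """Hypothetical stand-in: raised in the main process when a worker fails."""

    def __init__(self, original_repr, tb_text):
        super().__init__(f"worker failed with {original_repr}\n{tb_text}")
        self.original_repr = original_repr
        self.tb_text = tb_text


def instance_worker(read, queue):
    """Runs in the worker: forwards instances, or the error if reading fails."""
    try:
        for instance in read():
            queue.put((instance, None))
    except Exception as e:
        # Send strings, since exception and traceback objects may not pickle.
        queue.put((None, (repr(e), traceback.format_exc())))
    # (None, None) is the sentinel that ends the consumer's iter() loop.
    queue.put((None, None))


def consume(queue):
    """Runs in the main process: same loop shape as the snippet in question."""
    for instance, worker_error in iter(queue.get, (None, None)):
        if worker_error is not None:
            e, tb = worker_error
            raise WorkerError(e, tb)
        yield instance


def good_reader():
    # Stand-in for a dataset reader that works.
    yield from ["a", "b", "c"]


def broken_reader():
    # Stand-in for a dataset reader that fails partway through.
    yield "a"
    raise ValueError("bad line in dataset")


if __name__ == "__main__":
    for read in (good_reader, broken_reader):
        mp_queue = mp.Queue()
        worker = mp.Process(target=instance_worker, args=(read, mp_queue), daemon=True)
        worker.start()
        try:
            print(list(consume(mp_queue)))
        except WorkerError as err:
            print("caught:", err.original_repr)
        worker.join()
```

With good_reader, worker_error is None on every item, which matches what you observed. Only broken_reader makes the worker put a non-None error on the queue, and that is the case the raise WorkerError(e, tb) branch exists for.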

AkshitaB commented 2 years ago

@TrieuLe0801 Closing this issue. Feel free to reopen if required.