Closed: jmlongriver closed this issue 4 years ago
Due to multiprocessing, the printed "random_samples" may not be the ones that actually caused the exception. Could you please set 'sampling_processes = 0' in the config file (e.g. 'example_train.conf', or whichever one you are using) and re-run the experiment?
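With sampling running in the main process, the sample printed right before the crash should be the one that triggered it. For reference, this is roughly how the line would look in the config file (a sketch assuming the file's plain key = value format):

```ini
# Run batch sampling in the main process instead of worker processes,
# so printed debug output and the exception come from the same sample.
sampling_processes = 0
```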
I suspect that there is a training sample in your dataset which does not contain any possible negative entity (e.g. an empty string or a string where all subsequences are also positive entities). We currently do not handle this corner case.
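To make the corner case concrete, here is a small illustrative sketch (not the repository's actual sampling code) of how the pool of negative span candidates can end up empty, e.g. for a one-token sentence whose only span is already a positive entity:

```python
# Illustrative sketch only: enumerate candidate negative entity spans
# and see how the candidate list can end up empty.
def negative_entity_spans(token_count, pos_entity_spans, max_span_size):
    """Return all spans up to max_span_size that are not positive entities."""
    candidates = []
    for size in range(1, max_span_size + 1):
        for start in range(0, token_count - size + 1):
            span = (start, start + size)
            if span not in pos_entity_spans:
                candidates.append(span)
    return candidates

# A one-token sentence whose only possible span is already a positive entity:
print(negative_entity_spans(token_count=1, pos_entity_spans={(0, 1)}, max_span_size=10))
# -> []  (no negative candidates, so a later zip(*random_samples) has nothing to unpack)
```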
You are right, I also found that issue. Can I simply add an extra if statement that checks whether the random sample result is empty and, if it is, skips the unzip?
Yes, this works in the case of sentences where all subsequences are entities. I committed this corner-case handling in https://github.com/markus-eberts/spert/commit/53852345465eb2caddc81b939c05f0d42a82b0f1.
Please verify that this works for you. Still, if your training set contains samples with empty sentences (no tokens), you should remove them.
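For readers following along, here is a minimal sketch of such a guard (variable names are taken from the traceback in sampling.py; the linked commit is the authoritative fix and may differ in detail):

```python
# Guard against samples that produced no negative entity candidates.
# This is a sketch, not the committed code.
if random_samples:
    neg_entity_spans, neg_entity_sizes = zip(*random_samples)
else:
    # No negative candidates for this sample: use empty collections
    # instead of failing on the unpack.
    neg_entity_spans, neg_entity_sizes = [], []
```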
I ran the program on my dataset and got the error shown below. According to the traceback, something goes wrong in the unzip step, so I printed out all the random samples and the unzip result, but I cannot spot anything wrong with them. Do you have any hints? Thanks.
```
random samples: [((5, 8), 1), ((3, 9), 4), ((8, 9), 1), ((4, 5), 1), ((1, 9), 6), ((2, 5), 3), ((2, 4), 2), ((4, 9), 3), ((2, 8), 4), ((3, 4), 1), ((2, 9), 5), ((2, 3), 1), ((1, 8), 5), ((5, 9), 2), ((3, 5), 2), ((1, 3), 2), ((1, 5), 4), ((3, 8), 3), ((1, 2), 1), ((1, 4), 3)]
unzip result ((5, 8), (3, 9), (8, 9), (4, 5), (1, 9), (2, 5), (2, 4), (4, 9), (2, 8), (3, 4), (2, 9), (2, 3), (1, 8), (5, 9), (3, 5), (1, 3), (1, 5), (3, 8), (1, 2), (1, 4)) (1, 4, 1, 1, 6, 3, 2, 3, 4, 1, 5, 1, 5, 2, 2, 2, 4, 3, 1, 3)
Train epoch 0:   0%|▎ | 13/3980 [00:04<22:55, 2.88it/s]
/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
Process SpawnProcess-1:
/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/lrlhps/users/c272987/spert/spert/sampling.py", line 237, in _produce_train_batch
    sample = _create_train_sample(d, neg_entity_count, neg_rel_count, max_span_size, context_size)
  File "/lrlhps/users/c272987/spert/spert/sampling.py", line 296, in _create_train_sample
    neg_entity_spans, neg_entity_sizes = zip(*random_samples)
ValueError: not enough values to unpack (expected 2, got 0)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/lrlhps/users/c272987/spert/spert.py", line 12, in train
    types_path=run_args.types_path, input_reader_cls=input_reader.JsonInputReader)
  File "/lrlhps/users/c272987/spert/spert/spert_trainer.py", line 111, in train
    input_reader.context_size, input_reader.relation_type_count)
  File "/lrlhps/users/c272987/spert/spert/spert_trainer.py", line 182, in _train_epoch
    for batch in tqdm(sampler, total=total, desc='Train epoch %s' % epoch):
  File "/lrlhps/users/c272987/spert/env/lib/python3.6/site-packages/tqdm/_tqdm.py", line 955, in __iter__
    for obj in iterable:
  File "/lrlhps/users/c272987/spert/spert/sampling.py", line 155, in __next__
    batch, = self._results.next()
  File "/lrlhps/apps/python/python-3.6.5/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
ValueError: not enough values to unpack (expected 2, got 0)
```
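For reference, the final ValueError can be reproduced in isolation: unpacking the result of zip over an empty list yields zero values where two are expected.

```python
random_samples = []                  # a sample with no negative entity candidates
spans, sizes = zip(*random_samples)  # ValueError: not enough values to unpack (expected 2, got 0)
```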