deepset-ai / FARM

:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
https://farm.deepset.ai
Apache License 2.0

Error on inference for batch of size 1 #350

Closed: qgroshens closed this issue 4 years ago

qgroshens commented 4 years ago

Describe the bug

When performing inference with inference_from_dicts on a dictionary batch of size 1, a torch error related to batch normalization is raised.

Error message

ValueError                                Traceback (most recent call last)
<ipython-input-29-8df0858bd894> in <module>
----> 1 result = inferencer.inference_from_dicts(dicts=[{"text": "stress"}])

~/qg_dev/pytorch_env/lib/python3.6/site-packages/farm/infer.py in inference_from_dicts(self, dicts, rest_api_schema, multiprocessing_chunksize, streaming)
    384             # return a generator object if streaming is enabled, else, cast the generator to a list.
    385             if not streaming and type(predictions) != list:
--> 386                 return list(predictions)
    387             else:
    388                 return predictions

~/qg_dev/pytorch_env/lib/python3.6/site-packages/farm/infer.py in _inference_with_multiprocessing(self, dicts, rest_api_schema, aggregate_preds, multiprocessing_chunksize)
    452                 )
    453             else:
--> 454                 predictions = self._get_predictions(dataset, tensor_names, baskets, rest_api_schema)
    455             yield from predictions
    456 

~/qg_dev/pytorch_env/lib/python3.6/site-packages/farm/infer.py in _get_predictions(self, dataset, tensor_names, baskets, rest_api_schema)
    492             # get logits
    493             with torch.no_grad():
--> 494                 logits = self.model.forward(**batch)[0]
    495                 preds = self.model.formatted_preds(
    496                     logits=[logits],

~/qg_dev/pytorch_env/lib/python3.6/site-packages/farm/modeling/adaptive_model.py in forward(self, **kwargs)
    392 
    393         # Run forward pass of language model
--> 394         sequence_output, pooled_output = self.forward_lm(**kwargs)
    395 
    396         # Run forward pass of (multiple) prediction heads using the output from above

~/qg_dev/pytorch_env/lib/python3.6/site-packages/farm/modeling/adaptive_model.py in forward_lm(self, **kwargs)
    434         # Run forward pass of language model
    435         if extraction_layer == -1:
--> 436             sequence_output, pooled_output = self.language_model(**kwargs, output_all_encoded_layers=False)
    437         else:
    438             # get output from an earlier layer

~/qg_dev/pytorch_env/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/qg_dev/pytorch_env/lib/python3.6/site-packages/farm/modeling/language_model.py in forward(self, input_ids, **kwargs)
   1098         pooled_output = torch.stack(pooled_output)
   1099         m = nn.BatchNorm1d(pooled_output.shape[1]) # batchnorm for stable learning
--> 1100         pooled_output = m(pooled_output)
   1101         return sequence_output, pooled_output
   1102 

~/qg_dev/pytorch_env/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    530             result = self._slow_forward(*input, **kwargs)
    531         else:
--> 532             result = self.forward(*input, **kwargs)
    533         for hook in self._forward_hooks.values():
    534             hook_result = hook(self, input, result)

~/qg_dev/pytorch_env/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py in forward(self, input)
    105             input, self.running_mean, self.running_var, self.weight, self.bias,
    106             self.training or not self.track_running_stats,
--> 107             exponential_average_factor, self.eps)
    108 
    109 

~/qg_dev/pytorch_env/lib/python3.6/site-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   1664             size_prods *= size[i + 2]
   1665         if size_prods == 1:
--> 1666             raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
   1667 
   1668     return torch.batch_norm(

ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256])
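
For context, the failure is reproducible in plain PyTorch, independent of FARM: a freshly constructed nn.BatchNorm1d defaults to training mode, and training-mode batch norm cannot compute batch statistics from a single sample. A minimal sketch:

import torch
import torch.nn as nn

m = nn.BatchNorm1d(256)  # a newly constructed module defaults to training mode
x = torch.randn(1, 256)  # a batch of size 1, matching torch.Size([1, 256]) above
m(x)                     # raises ValueError: Expected more than 1 value per channel when training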

Expected behavior

inference_from_dicts should return predictions for the provided text even when the batch contains only a single sample.

Additional context

Using a custom WordEmbedding_LM model.

To Reproduce

# imports (module paths taken from the traceback above)
from farm.modeling.tokenization import Tokenizer
from farm.data_handler.processor import InferenceProcessor
from farm.modeling.language_model import WordEmbedding_LM
from farm.modeling.adaptive_model import AdaptiveModel
from farm.infer import Inferencer

# fasttext_model, do_lower_case and device are user-specific and assumed to be defined
tokenizer = Tokenizer.load(pretrained_model_name_or_path=fasttext_model, do_lower_case=do_lower_case)
processor = InferenceProcessor(tokenizer=tokenizer, max_seq_len=128)
language_model = WordEmbedding_LM.load(fasttext_model)
model = AdaptiveModel(
    language_model=language_model,
    prediction_heads=[],
    embeds_dropout_prob=0.1,
    lm_output_types=["per_sequence"],
    device=device)
inferencer = Inferencer(model=model, processor=processor, task_type="embeddings", gpu=True,
                        extraction_strategy="reduce_mean", batch_size=1, extraction_layer=-1,
                        num_processes=1)
result = inferencer.inference_from_dicts(dicts=[{"text": "stress"}])

Running

result = inferencer.inference_from_dicts(dicts=[{"text": "stress"}, {"text": "stress"}])

still raises the error, presumably because batch_size=1 means each batch still contains a single sample.

inferencer = Inferencer(model=model, processor=processor, task_type="embeddings", gpu=True,
                        extraction_strategy="reduce_mean", batch_size=2, extraction_layer=-1,
                        num_processes=1)
result = inferencer.inference_from_dicts(dicts=[{"text": "stress"}, {"text": "stress"}])

works fine.

Timoeller commented 4 years ago

Hey @qgroshens, thanks for catching this bug. The problem comes from batch normalization in combination with batch size 1: a single sample cannot be normalized to have zero mean and unit variance, so PyTorch raises the ValueError above.
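
For illustration, a minimal sketch of one way to avoid the error (only a possible guard, not necessarily the exact change in the commit; normalize_pooled is a hypothetical helper): since the BatchNorm1d created ad hoc inside forward still has its default running statistics, switching it to eval mode for single-sample batches effectively skips batch-statistics normalization instead of crashing.

import torch
import torch.nn as nn

def normalize_pooled(pooled_output):
    # hypothetical guard around the batch norm applied to pooled_output
    m = nn.BatchNorm1d(pooled_output.shape[1])
    if pooled_output.shape[0] == 1:
        # eval mode uses the (default) running statistics instead of batch
        # statistics, so a single-sample batch no longer raises a ValueError
        m.eval()
    return m(pooled_output)

normalize_pooled(torch.randn(1, 256))  # works
normalize_pooled(torch.randn(2, 256))  # unchanged behavior for larger batches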

This small commit fixes the problem. The change is on current master but not yet in an official release, i.e. it is not part of FARM 0.4.3. Hope this solves the problem on your end.

Timoeller commented 4 years ago

Issue fixed, closing now.