microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

BUG in run_batch_processing #471

Open zhihui96 opened 1 month ago

zhihui96 commented 1 month ago

https://github.com/microsoft/DeepSpeed-MII/blob/d5468112bffe2b93228bb9f6f16aef84029a3d30/mii/batching/postprocess.py#L39-L41

The code at these lines was probably meant to be:

idx_list.extend(unprocessed_idx) 

Otherwise, it can lead to the following error:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/lib/python3.8/site-packages/mii/batching/ragged_batching.py", line 650, in __call__
    self.generate()
  File "/opt/conda/lib/python3.8/site-packages/mii/batching/utils.py", line 31, in wrapper
    return func(self, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/mii/batching/ragged_batching.py", line 116, in generate
    next_tokens, done_tokens = self._process_logits(
  File "/opt/conda/lib/python3.8/site-packages/mii/batching/utils.py", line 18, in wrapper
    result = func(self, *args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/mii/batching/ragged_batching.py", line 187, in _process_logits
    next_token_logits = self.logit_processor(next_token_logits,
  File "/opt/conda/lib/python3.8/site-packages/mii/batching/postprocess.py", line 60, in run_batch_logit_processing
    output_logits = run_batch_processing(output_logits, requests, fns)
  File "/opt/conda/lib/python3.8/site-packages/mii/batching/postprocess.py", line 46, in run_batch_processing
    return output[torch.argsort(torch.tensor(idx_list))]
TypeError: an integer is required (got type list)
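
To illustrate why the traceback points at `idx_list`, here is a minimal sketch in plain Python. It assumes the linked lines currently add `unprocessed_idx` with `append` (the linked source is not reproduced here, so this is a guess), and the index values and the `argsort` and `is_flat` helpers are made up for the example. `append` nests the whole list as a single element, so the list is no longer a flat list of integers, which is consistent with the `TypeError` raised when it is passed to `torch.tensor`.

```python
def is_flat(idx_list):
    """Return True if every element is an int, as an argsort over indices needs."""
    return all(isinstance(i, int) for i in idx_list)


def argsort(values):
    """Pure-Python stand-in for torch.argsort on a flat list of ints."""
    return sorted(range(len(values)), key=values.__getitem__)


# Hypothetical indices: requests already handled by some processing function,
# followed by the requests that needed no processing.
processed_idx = [2, 0]
unprocessed_idx = [1, 3]

# append adds the whole list as ONE element, producing a nested list.
with_append = list(processed_idx)
with_append.append(unprocessed_idx)

# extend adds each index individually, keeping the list flat.
with_extend = list(processed_idx)
with_extend.extend(unprocessed_idx)

# Only the flat list can be argsorted to restore the original request order.
restore_order = argsort(with_extend)

print(with_append, is_flat(with_append))
print(with_extend, is_flat(with_extend))
print(restore_order)
```

With `extend`, the list stays flat and the argsort yields a valid permutation for putting the outputs back in request order.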