microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

Invalid parameter bricks service #338

Open mevince opened 9 months ago

mevince commented 9 months ago
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/ragged_batching.py", line 554, in __call__
    self.generate()
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/utils.py", line 31, in wrapper
    return func(self, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/ragged_batching.py", line 135, in generate
    next_tokens, done_tokens = self._process_logits(
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/utils.py", line 18, in wrapper
    result = func(self, *args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/ragged_batching.py", line 206, in _process_logits
    next_token_logits = self.logit_processor(next_token_logits,
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/postprocess.py", line 60, in run_batch_logit_processing
    output_logits = run_batch_processing(output_logits, requests, fns)
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/postprocess.py", line 32, in run_batch_processing
    output_list.append(process_fn(filtered_input))
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/generation/logit_processors.py", line 16, in __call__
    return self.forward(logits)
  File "/opt/conda/lib/python3.10/site-packages/mii/batching/generation/logit_processors.py", line 33, in forward
    indices_to_remove = logits < torch.topk(logits, self.top_k)[0][..., -1, None]
IndexError: index -1 is out of bounds for dimension 1 with size 0

Passing in top_k=0 (an invalid value) raises an exception in the batching thread, which never resumes. After that I am unable to call any further functions through the client, bricking the service.
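For context, this class of failure could be avoided by validating sampling parameters at the client boundary, before they ever reach the batching thread where an exception is fatal. A minimal sketch of such a guard (validate_generate_kwargs is a hypothetical helper, not part of MII's API):

```python
def validate_generate_kwargs(top_k=None, top_p=None, temperature=None):
    """Hypothetical guard: reject invalid sampling parameters up front,
    so a bad value raises in the caller instead of killing the
    server-side batching thread."""
    if top_k is not None and top_k < 1:
        raise ValueError(f"top_k must be a positive integer, got {top_k}")
    if top_p is not None and not 0.0 < top_p <= 1.0:
        raise ValueError(f"top_p must be in (0, 1], got {top_p}")
    if temperature is not None and temperature <= 0.0:
        raise ValueError(f"temperature must be positive, got {temperature}")
```

With a check like this, top_k=0 would fail fast with a ValueError on the caller's side instead of crashing the worker.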

I then restarted the same Python script; however, I ran into the following error

Exception in thread Thread-3 (_serve):
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/opt/conda/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/lib/python3.10/site-packages/grpc/_server.py", line 1212, in _serve
    if not _process_event_and_continue(state, event):
  File "/opt/conda/lib/python3.10/site-packages/grpc/_server.py", line 1172, in _process_event_and_continue
    rpc_state, rpc_future = _handle_call(
  File "/opt/conda/lib/python3.10/site-packages/grpc/_server.py", line 1049, in _handle_call
    return _handle_with_method_handler(
  File "/opt/conda/lib/python3.10/site-packages/grpc/_server.py", line 1000, in _handle_with_method_handler
    return state, _handle_unary_unary(
  File "/opt/conda/lib/python3.10/site-packages/grpc/_server.py", line 845, in _handle_unary_unary
    return thread_pool.submit(
  File "/opt/conda/lib/python3.10/concurrent/futures/thread.py", line 167, in submit
    raise RuntimeError('cannot schedule new futures after shutdown')
RuntimeError: cannot schedule new futures after shutdown

when trying to call client functions. Is there some state/resource that persists if we are unable to call client.terminate()?
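A general pattern that would prevent one bad request from taking down the whole service is to catch per-request exceptions inside the worker loop, fail only that request, and keep the thread alive. A rough sketch of the idea (a generic pattern, not MII's actual implementation):

```python
import queue
import threading


def worker_loop(requests: "queue.Queue", results: "queue.Queue"):
    # Sketch of a fault-tolerant worker loop: an exception while
    # processing one request is reported back for that request only,
    # and the loop keeps serving instead of letting the thread die.
    while True:
        req_id, fn, args = requests.get()
        if fn is None:  # sentinel: shut down cleanly
            break
        try:
            results.put((req_id, "ok", fn(*args)))
        except Exception as exc:
            results.put((req_id, "error", exc))
```

Under this pattern a request with top_k=0 would come back as an "error" result while later requests still succeed, rather than leaving the client hanging until the server is killed.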

mrwyattii commented 9 months ago

There is likely a process that did not shut down. If possible, can you check which Python processes are running (ps aux | grep python) and kill them manually?

Also, this points to the fact that we need better validation of the generate kwargs. I will put together a PR that handles invalid values more gracefully.