Wordcab / wordcab-transcribe

💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.
https://wordcab.github.io/wordcab-transcribe/
MIT License
198 stars, 29 forks

batch_size throws an error. #306

Closed: yf-chau closed this issue 5 months ago

yf-chau commented 6 months ago

While testing the 0.5.3 server, I ran into the following error when calling the API. It seems to be caused by pydantic rejecting the batch_size value as invalid.

INFO:     Uvicorn running on http://0.0.0.0:5001 (Press CTRL+C to quit)
2024-05-08 10:50:36.590 | INFO     | wordcab_transcribe.logging:dispatch:68 - Task [3e170055-5a74-4412-821d-5b56bdc6c181] | POST http://localhost:5001/api/v1/audio
INFO:     172.17.0.1:34958 - "POST /api/v1/audio HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
  + Exception Group Traceback (most recent call last):
  |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/_utils.py", line 87, in collapse_excgroups
  |     yield
  |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/base.py", line 190, in __call__
  |     async with anyio.create_task_group() as task_group:
  |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 678, in __aexit__
  |     raise BaseExceptionGroup(
  | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
    |     result = await app(  # type: ignore[func-returns-value]
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
    |     return await self.app(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
    |     await super().__call__(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/applications.py", line 123, in __call__
    |     await self.middleware_stack(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/errors.py", line 186, in __call__
    |     raise exc
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/errors.py", line 164, in __call__
    |     await self.app(scope, receive, _send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/base.py", line 189, in __call__
    |     with collapse_excgroups():
    |   File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    |     self.gen.throw(typ, value, traceback)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/_utils.py", line 93, in collapse_excgroups
    |     raise exc
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/base.py", line 191, in __call__
    |     response = await self.dispatch_func(request, call_next)
    |   File "/app/src/wordcab_transcribe/logging.py", line 72, in dispatch
    |     response = await call_next(request)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/base.py", line 165, in call_next
    |     raise app_exc
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/base.py", line 151, in coro
    |     await self.app(scope, receive_or_disconnect, send_no_error)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    |     raise exc
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    |     await app(scope, receive, sender)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/routing.py", line 758, in __call__
    |     await self.middleware_stack(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/routing.py", line 778, in app
    |     await route.handle(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/routing.py", line 299, in handle
    |     await self.app(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/routing.py", line 79, in app
    |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    |     raise exc
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    |     await app(scope, receive, sender)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
    |     response = await func(request)
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/fastapi/routing.py", line 278, in app
    |     raw_response = await run_endpoint_function(
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    |     return await dependant.call(**values)
    |   File "/app/src/wordcab_transcribe/router/v1/audio_file_endpoint.py", line 76, in inference_with_audio
    |     data = AudioRequest(
    |   File "/root/.local/share/hatch/env/virtual/wordcab-transcribe/9TtSrW0h/runtime/lib/python3.10/site-packages/pydantic/main.py", line 171, in __init__
    |     self.__pydantic_validator__.validate_python(data, self_instance=self)
    | pydantic_core._pydantic_core.ValidationError: 1 validation error for AudioRequest
    | batch_size
    |   Input should be a valid integer [type=int_type, input_value=None, input_type=NoneType]
    |     For further information visit https://errors.pydantic.dev/2.6/v/int_type
    +------------------------------------

An earlier log line, emitted before this traceback, reads: 2024-05-08 10:34:05.913 | INFO | wordcab_transcribe.services.diarization.diarize_service:__init__:95 - segmentation_batch_size set to None

Any advice on debugging would be welcome.
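
For anyone trying to reproduce the validation failure outside the server, here is a minimal sketch. It assumes pydantic v2 (the traceback links to the 2.6 error docs). `AudioRequestSketch` is a hypothetical stand-in, not the project's real `AudioRequest`, and the `int = 1` declaration is an assumption made only for illustration. It shows that an explicit None is rejected for an `int` field even when the field has a default, which matches the `int_type` error above.

```python
from pydantic import BaseModel, ValidationError


class AudioRequestSketch(BaseModel):
    """Hypothetical stand-in for AudioRequest; only batch_size is modeled."""

    # Assumed declaration: the default applies only when the argument is omitted.
    batch_size: int = 1


def validation_error_types(**fields):
    """Return the pydantic error types raised for the given fields, or [] if valid."""
    try:
        AudioRequestSketch(**fields)
    except ValidationError as exc:
        return [error["type"] for error in exc.errors()]
    return []


# An explicit None fails validation; omitting the field uses the default.
print(validation_error_types(batch_size=None))
print(validation_error_types())
```

If the real model is declared along these lines, the 500 would come from the endpoint forwarding None when the client does not send batch_size, rather than from a bad value supplied by the client.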

aleksandr-smechov commented 6 months ago

@yf-chau Hey! Apologies for the late reply. As a workaround, you can try setting batch_size to 1 in your request. I will fix this in a future update.
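
On the client side, the workaround could look roughly like the sketch below. `with_batch_size` is a hypothetical helper, not part of wordcab-transcribe, and the only request field it assumes is `batch_size`, taken from the error message. It returns a copy of the request fields with batch_size set to 1 whenever the field is missing or None, so the server never receives an empty value.

```python
def with_batch_size(fields, default=1):
    """Return a copy of the request fields with a usable integer batch_size.

    Hypothetical helper: fills in `default` when batch_size is missing or None
    and leaves any explicit integer untouched.
    """
    patched = dict(fields)  # copy so the caller's dict is not mutated
    if patched.get("batch_size") is None:
        patched["batch_size"] = default
    return patched


# Fields to send with the POST to /api/v1/audio (other fields omitted here).
form_fields = with_batch_size({"batch_size": None})
print(form_fields)
```

The helper only fills the gap; if you already pass a larger batch size, it is kept as is.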