Unable to label more than four regions in Audio

DGAzr commented 2 months ago

I am running into a strange issue with an audio transcription task I'm trying to complete. I am able to label four regions of my file without any issue, but as soon as I attempt to label a fifth I get an error and the following stack trace:

Traceback (most recent call last):
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/views.py", line 506, in dispatch
    response = handler(request, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/generics.py", line 242, in post
    return self.create(request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/mixins.py", line 17, in create
    serializer = self.get_serializer(data=request.data)
                                          ^^^^^^^^^^^^
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/request.py", line 216, in data
    self._load_data_and_files()
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/request.py", line 279, in _load_data_and_files
    self._data, self._files = self._parse()
                              ^^^^^^^^^^^^^
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/request.py", line 329, in _parse
    stream = self.stream
             ^^^^^^^^^^^
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/request.py", line 203, in stream
    self._load_stream()
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/request.py", line 309, in _load_stream
    self._stream = io.BytesIO(self.body)
                              ^^^^^^^^^
  File "/home/op/label-studio/lib/python3.11/site-packages/rest_framework/request.py", line 416, in __getattr__
    return getattr(self._request, attr)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/op/label-studio/lib/python3.11/site-packages/django/http/request.py", line 330, in body
    raise RawPostDataException("You cannot access body after reading from request's data stream")
django.http.request.RawPostDataException: You cannot access body after reading from request's data stream
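
For context on the final frame: Django raises this exception when something has already read from the request's underlying stream and code later asks for `request.body`. The sketch below is a toy model of that guard using only the standard library. `MiniRequest` and the local `RawPostDataException` are hypothetical stand-ins, not Django's actual classes, and the sketch does not explain why a fifth region triggers the early read in Label Studio.

```python
import io


class RawPostDataException(Exception):
    """Stand-in for django.http.request.RawPostDataException (assumption: toy copy)."""


class MiniRequest:
    """Toy model of the body/stream guard; NOT Django's real HttpRequest."""

    def __init__(self, payload: bytes):
        self._stream = io.BytesIO(payload)
        self._read_started = False

    def read(self, *args):
        # Any direct read from the stream marks the request as "read started".
        self._read_started = True
        return self._stream.read(*args)

    @property
    def body(self):
        if not hasattr(self, "_body"):
            # The body has not been cached yet; refuse if the stream was
            # already partially or fully consumed by someone else.
            if self._read_started:
                raise RawPostDataException(
                    "You cannot access body after reading from request's data stream"
                )
            self._body = self.read()
            # Re-wrap the cached bytes so later stream reads still work.
            self._stream = io.BytesIO(self._body)
        return self._body


if __name__ == "__main__":
    ok = MiniRequest(b'{"result": []}')
    print(ok.body)  # first access caches the bytes, later accesses reuse them

    bad = MiniRequest(b'{"result": []}')
    bad.read(4)  # something consumes part of the stream first
    try:
        bad.body
    except RawPostDataException as exc:
        print("raised:", exc)
```

In this model, accessing `body` first is always safe, while reading the stream first makes any later `body` access fail. That matches the order of events the traceback implies: something reads the stream before rest_framework's `_load_stream` asks for `self.body`.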

I also noticed a user in the Slack support channel reporting a nearly identical problem with a video task (four labels are fine, the fifth throws this error), so it may be reproducible in other task types too: Slack Support Thread

Labeling configuration:

<View>
    <Audio name="audio" value="$audio"/>
    <Labels name="label" toName="audio">
      <Label value="Speaker 1" background="#f500ff"/>
      <Label value="Speaker 2" background="#ff0300"/>
    </Labels>
    <TextArea name="Transcript" toName="audio" perRegion="true" showSubmitButton="true" maxSubmissions="1" editable="true" required="true"/>
</View>
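
As a quick sanity check while triaging, the configuration above can be parsed with Python's standard library to confirm it is well-formed XML and to list which control tags attach to the audio object. This is only a sketch: `summarize_config` is a hypothetical helper, not part of Label Studio, and it treats any element with a `toName` attribute as a control tag, which is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# The labeling configuration from this report, with whitespace normalized.
CONFIG = """<View>
    <Audio name="audio" value="$audio"/>
    <Labels name="label" toName="audio">
      <Label value="Speaker 1" background="#f500ff"/>
      <Label value="Speaker 2" background="#ff0300"/>
    </Labels>
    <TextArea name="Transcript" toName="audio" perRegion="true" showSubmitButton="true" maxSubmissions="1" editable="true" required="true"/>
</View>"""


def summarize_config(xml_text):
    """Return one dict per element that carries a toName attribute.

    Hypothetical helper: raises xml.etree.ElementTree.ParseError if the
    configuration is not well-formed XML.
    """
    root = ET.fromstring(xml_text)
    controls = []
    for element in root.iter():  # document order
        if "toName" in element.attrib:
            controls.append(
                {
                    "tag": element.tag,
                    "name": element.attrib.get("name"),
                    "to_name": element.attrib["toName"],
                    "per_region": element.attrib.get("perRegion") == "true",
                }
            )
    return controls


if __name__ == "__main__":
    for control in summarize_config(CONFIG):
        print(control)
```

For this configuration the helper reports two controls bound to `audio`: the `Labels` block and the per-region `TextArea`. The config parses cleanly, so malformed XML is an unlikely cause of the error.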

To Reproduce

Steps to reproduce the behavior:

1. Create a new project.
2. Use an MP3 file and a "perRegion" TextArea (my labeling configuration is above).
3. Label four regions in the file and enter transcripts.
4. Label a fifth region and attempt to enter a transcript.
   - The error occurs before the new annotation is even submitted, seemingly as the app attempts to process the text of the fifth transcript.

Expected behavior

Labeling more than four regions on an audio file should work.

Environment (please complete the following information):

Label Studio Version 1.12.1
Installed via pip on Debian Linux 12.6 (fully updated); also tried the official Docker image

jombooth commented 1 month ago

/jira create

Workflow run Jira issue TRIAG-762 is created

jombooth commented 1 month ago

Thanks for reporting, @DGAzr! A strange bug indeed; we'll investigate.

HumanSignal / label-studio

Unable to label more than four regions in Audio #6061