rmusser01 / tldw

tl/dw (Too Long, Didn't Watch): Your Personal Research Multi-Tool - a naive attempt at 'A Young Lady's Illustrated Primer'
https://tldwproject.com
Apache License 2.0

transcribe a mp3 audio file #412

Open jcl2023 opened 3 weeks ago

jcl2023 commented 3 weeks ago

I got the following error when uploading a mp3 file to transcribe:

error "Invalid file type. Please upload a file that is one of these formats: ['audio/*']"

rmusser01 commented 3 weeks ago

I'm unable to replicate this issue. I can successfully upload and transcribe MP3 files.

Please post the full console output when running the application using python summarize.py -gui -log DEBUG

jcl2023 commented 3 weeks ago

To create a public link, set share=True in launch().
DEBUG:multipart.multipart:Calling on_part_begin with no data
DEBUG:multipart.multipart:Calling on_header_field with data[42:61]
DEBUG:multipart.multipart:Calling on_header_value with data[63:126]
DEBUG:multipart.multipart:Calling on_header_end with no data
DEBUG:multipart.multipart:Calling on_header_field with data[128:140]
DEBUG:multipart.multipart:Calling on_header_value with data[142:152]
DEBUG:multipart.multipart:Calling on_header_end with no data
DEBUG:multipart.multipart:Calling on_headers_finished with no data
DEBUG:multipart.multipart:Calling on_part_data with data[156:23360]
DEBUG:multipart.multipart:Calling on_part_data with data[0:45260]
DEBUG:multipart.multipart:Calling on_part_data with data[0:75268]
DEBUG:multipart.multipart:Calling on_part_data with data[0:73648]
DEBUG:multipart.multipart:Calling on_part_data with data[0:54180]
DEBUG:multipart.multipart:Calling on_part_data with data[0:70080]
DEBUG:multipart.multipart:Calling on_part_data with data[0:42340]
DEBUG:multipart.multipart:Calling on_part_data with data[0:62780]
DEBUG:multipart.multipart:Calling on_part_data with data[0:37960]
DEBUG:multipart.multipart:Calling on_part_data with data[0:43800]
DEBUG:multipart.multipart:Calling on_part_data with data[0:40060]
DEBUG:multipart.multipart:Calling on_part_data with data[0:37472]
DEBUG:multipart.multipart:Calling on_part_data with data[0:56452]
DEBUG:multipart.multipart:Calling on_part_data with data[0:74620]
DEBUG:multipart.multipart:Calling on_part_data with data[0:72836]
DEBUG:multipart.multipart:Calling on_part_data with data[0:74620]
DEBUG:multipart.multipart:Calling on_part_data with data[0:72836]
DEBUG:multipart.multipart:Calling on_part_data with data[0:74620]
DEBUG:multipart.multipart:Calling on_part_data with data[0:81920]
DEBUG:multipart.multipart:Calling on_part_data with data[0:81920]
DEBUG:multipart.multipart:Calling on_part_data with data[0:42988]
DEBUG:multipart.multipart:Calling on_part_data with data[0:38932]
DEBUG:multipart.multipart:Calling on_part_data with data[0:60832]
DEBUG:multipart.multipart:Calling on_part_data with data[0:65536]
DEBUG:multipart.multipart:Calling on_part_data with data[0:66996]
DEBUG:multipart.multipart:Calling on_part_data with data[0:74620]
DEBUG:multipart.multipart:Calling on_part_data with data[0:10544]
DEBUG:multipart.multipart:Calling on_part_data with data[0:81920]
DEBUG:multipart.multipart:Calling on_part_data with data[0:75756]
DEBUG:multipart.multipart:Calling on_part_data with data[0:91004]
DEBUG:multipart.multipart:Calling on_part_data with data[0:68456]
DEBUG:multipart.multipart:Calling on_part_data with data[0:65536]
DEBUG:multipart.multipart:Calling on_part_data with data[0:76892]
DEBUG:multipart.multipart:Calling on_part_data with data[0:70080]
DEBUG:multipart.multipart:Calling on_part_data with data[0:47665]
DEBUG:multipart.multipart:Calling on_part_end with no data
DEBUG:multipart.multipart:Calling on_end with no data
DEBUG:matplotlib.pyplot:Loaded backend tkagg version 8.6.
DEBUG:matplotlib.pyplot:Loaded backend agg version v2.2.
DEBUG:matplotlib.pyplot:Loaded backend tkagg version 8.6.
Traceback (most recent call last):
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/queueing.py", line 622, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/blocks.py", line 2012, in process_api
    inputs = await self.preprocess_data(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1711, in preprocess_data
    processed_input.append(block.preprocess(inputs_cached))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/components/file.py", line 163, in preprocess
    return self._process_single_file(payload)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/components/file.py", line 132, in _process_single_file
    raise Error(
gradio.exceptions.Error: "Invalid file type. Please upload a file that is one of these formats: ['audio/*']"
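
For anyone trying to narrow this down: the traceback shows the upload being rejected during input preprocessing, before any transcription starts. One thing that can be checked locally is what MIME type the Python standard library guesses for the file name. The sketch below is not Gradio's implementation; `matches_allowed_types` is a hypothetical helper, and it assumes a pattern such as 'audio/*' is compared against the type guessed from the file name alone.

```python
import fnmatch
import mimetypes


def matches_allowed_types(filename, allowed_patterns):
    """Hypothetical helper, NOT Gradio's code.

    Returns True if the MIME type guessed from the file name matches
    any glob-style pattern such as 'audio/*'.
    """
    guessed_type, _encoding = mimetypes.guess_type(filename)
    if guessed_type is None:
        # No MIME type could be derived from the extension.
        return False
    return any(fnmatch.fnmatch(guessed_type, pattern) for pattern in allowed_patterns)


if __name__ == "__main__":
    name = "It Is Finished Spoken Word.mp3"
    print("guessed MIME type:", mimetypes.guess_type(name)[0])
    print("accepted:", matches_allowed_types(name, ["audio/*"]))
```

If the guessed type printed on the affected machine is not an audio type, that would point at the local MIME configuration; if it is, the rejection is more likely coming from how the allowed types are compared.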

rmusser01 commented 3 weeks ago

Does the file name end in .mp3? Try ingesting the attached mp3 file (It Is Finished Spoken Word.zip) and let me know if you get the same error.

jcl2023 commented 3 weeks ago

Yes, my mp3 filename ends with .mp3.

I tried your mp3 file, It Is Finished Spoken Word.mp3, and got the same error:

DEBUG:multipart.multipart:Calling on_part_begin with no data
DEBUG:multipart.multipart:Calling on_header_field with data[42:61]
DEBUG:multipart.multipart:Calling on_header_value with data[63:129]
DEBUG:multipart.multipart:Calling on_header_end with no data
DEBUG:multipart.multipart:Calling on_header_field with data[131:143]
DEBUG:multipart.multipart:Calling on_header_value with data[145:155]
DEBUG:multipart.multipart:Calling on_header_end with no data
DEBUG:multipart.multipart:Calling on_headers_finished with no data
DEBUG:multipart.multipart:Calling on_part_data with data[159:8760]
DEBUG:multipart.multipart:Calling on_part_data with data[0:83220]
DEBUG:multipart.multipart:Calling on_part_data with data[0:11680]
DEBUG:multipart.multipart:Calling on_part_data with data[0:7300]
DEBUG:multipart.multipart:Calling on_part_data with data[0:66020]
DEBUG:multipart.multipart:Calling on_part_data with data[0:10220]
DEBUG:multipart.multipart:Calling on_part_data with data[0:13140]
DEBUG:multipart.multipart:Calling on_part_data with data[0:24820]
DEBUG:multipart.multipart:Calling on_part_data with data[0:74132]
DEBUG:multipart.multipart:Calling on_part_data with data[0:44448]
DEBUG:multipart.multipart:Calling on_part_data with data[0:30984]
DEBUG:multipart.multipart:Calling on_part_data with data[0:37960]
DEBUG:multipart.multipart:Calling on_part_data with data[0:50448]
DEBUG:multipart.multipart:Calling on_part_data with data[0:80300]
DEBUG:multipart.multipart:Calling on_part_data with data[0:49640]
DEBUG:multipart.multipart:Calling on_part_data with data[0:64240]
DEBUG:multipart.multipart:Calling on_part_data with data[0:56940]
DEBUG:multipart.multipart:Calling on_part_data with data[0:78840]
DEBUG:multipart.multipart:Calling on_part_data with data[0:75920]
DEBUG:multipart.multipart:Calling on_part_data with data[0:84180]
DEBUG:multipart.multipart:Calling on_part_data with data[0:76568]
DEBUG:multipart.multipart:Calling on_part_data with data[0:84680]
DEBUG:multipart.multipart:Calling on_part_data with data[0:87432]
DEBUG:multipart.multipart:Calling on_part_data with data[0:66996]
DEBUG:multipart.multipart:Calling on_part_data with data[0:102684]
DEBUG:multipart.multipart:Calling on_part_data with data[0:74620]
DEBUG:multipart.multipart:Calling on_part_data with data[0:72836]
DEBUG:multipart.multipart:Calling on_part_data with data[0:86624]
DEBUG:multipart.multipart:Calling on_part_data with data[0:80136]
DEBUG:multipart.multipart:Calling on_part_data with data[0:87760]
DEBUG:multipart.multipart:Calling on_part_data with data[0:81920]
DEBUG:multipart.multipart:Calling on_part_data with data[0:98792]
DEBUG:multipart.multipart:Calling on_part_data with data[0:86140]
DEBUG:multipart.multipart:Calling on_part_data with data[0:52560]
DEBUG:multipart.multipart:Calling on_part_data with data[0:21900]
DEBUG:multipart.multipart:Calling on_part_data with data[0:64240]
DEBUG:multipart.multipart:Calling on_part_data with data[0:62780]
DEBUG:multipart.multipart:Calling on_part_data with data[0:70080]
DEBUG:multipart.multipart:Calling on_part_data with data[0:49640]
DEBUG:multipart.multipart:Calling on_part_data with data[0:9496]
DEBUG:multipart.multipart:Calling on_part_end with no data
DEBUG:multipart.multipart:Calling on_end with no data
DEBUG:matplotlib.pyplot:Loaded backend tkagg version 8.6.
DEBUG:matplotlib.pyplot:Loaded backend agg version v2.2.
DEBUG:matplotlib.pyplot:Loaded backend tkagg version 8.6.
Traceback (most recent call last):
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/queueing.py", line 622, in process_events
    response = await route_utils.call_process_api(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/blocks.py", line 2012, in process_api
    inputs = await self.preprocess_data(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1711, in preprocess_data
    processed_input.append(block.preprocess(inputs_cached))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/components/file.py", line 163, in preprocess
    return self._process_single_file(payload)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/fjh/Downloads/tldw/tldw/venv/lib/python3.11/site-packages/gradio/components/file.py", line 132, in _process_single_file
    raise Error(
gradio.exceptions.Error: "Invalid file type. Please upload a file that is one of these formats: ['audio/*']"

rmusser01 commented 3 weeks ago

Are you using the latest GitHub version from main? Also, is ‘ffmpeg.exe’ in the ‘\Bin’ folder?

jcl2023 commented 3 weeks ago

Yes, I am using the latest GitHub version from main. I don't have ffmpeg.exe or a Bin directory because my system is Ubuntu.
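
On Linux there is no ffmpeg.exe or Bin folder to look for, so the equivalent check is whether an ffmpeg binary is reachable on PATH. A minimal sketch of that check follows. It assumes (this is not confirmed anywhere in the thread) that a system-wide ffmpeg would be the one used on non-Windows systems, and `find_executable` is a hypothetical helper name.

```python
import shutil


def find_executable(name="ffmpeg"):
    """Hypothetical helper: return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("ffmpeg was not found on PATH")
    else:
        print("ffmpeg found at:", path)
```

Since the upload is rejected before any audio processing happens, a missing ffmpeg may well be unrelated to this particular error, but the check rules it in or out quickly.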

jcl2023 commented 3 weeks ago

Transcribing a YouTube video works fine, though.

rmusser01 commented 2 weeks ago

An update on this: I am still unable to replicate the issue. I downloaded and set up an Xubuntu 24.04 VM, installed tldw, and set it to run on the CPU. It definitely chokes when running on the CPU, but it does successfully process the sample mp3 I created (see screenshot). I did encounter an odd error when attempting to transcribe an ogg file, though I believe that is unrelated to the issue you're encountering.

Can you please run the command cat /etc/*release and paste the results? Results from running it on my VM:

xubuntu@xubuntu-desktop:~/Working/tldw$ cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.1 LTS"
PRETTY_NAME="Ubuntu 24.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.1 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo

rmusser01 commented 6 days ago

@jcl2023 Following up on this, have you encountered this error since?