TomFrankly / pipedream-notion-voice-notes

Take notes with your voice and send them to Notion

[Suggestion] Improve error message for short empty audio clips #59

Open tdnzr opened 9 months ago

tdnzr commented 9 months ago

Describe the bug Uploading a very short, textless sound file (e.g. under 4 seconds) causes an unclear Pipedream error. (See logs below.)

I strongly suspect the error occurs because at some point (perhaps in Whisper's API response, perhaps in ChatGPT's response) the data becomes null or empty, and the workflow code doesn't explicitly handle that case. For example, the logs below show that "title" is "undefined".
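For a minimal illustration of that failure mode (hypothetical code, not the actual contents of Notion-Voice-Notes.mjs): when ChatGPT is sent zero chunks and the summary array comes back empty, reading a property such as `model` from its nonexistent first element throws exactly the TypeError reported below.

```js
// Hypothetical illustration only, not the actual workflow code.
const summaryArray = []; // ChatGPT was sent 0 chunks, so nothing came back

// summaryArray[0] is undefined, so the property access throws:
// TypeError: Cannot read properties of undefined (reading 'model')
const model = summaryArray[0].model;
```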

Desired behavior: Make the error message clearer. For example: "Error: sound file too short"; "Error: sound file Name of length Length contained no text"; or even mark the workflow as "finished" without producing a transcript.
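Here is a rough sketch of the kind of guard I have in mind, assuming a Pipedream Node.js component where `$.flow.exit()` is available; `whisperChunks`, `fileName`, and `duration` are placeholder names, not the real variables in Notion-Voice-Notes.mjs.

```js
// Sketch only: placeholder names, not the actual Notion-Voice-Notes.mjs code.
// (Imagined as living inside the component's async run({ $ }) handler,
// right after the Whisper transcription step.)
const fullTranscript = whisperChunks
  .map((chunk) => chunk?.data?.text ?? "")
  .join(" ")
  .trim();

if (fullTranscript.length === 0) {
  // End the workflow early with a clear, human-readable reason
  // instead of letting later steps crash on empty data.
  return $.flow.exit(
    `Sound file "${fileName}" (${duration} seconds) contained no transcribable text.`
  );
}
```

Throwing a descriptive Error here would work just as well; the point is that the message names the file and the reason instead of surfacing a bare TypeError.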

How to reproduce The type of audio file doesn't seem to matter, as long as it's very short and contains no speech. It might be easier to reproduce this error yourself than to comb through my logs below.
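If it helps, a short silent clip can be generated with ffmpeg (assuming ffmpeg is installed); any similarly short, speech-free recording should behave the same way:

```sh
ffmpeg -f lavfi -i anullsrc=r=44100:cl=mono -t 2 -q:a 9 silence.mp3
```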

Which cloud storage app are you using? (Google Drive, Dropbox, or OneDrive) Google Drive

Have you tried updating your workflow? The workflow is up-to-date as of today.

Does the issue only happen while testing the workflow, or does it happen during normal, automated runs? The issue happens during normal, automated runs.

Please paste the contents of your Logs tab from the notion_voice_notes action step. Email from Pipedream:

[pipedream] Error in workflow Notion Voice Notes (Google Drive).. Your workflow generated the following error at... TypeError — Cannot read properties of undefined (reading 'model')

Error message at the top of the Pipedream event page for the file that caused the error:

TypeError
Cannot read properties of undefined (reading 'model') 
    at Object.run (file:///pipedream/dist/code/f124a850c3f703097b5494bd1efc77777c2efa4db1b6462c7952f47e82001a2f/code/Notion-Voice-Notes.mjs:2076:24)
    at null.executeComponent (/var/task/launch_worker.js:267:22)
    at MessagePort.messageHandler (/var/task/launch_worker.js:764:28)

(notion_voice_notes)
details
Message: Cannot read properties of undefined (reading 'model')
Total Duration: 11,403 ms
Compute Time: 11,088 ms
Execution Start: 2023-12-31T08:43:37.706Z
Execution End: 2023-12-31T08:43:49.109Z
Steps Executed: 2 / 2
credits: 2
Version: 63 (d_xRsagPy1)

Full Logs:

12/31/2023, 9:43:47 AM

Checking that file is under 300mb...
12/31/2023, 9:43:47 AM

File size is approximately 0.1mb.
12/31/2023, 9:43:47 AM

File is under the size limit. Continuing...
12/31/2023, 9:43:47 AM

Checking if the user set languages...
12/31/2023, 9:43:47 AM

No language set. Whisper will attempt to detect the language.
12/31/2023, 9:43:47 AM

Successfully got duration: 2 seconds
12/31/2023, 9:43:47 AM

Chunking file: /tmp/2023-12-31_Recording_69.mp3
12/31/2023, 9:43:47 AM

Full file size: 0.08081340789794922mb. Chunk size: 24mb. Expected number of chunks: 1. Commencing chunking...
12/31/2023, 9:43:47 AM

Created 1 chunk: /tmp/chunks-2aIhtpRrptcyO062ZIRqywY87U3/chunk-000.mp3
12/31/2023, 9:43:47 AM

Chunks created successfully. Transcribing chunks: chunk-000.mp3
12/31/2023, 9:43:47 AM

Transcribing file: chunk-000.mp3
12/31/2023, 9:43:49 AM

Received response from OpenAI Whisper endpoint for chunk-000.mp3. Your API key's current Audio endpoing limits (learn more at https://platform.openai.com/docs/guides/rate-limits/overview):
12/31/2023, 9:43:49 AM

┌────────────────────────┬─────────┐
│ (index)                │ Values  │
├────────────────────────┼─────────┤
│ requestRate            │ '100'   │
│ tokenRate              │ null    │
│ remainingRequests      │ '99'    │
│ remainingTokens        │ null    │
│ rateResetTimeRemaining │ '600ms' │
│ tokenRestTimeRemaining │ null    │
└────────────────────────┴─────────┘
12/31/2023, 9:43:49 AM

Whisper chunks array:
12/31/2023, 9:43:49 AM

[ { data: { text: '' }, response: Response { size: 0, timeout: 0, [Symbol(Body internals)]: { body: PassThrough { _readableState: ReadableState { objectMode: false, highWaterMark: 16384, buffer: BufferList { head: null, tail: null, length: 0 }, length: 0, pipes: [], flowing: true, ended: true, endEmitted: true, reading: false, constructed: true, sync: false, needReadable: false, emittedReadable: false, readableListening: false, resumeScheduled: false, errorEmitted: false, emitClose: true, autoDestroy: true, destroyed: true, errored: null, closed: true, closeEmitted: true, defaultEncoding: 'utf8', awaitDrainWriters: null, multiAwaitDrain: false, readingMore: false, dataEmitted: true, decoder: null, encoding: null, [Symbol(kPaused)]: false }, _events: [Object: null prototype] { prefinish: [Function: prefinish], error: [ [Function (anonymous)], [Function (anonymous)] ], data: [Function (anonymous)], end: [Function (anonymous)] }, _eventsCount: 4, _maxListeners: undefined, _writableState: WritableState { objectMode: false, highWaterMark: 16384, finalCalled: true, needDrain: false, ending: true, ended: true, finished: true, destroyed: true, decodeStrings: true, defaultEncoding: 'utf8', length: 0, writing: false, corked: 0, sync: false, bufferProcessing: false, onwrite: [Function: bound onwrite], writecb: null, writelen: 0, afterWriteTickInfo: null, buffered: [], bufferedIndex: 0, allBuffers: true, allNoop: true, pendingcb: 0, constructed: true, prefinished: true, errorEmitted: false, emitClose: true, autoDestroy: true, errored: null, closed: true, closeEmitted: true, [Symbol(kOnFinished)]: [] }, allowHalfOpen: true, [Symbol(kCapture)]: false, [Symbol(kCallback)]: null }, disturbed: true, error: null }, [Symbol(Response internals)]: { url: 'https://api.openai.com/v1/audio/transcriptions', status: 200, statusText: 'OK', headers: Headers { [Symbol(map)]: [Object: null prototype] { date: [ 'Sun, 31 Dec 2023 08:43:48 GMT' ], 'content-type': [ 'application/json' ], 'content-length': [ '16' ], connection: [ 'keep-alive' ], 'openai-organization': [ 'user-xvebicyx1rieg1puupsjgc1i' ], 'openai-processing-ms': [ '1316' ], 'openai-version': [ '2020-10-01' ], 'strict-transport-security': [ 'max-age=15724800; includeSubDomains' ], 'x-ratelimit-limit-requests': [ '100' ], 'x-ratelimit-remaining-requests': [ '99' ], 'x-ratelimit-reset-requests': [ '600ms' ], 'x-request-id': [ 'ca2ed2543179f18b96a0b4d9f6b8e70d' ], 'cf-cache-status': [ 'DYNAMIC' ], 'set-cookie': [ '__cf_bm=MPexEIlSMUN1COf3IJPrOAN1NyUpsXsP4x.udm5G7sA-1704012228-1-AU9j6Ih7Vtkyj6GlOSrR3xUE/Bm8yhNFkHkvcBy/TGtlhNUyjr0R6UVL/nHPFq/6TOg5a2fCgy6pNFix/h8kKiA=; path=/; expires=Sun, 31-Dec-23 09:13:48 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None', '_cfuvid=mY7sbfcQVPKceyC2E9uP3jZAqiX3kD5j.hrUX.OoP44-1704012228861-0-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None' ], server: [ 'cloudflare' ], 'cf-ray': [ '83e13ca458235af9-IAD' ], 'alt-svc': [ 'h3=":443"; ma=86400' ] } }, counter: 0 } } } ]
12/31/2023, 9:43:49 AM

Attempting to clean up the /tmp/ directory...
12/31/2023, 9:43:49 AM

Cleaning up /tmp/chunks-2aIhtpRrptcyO062ZIRqywY87U3...
12/31/2023, 9:43:49 AM

Using the gpt-3.5-turbo model.
12/31/2023, 9:43:49 AM

Max tokens per summary chunk: 2750
12/31/2023, 9:43:49 AM

Combining 1 transcript chunks into a single transcript...
12/31/2023, 9:43:49 AM

Transcript combined successfully.
12/31/2023, 9:43:49 AM

Longest period gap info: { "longestGap": -1, "longestGapText": "No period found" }
12/31/2023, 9:43:49 AM

Initiating moderation check on the transcript.
12/31/2023, 9:43:49 AM

Detected language with franc library: und
12/31/2023, 9:43:49 AM

Detected language is Chinese or undetermined, splitting by punctuation...
12/31/2023, 9:43:49 AM

Converting the transcript to paragraphs...
12/31/2023, 9:43:49 AM

Number of sentences before paragraph grouping: 0
12/31/2023, 9:43:49 AM

Number of paragraphs after grouping: 0
12/31/2023, 9:43:49 AM

Limiting paragraphs to 1800 characters...
12/31/2023, 9:43:49 AM

Transcript split into 0 chunks. Moderation check is most accurate on chunks of 2,000 characters or less. Moderation check will be performed on each chunk.
12/31/2023, 9:43:49 AM

Moderation check completed successfully. No abusive content detected.
12/31/2023, 9:43:49 AM

Full transcript is 0 tokens. If you run into rate-limit errors and are currently using free trial credit from OpenAI, please note the Tokens Per Minute (TPM) limits: https://platform.openai.com/docs/guides/rate-limits/what-are-the-rate-limits-for-our-api
12/31/2023, 9:43:49 AM

Splitting transcript into chunks of 2750 tokens...
12/31/2023, 9:43:49 AM

Split transcript into 0 chunks.
12/31/2023, 9:43:49 AM

Sending 0 chunks to ChatGPT...
12/31/2023, 9:43:49 AM

Summary array from ChatGPT:
12/31/2023, 9:43:49 AM

[]
12/31/2023, 9:43:49 AM

Formatting the ChatGPT results...
12/31/2023, 9:43:49 AM

ChatResponse object after ChatGPT items have been inserted:
12/31/2023, 9:43:49 AM

{ title: undefined, sentiment: undefined, summary: [], main_points: [], action_items: [], stories: [], references: [], arguments: [], follow_up: [], related_topics: [], usageArray: [] }
12/31/2023, 9:43:49 AM

Filtering Related Topics, if any exist:
12/31/2023, 9:43:49 AM

Final ChatResponse object:
12/31/2023, 9:43:49 AM

{ title: undefined, summary: '', main_points: [], action_items: [], stories: [], references: [], arguments: [], follow_up: [], tokens: 0 }
12/31/2023, 9:43:49 AM

Detected language with franc library: und
12/31/2023, 9:43:49 AM

Detected language is Chinese or undetermined, splitting by punctuation...
12/31/2023, 9:43:49 AM

Converting the transcript to paragraphs...
12/31/2023, 9:43:49 AM

Number of sentences before paragraph grouping: 0
12/31/2023, 9:43:49 AM

Number of paragraphs after grouping: 0
12/31/2023, 9:43:49 AM

Limiting paragraphs to 1200 characters...
12/31/2023, 9:43:49 AM

Detected language with franc library: und
12/31/2023, 9:43:49 AM

Detected language is Chinese or undetermined, splitting by punctuation...
12/31/2023, 9:43:49 AM

Converting the transcript to paragraphs...
12/31/2023, 9:43:49 AM

Number of sentences before paragraph grouping: 0
12/31/2023, 9:43:49 AM

Number of paragraphs after grouping: 0
12/31/2023, 9:43:49 AM

Limiting paragraphs to 1200 characters...
12/31/2023, 9:43:49 AM

Calculating the cost of the transcript...
12/31/2023, 9:43:49 AM

Transcript cost: $0.000
12/31/2023, 9:43:49 AM

Total tokens used in the summary process: 0 prompt tokens and 0 completion tokens.