openai / openai-python

The official Python library for the OpenAI API
https://pypi.org/project/openai/
Apache License 2.0

Prompt only works for the first audio #300

Closed uopsdod closed 8 months ago

uopsdod commented 1 year ago

Describe the bug

For some reason, the prompt only takes effect for the first audio file when I call the API from Python.

for audio_file in sound_track:
  print('prompt_for_whisperer: ' + prompt_for_whisperer)
  transcript = openai.Audio.transcribe(
      "whisper-1",
      audio_file,
      response_format = 'srt',
      prompt = prompt_for_whisperer)

  transcripts_ary.append(transcript)
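
A possible workaround, offered as a sketch and not a confirmed fix: OpenAI's speech-to-text guide describes `prompt` as context that steers vocabulary and style, and suggests passing the transcript of the preceding segment as the prompt when a file has been split. Assuming that applies here, the loop could build a new prompt per segment instead of reusing one string. The helper name `build_segment_prompt` and the `max_chars` cutoff are hypothetical and not part of the `openai` library.

```python
def build_segment_prompt(keywords: str, previous_text: str = "", max_chars: int = 600) -> str:
    """Combine fixed keywords with the tail of the previous segment's transcript.

    `max_chars` is an assumed, arbitrary cutoff that keeps the prompt short;
    it is not a limit defined by the API.
    """
    # Guard against max_chars == 0, because text[-0:] would return the whole string.
    tail = previous_text.strip()[-max_chars:] if max_chars > 0 else ""
    return f"{keywords} {tail}".strip()


# Example: the first segment has no history, later segments reuse earlier text.
first = build_segment_prompt("AZ, ASG, EC2, ELB, Target Group.")
later = build_segment_prompt("AZ, ASG, EC2, ELB, Target Group.", "the load balancer routes traffic")
```

In the loop, each call would pass `prompt=build_segment_prompt(keywords, previous_text)`, with `previous_text` updated from the plain text of the transcript just returned. With `response_format='srt'`, the cue text would have to be extracted from the SRT string first.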

To Reproduce

The whole script is included below to reproduce the issue. I told Whisper to limit each transcript to 20 words, but that works for the first audio segment only.

Code snippets

#@title Install required packages (yt-dlp, OpenAI API, Pydub)
! pip install --upgrade pip
! pip install yt-dlp
! pip install openai
! pip install pydub
#@title Download the YouTube video
import yt_dlp

#@markdown ### YouTube link:
url = 'https://youtu.be/VZD5iLl0E_E' #@param {type:"string"} 

# Fetch the video title
with yt_dlp.YoutubeDL() as ydl:
  info_dict = ydl.extract_info(url, download=False)
  video_title = info_dict.get('title', None)

filename = video_title

# Build the prompt; each sentence ends with a space so they do not run together
prompt_for_whisperer = (
    "This is about %s. "
    "Consider its title in your response: %s. "
    "Some keywords you should expect: %s. "
    "Additional info: %s. "
    "Additional info: %s."
) % (
    'AWS Elastic Load Balancer',
    filename,
    'AZ, ASG, EC2, ELB, Target Group',
    # the instructions below are not applied to every segment
    'Add one space before and after every English word, such as AZ, ALB, Target Group',
    'For every transcript, limit it to 20 words, but do not cut a sentence in the middle',
)

# Set the download options
ydl_opts = {
    'format': 'bestaudio/best',
    'outtmpl': filename , 
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'mp3',
        'preferredquality': '192',
    }],
}

# Create the yt_dlp downloader object
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download([url])
#@title Split the YouTube audio
from pydub import AudioSegment

#@markdown ### Segment length (in milliseconds):
segment_length_ms = 60000 #@param {type:"integer"} 
segment_length_s = segment_length_ms/1000

# Load the MP3 audio file
sound = AudioSegment.from_file(f'{filename}.mp3', format='mp3')

sound_track = []
# Split the audio into multiple files
for i, chunk in enumerate(sound[::segment_length_ms]):
    # Export each segment under its own filename
    chunk.export(f'output_{i}.mp3', format='mp3')
    audio_file = open(f'output_{i}.mp3', "rb")
    sound_track.append(audio_file)

import openai
#@markdown ### Enter your OpenAI API secret key:
openai.api_key = '' #@param {type:"string"}

transcripts_ary = []
for audio_file in sound_track:
  print('prompt_for_whisperer: ' + prompt_for_whisperer)
  transcript = openai.Audio.transcribe(
      "whisper-1", 
      audio_file, 
      response_format = 'srt',
      prompt = prompt_for_whisperer)

  transcripts_ary.append(transcript)
# debug
for transcript in transcripts_ary:
  print('transcript: ' , transcript)
! pip install pysrt
import pysrt

# Convert the transcripts to subtitle objects
subtitles = []
for transcript in transcripts_ary:
  subtitle = pysrt.from_string(transcript)
  subtitles.append(subtitle)

# Clamp timestamps that run past the end of the segment
for subtitle in subtitles:
  max_time = pysrt.SubRipTime(seconds = segment_length_s)
  for sub in subtitle:
    sub.start = sub.start if sub.start < max_time else max_time
    sub.end = sub.end if sub.end < max_time else max_time

# Shift subtitle times so consecutive segments line up
shift_time_s = 0
for subtitle in subtitles:
  shift_time = pysrt.SubRipTime(seconds = shift_time_s)
  print("shift_time: ", shift_time)
  for sub in subtitle:
    sub.start = sub.start + shift_time
    sub.end = sub.end + shift_time    
  shift_time_s = shift_time_s + segment_length_s

# Merge everything
subtitle_merged = pysrt.from_string('')
for subtitle in subtitles:
  subtitle_merged.extend(subtitle)

# Save to a file
subtitle_merged.save(filename + '.srt')
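
If the 20-word limit cannot be obtained reliably through the prompt, one alternative is to enforce it on the client after transcription. The following is a minimal sketch under that assumption: it splits plain caption text into chunks of at most `max_words` words and breaks at sentence ends where it can. `split_caption` is a hypothetical helper, not part of `pysrt` or `openai`, and it does not recompute subtitle timings.

```python
import re
from typing import List


def split_caption(text: str, max_words: int = 20) -> List[str]:
    """Split text into chunks of at most max_words words, preferring sentence boundaries."""
    # Split after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        words = sentence.split()
        if not words:
            continue
        # Keep adding whole sentences while the chunk stays within the limit.
        if len(current) + len(words) <= max_words:
            current.extend(words)
            continue
        if current:
            chunks.append(" ".join(current))
            current = []
        # A single sentence longer than max_words has to be cut mid-sentence.
        while len(words) > max_words:
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        current = words
    if current:
        chunks.append(" ".join(current))
    return chunks


parts = split_caption("One two three. Four five six seven. Eight.", max_words=4)
```

Each resulting chunk would still need its own start and end time, for example by dividing the original cue's duration in proportion to word count, which is left out here.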

OS

macOS

Python version

Python v3.7

Library version

openai-python v0.26.4

rattrayalex commented 8 months ago

Closing as stale; if the issue persists on the latest version of the SDK, please let us know.