Status: Open. nzomkxia opened this issue 5 months ago.
This is a known issue; the engineering team is working on improving it.
Has this issue been resolved?
I'm experiencing a similar problem here. Gemini 1.5 Pro fails to transcribe the entire file (it transcribes only the first 9 minutes of 73):
import os
import pathlib
import time

import google.generativeai as genai

# genai.configure(api_key=os.environ["API_KEY"])
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
# Initialize a Gemini model appropriate for your use case.
model = genai.GenerativeModel('models/gemini-1.5-pro-002')
media = pathlib.Path(__file__).parents[0] / "audio_files"
print(f"{media=}")
print("uploading file")
start = time.time()  # start the upload timer (this line was missing from the snippet as posted)
myfile = genai.upload_file(media / "simon_willison.mp3", mime_type="audio/mpeg")
print(f"{myfile=}")
stop = time.time()
elapsed = stop - start
print(f"Time to upload file: {elapsed:.2f} seconds")
# Time to upload file: 37.15 seconds
start = time.time()  # restart the timer for the transcription step
fout = open("simon_willison_transcript.txt", "w")
# Note: this replaces the gemini-1.5-pro-002 model created above.
model = genai.GenerativeModel("gemini-1.5-flash")
model.generation_config = {
    "temperature": 0.5,
    "top_p": 0.95,
    "top_k": 40,
    "max_output_tokens": 500000,
    "response_mime_type": "text/plain",
    "audio_timestamp": True,
}
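One possible workaround, sketched here under the assumption that the truncation comes from a cap on how much text a single response can contain (this thread does not confirm that), is to request the transcript one time window at a time instead of all 73 minutes at once. The helpers `time_windows`, `mmss`, and `window_prompt` are hypothetical names introduced for illustration, and the 5-minute window size is arbitrary.

```python
def time_windows(total_seconds, window_seconds):
    """Return (start, end) pairs in seconds that together cover the whole audio."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    windows = []
    start = 0
    while start < total_seconds:
        end = min(start + window_seconds, total_seconds)
        windows.append((start, end))
        start = end
    return windows


def mmss(seconds):
    """Format a number of seconds as MM:SS (minutes may exceed 59)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def window_prompt(start, end):
    """Build a prompt asking for the transcript of one time window only."""
    return f"Transcribe the audio from {mmss(start)} to {mmss(end)} only."


# 73 minutes of audio requested in 5-minute windows (both values are illustrative).
prompts = [window_prompt(s, e) for s, e in time_windows(73 * 60, 5 * 60)]

# Each prompt would then be sent together with the uploaded file, for example:
#     response = model.generate_content([myfile, prompt])
#     fout.write(response.text)
```

Whether the model honors the requested time range reliably is untested here, so spot-check the window boundaries against the audio before relying on the stitched transcript.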
Description of the bug:
result=glm.GenerateContentResponse({'candidates': [{'finish_reason': 4, 'index': 0, 'safety_ratings': [], 'token_count': 0, 'grounding_attributions': []}]})
Actual vs expected behavior:
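Expected: the audio is transcribed correctly. Actual: the response contains no text (finish_reason 4 and token_count 0, as shown in the bug description).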
Any other information you'd like to share?
model: gemini-1.5-pro-latest
audio length: 53 min
audio format: mp3
audio file size: 13 MB
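To make the response in the bug description easier to read, here is a small sketch that turns the numeric `finish_reason` into a name. The mapping reflects my understanding of the API's `FinishReason` enum for values 0 through 5, so treat it as an assumption and check it against the current API reference. `summarize_candidates` is a hypothetical helper that works on a plain dict shaped like the one reported.

```python
# Assumed mapping of finish_reason codes to names (values 0-5 only).
FINISH_REASON_NAMES = {
    0: "FINISH_REASON_UNSPECIFIED",
    1: "STOP",
    2: "MAX_TOKENS",
    3: "SAFETY",
    4: "RECITATION",
    5: "OTHER",
}


def summarize_candidates(response):
    """Return one human-readable line per candidate in a response dict."""
    lines = []
    for cand in response.get("candidates", []):
        code = cand.get("finish_reason", 0)
        name = FINISH_REASON_NAMES.get(code, f"UNKNOWN({code})")
        lines.append(
            f"candidate {cand.get('index', 0)}: "
            f"finish_reason={name}, token_count={cand.get('token_count', 0)}"
        )
    return lines


# The response dict exactly as reported in the bug description.
reported = {
    "candidates": [
        {
            "finish_reason": 4,
            "index": 0,
            "safety_ratings": [],
            "token_count": 0,
            "grounding_attributions": [],
        }
    ]
}

for line in summarize_candidates(reported):
    print(line)
```

Under that mapping, the reported candidate stopped for the recitation reason and produced no tokens, which would point to the response being blocked and not to a decoding failure of the mp3 itself.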