ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License
34.3k stars 3.48k forks

After running for a period of time, repeatedly output the same sentence #1853

Open dfengpo opened 7 months ago

dfengpo commented 7 months ago

I'm using the large-v3 model. After running for a while, it outputs the same sentence over and over (roughly "Please like, subscribe, share, and tip to support the Mingjing and Diandian programs"), like this:

00:00:00->00:00:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:00:29->00:00:59:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:00:59->00:01:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:01:29->00:01:59:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:01:59->00:02:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:02:29->00:02:59:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:02:59->00:03:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:03:29->00:03:59:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:03:59->00:04:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:04:29->00:04:59:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:04:59->00:05:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:05:29->00:05:59:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:05:59->00:06:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:06:29->00:06:59:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目
00:06:59->00:07:29:请不吝点赞 订阅 转发 打赏支持明镜与点点栏目

Are there any parameters that can be set?

bobqianic commented 7 months ago

Could you please send over the sample audio?

dfengpo commented 7 months ago

I can't upload the MP3 file. After I adjusted the prompt, the audio was transcribed in full. It's just very unstable: this can happen when transcribing continuously for a long time.

ghost commented 6 months ago

I ran into the same problem. Built from the main branch (b602819), with an audio file longer than 10 minutes, the output starts repeating after a certain amount of time:

./main -m models/ggml-large-v3.bin -f ~/Movies/be-rich-books.wav -nt -otxt ~/Movies/
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-large-v3.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51866
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1280
whisper_model_load: n_text_head   = 20
whisper_model_load: n_text_layer  = 32
whisper_model_load: n_mels        = 128
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 5 (large v3)
whisper_model_load: adding 1609 extra tokens
whisper_model_load: n_langs       = 100
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M1
ggml_metal_init: picking default device: Apple M1
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/repo/ai-playgroud/whisper.cpp/ggml-metal.metal'
ggml_metal_init: GPU name:   Apple M1
ggml_metal_init: GPU family: MTLGPUFamilyApple7  (1007)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction support   = true
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 11453.25 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =  2951.02 MiB, ( 2952.89 / 10922.67)
whisper_model_load:    Metal total size =  3094.36 MB
whisper_model_load: model size    = 3094.36 MB
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M1
ggml_metal_init: picking default device: Apple M1
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/Users/repo/ai-playgroud/whisper.cpp/ggml-metal.metal'
ggml_metal_init: GPU name:   Apple M1
ggml_metal_init: GPU family: MTLGPUFamilyApple7  (1007)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction support   = true
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 11453.25 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   210.00 MiB, ( 3162.89 / 10922.67)
whisper_init_state: kv self size  =  220.20 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   234.38 MiB, ( 3397.27 / 10922.67)
whisper_init_state: kv cross size =  245.76 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =    32.97 MiB, ( 3430.23 / 10922.67)
whisper_init_state: compute buffer (conv)   =   36.26 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   889.44 MiB, ( 4319.67 / 10922.67)
whisper_init_state: compute buffer (encode) =  934.34 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =     7.33 MiB, ( 4327.00 / 10922.67)
whisper_init_state: compute buffer (cross)  =    9.38 MB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   197.95 MiB, ( 4524.95 / 10922.67)
whisper_init_state: compute buffer (decode) =  209.26 MB

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 |

main: processing '/Users/Movies/be-rich-books.wav' (18582651 samples, 1161.4 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 0 ...

 - So you wanna become a millionaire, and everyone is telling you to read books, but hardly anyone's got time for that. Luckily, over my many years in business, I've read hundreds of books, but there are 40 money-making books that have really stuck with me. So in this video, I'm gonna condense all of these books into the core lessons that will actually make you rich. And no, I'm not just saying this, I have experience doing it. I've gone from absolute zero to making tens of millions, and I didn't make my money on YouTube like most people on here. I did it running real businesses. So I promise that the things I'm gonna share with you today will take you from $0 to $100,000, 100,000 to a million, and a million to $10 million, if you apply them correctly. Level one, going from zero to $100,000. First, you need to master your mind. I didn't come from a wealthy background, and during my youth, I vividly remember people saying the phrase, money is the root of all evil. And the word money, in my mind, explains how these phrases program us to believe that rich people are the devil. It took me a long time to rid myself of the negative feeling in my stomach when I thought about making money. If I'd have read this book sooner, it would have helped me realize this issue earlier. Once I got over this hurdle, I had to ask myself, what is money actually for? For me, the answer was freedom. The psychology of money puts it a little differently and says it's all about control. But I suppose the goal is the same, whenever you want. It also touches on how there's a clear difference between the rich and the wealthy. Wealth is what you don't see. The car's not purchased, diamonds not bought, renovations postponed, clothes for gone, and first class upgrades declined. Unfortunately, many people don't even believe that they can become wealthy. I remember my school friends saying things like, I hope I can get an okay job when I graduate. They weren't even hoping for a good one. 
They were heading their finish line too soon and therefore never have a shot at being successful. While I don't believe that thinking big alone will make you successful, it's certainly the first step. I always struggled in school and felt like a failure. Interestingly, when I left and started working in the real world, I found that I kept winning. This was probably because I chose jobs I had an interest in rather than learning things I didn't care about at school. "The Winner Effect" talks about actual studies to back up this phenomenon. A positive correlation has been found between successful stock market traders and their testosterone levels. Furthermore, winning increases the testosterone receptors in your brain, which causes you to win more in the future. That's why it's important to set those achievable goals and get those little wins every day. I believe that my struggles in school and my strong drive for success had a positive impact on my life. They left me with no other option but to push forward. "Think and Grow Rich" tells the story about a great commander. He was often outnumbered by a far superior enemy. He burned his army's own ships, leaving them with no escape plan. This forced his soldiers to fight with everything they had, resulting in a miraculous victory against all the odds. It just shows there's a big difference between wanting to succeed and needing to. Being comfortable is just as harmful as having an escape plan. During a chat with Andrew Tate, he referred to this as one of the Matrix's tools. "Unscripted" also presents this idea. The author argues that society aims to transform us into model citizens, which stands for mediocre, obedient, dependent, entertained, and lifeless. Avoiding this fate depends on how well you can manage stress. The essence of success gives us a helpful perspective on this. The book mentions that all the water that forms a fog that's 100 feet deep and seven city blocks wide can fit into a small glass. 
Whenever I feel stressed, I like to imagine drinking this water and using it as fuel. Now you can develop great habits. Stress can also come from factors outside our control. For this reason, I've made a conscious effort throughout my life to only focus on things that I can directly impact. "Atomic Habits" reveals that many people fixate on the end goal, which they have no control over, rather than the process of achieving it, which is completely in their hands. I found the author's habit stacking technique to be particularly helpful for forming new habits. I'm not sure how many of you have experienced this, but I think it's a good idea to try it with an existing one, so both are completed at the same time. However, most people just keep putting things off and never form good habits. I know far too many people who just go through the motions without any direction, and before they know it, time has run out. "The Seven Habits of Highly Effective People" stresses the importance of always knowing where you're going and thinking about how you want to be remembered. I know this provides me with extra motivation, but this is only one part of the equation. "The Seven Habits of Highly Effective People" suggests treating a year as 12 weeks to create a sense of urgency. This really hit home for me, as I know I work best under pressure. Although short-term pain is often necessary for long-term gain, it can be difficult to remember this when faced with challenges. "The Art of Getting Things Done" emphasizes the importance of not relying on your brain to remember all your tasks. Instead, capture them on a phone or a notepad. This way, you can organize your tasks without getting them confused, and it's easy to remember how you spend your time. "Essentialism" reminds us that for every task we say yes to, we have to say no to many others. Therefore, it's important to value yes and not be afraid to say no. After adopting these habits, you should start building high-value skills. 
I've previously said that I don't like the advice, "Follow your passion." However, "So Good They Can't Ignore You" takes it one step further. The author interviewed many people who love their jobs. They have a passion for their work. Rather, as they improved, their love for their work grew. So what skills should you start building if you aren't necessarily following your passion? "The Unfair Advantage" suggests that to succeed in an unfair world, you need to play to your natural strengths, find what you're better at than most people, and lean into it. However, it's no longer enough to just possess high-value skills. Artificial intelligence is rapidly replacing many jobs. It's the same thing for various businesses to complete simple tasks. That's why the ideas shared in "Mastery" are more important than ever. My main takeaway was that you need to develop a unique stack of skills that is nearly impossible to replicate. "Still Like an Artist" echoes this point when it suggests to copy from multiple inspirations until you have a mix that is truly original. Having this unique skill stack will give you the ability to earn a high income. However, that still won't mean you can build wealth without understanding the basics. "The Unfair Advantage" suggests that to succeed in an unfair world, you need to play to your natural strengths, find what you're better at than most people, and lean into it. However, that still won't mean you can build wealth without understanding the basics. Having this unique skill stack will give you the ability to earn a high income. However, that still won't mean you can build wealth without understanding the basics. Having this unique skill stack will give you the ability to earn a high income. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. 
However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. 
However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. However, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. 
Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics. Therefore, that still won't mean you can build wealth without understanding the basics.

dfengpo commented 6 months ago

Did anyone manage to solve this in the end? I haven't found a solution yet.

dfengpo commented 6 months ago

Could you please send over the sample audio?

This is not a problem with the recording file. It can occur with any file from time to time.

klaygomes commented 6 months ago

I had the same issue using large-v3. I'm attaching the file I was trying.

( https://filetransfer.io/data-package/3ZVHl1jw#link )

0x70b1a5 commented 6 months ago

I'm also having this issue with large-v3. The transcription seems to exhibit this error at different times for different runs over the same audio file, so I am not inclined to think it is an issue with the file itself.

dfengpo commented 6 months ago

I'm also having this issue with large-v3. The transcription seems to exhibit this error at different times for different runs over the same audio file, so I am not inclined to think it is an issue with the file itself.

Yes, I agree with your point of view. Have you found a solution yet?

Could the author take a look at what the problem is and whether there is a way to solve it? @ggerganov

xochilpili commented 5 months ago

Facing the same issue... It's not the input file.

macklinhrw commented 5 months ago

I am getting the same issue as well. Interestingly, I also tested with the quantized large v3 and it works (but not highest quality output), and there is the same issue with the large v2 model. I have not tested large v1.

dfengpo commented 5 months ago

I am getting the same issue as well. Interestingly, I also tested with the quantized large v3 and it works (but not highest quality output), and there is the same issue with the large v2 model. I have not tested large v1.

Can you share the quantized large v3 model file with me?

macklinhrw commented 4 months ago

I am getting the same issue as well. Interestingly, I also tested with the quantized large v3 and it works (but not highest quality output), and there is the same issue with the large v2 model. I have not tested large v1.

Can you share the quantized large v3 model file with me?

When running the download script you can see the available models:

❯ ./models/download-ggml-model.sh
Usage: ./models/download-ggml-model.sh <model> [models_path]

Available models:
  tiny tiny.en tiny-q5_1 tiny.en-q5_1
  base base.en base-q5_1 base.en-q5_1
  small small.en small.en-tdrz small-q5_1 small.en-q5_1
  medium medium.en medium-q5_0 medium.en-q5_0
  large-v1 large-v2 large-v2-q5_0 large-v3 large-v3-q5_0

So you can run ./models/download-ggml-model.sh large-v3-q5_0 to download the 5-bit quantized large v3 model. I found I was getting better results from the medium model, though I'm not sure which is actually better.

d0rc commented 3 months ago

It happens on long audio, when using large-v3 models, not sure why.

brbrainerd commented 3 months ago

Wanted to add to this that the issue persists even after dividing long audio files into 30-second segments. It does prevent a single repetition from washing out the rest of the audio, however. The syntax I use is:

      # whisper.cpp only accepts 16 kHz WAV files:
      ffmpeg -y -loglevel info -i "$file" -f segment -segment_time 30 -c:a pcm_s16le -ar 16000 "$tempDir/chunk_%03d.wav"

      # Initialize the final VTT file
      echo "WEBVTT" > "$outputVtt"

      # Process each chunk
      chunk_offset=0
      for chunk in "$tempDir"/chunk_*.wav; do
        chunkVtt="${chunk%.*}"
        ~/whisper.cpp/main -m "$modelFile" -ovtt -l "$langCode" -tr -f "$chunk" -of "$chunkVtt"
        # (the step that merges each chunk's VTT into "$outputVtt" is not shown)
      done

tacoe commented 1 month ago

Same. The repetitions start around the 7 minute mark for a file that was 56 minutes in total. Switching to medium 'solved' it.

The OpenAI whisper forum mentions this could be due to the default setting for compression_ratio_threshold which is at 2.4, and lowering it a bit seems to fix this issue. However, as far as I can see this setting isn't exposed in this version.
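For context, here is a minimal sketch of what that check measures, assuming it works the way I understand the Python version to: the ratio of the UTF-8 byte length of the decoded text to its zlib-compressed length. The sample strings and the `THRESHOLD` name are only for illustration, and this is not code from whisper.cpp.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw UTF-8 size divided by zlib-compressed size; repetitive text scores high."""
    text_bytes = text.encode("utf-8")
    return len(text_bytes) / len(zlib.compress(text_bytes))


THRESHOLD = 2.4  # the default value quoted above

# A looped sentence like the output earlier in this thread, versus ordinary text.
sentence = "However, that still won't mean you can build wealth without understanding the basics. "
looped = sentence * 20
varied = "The repetitions start around the 7 minute mark for a file that was 56 minutes in total."

for label, text in (("looped", looped), ("varied", varied)):
    ratio = compression_ratio(text)
    print(f"{label}: ratio={ratio:.2f} above_threshold={ratio > THRESHOLD}")
```

A looped segment compresses very well, so its ratio lands far above the threshold, while a single ordinary sentence stays close to 1. That is why a lower threshold would flag repetition sooner, at the cost of also flagging legitimately repetitive speech.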

0x70b1a5 commented 1 month ago

@tacoe how much is "a bit" and how did you lower it? Just editing the source and rebuilding? 😅

tacoe commented 1 month ago

@0x70b1a5 You misunderstand (I admit my wording is confusing :). I didn't lower it: a post in a different repository (that this one was ported from) mentioned this approach works, but my point was this parameter isn't exposed in this version (the cpp-port).

bobqianic commented 1 month ago

It makes no sense to tweak hyperparameters. Checking the compression ratio is just a very rough fix using hardcoded rules. The model released by OpenAI doesn’t have any fine-tuning or PPO/DPO. It’s just a raw, large-scale pre-trained model.

bobqianic commented 1 month ago

Proximal Policy Optimization (PPO) - How to train Large Language Models https://www.youtube.com/watch?v=TjHH_--7l8g