souzatharsis / podcastfy

An Open Source alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI
https://www.podcastfy.ai
Apache License 2.0
848 stars, 93 forks

Mac Run Prob: "Couldn't build proto file into descriptor pool: duplicate symbol 'google.ai.generativelanguage.v1beta.__firstlineno__'" #85

Open RalphLosey opened 1 week ago

RalphLosey commented 1 week ago

Here is what ChatGPT, which has helped me try everything to avoid the conflict, says:

The error continues to persist due to an internal conflict within the google-ai-generativelanguage library’s handling of protobuf files. This specific issue (duplicate symbol 'google.ai.generativelanguage.v1beta.__firstlineno__') is likely a bug or a deeper compatibility problem between versions of protobuf and Google’s API libraries.

At this stage, we’ve attempted multiple approaches without success.

If switching APIs is not an option and you need to continue using Google’s API, this issue might need to be addressed by the maintainers of either Podcastfy or Google’s google-ai-generativelanguage library. You can file a bug report with detailed error logs and your environment setup.

•   Report to Podcastfy: [Podcastfy GitHub Issues](https://github.com/souzatharsis/podcastfy/issues)
•   Report to Google Generative AI: [Google AI Language Issues](https://github.com/googleapis/python-generativelanguage/issues)

Providing detailed information about the protobuf error can help them patch the issue in future releases.

RalphLosey commented 1 week ago

GPT explains the error:

I encountered a persistent protobuf conflict when using the google-ai-generativelanguage library (v0.6.10) in combination with Podcastfy. The error is a duplicate symbol in the protobuf descriptor pool, and it causes a crash with the following error:

TypeError: Couldn't build proto file into descriptor pool: duplicate symbol 'google.ai.generativelanguage.v1beta.__firstlineno__'

This error occurs when running Podcastfy to generate audio content from a URL, and the issue seems to be related to how the langchain_google_genai package interacts with google-ai-generativelanguage.

Steps to Reproduce:

1.  Create a Python virtual environment.
2.  Install Podcastfy and google-ai-generativelanguage version 0.6.10.
3.  Attempt to generate a podcast using:

python -m podcastfy.client --url https://e-discoveryteam.com/tar-course/

4.  The traceback error is thrown, preventing any output generation.

Error Log:

Provide the full traceback here:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  ...
TypeError: Couldn't build proto file into descriptor pool: duplicate symbol 'google.ai.generativelanguage.v1beta.__firstlineno__'

Expected Behavior:

The system should successfully process the URL content without protobuf conflicts.

Actual Behavior:

The process fails with a protobuf-related TypeError.

Additional Context:

•   Environment:
    •   OS: macOS (10.x or higher)
    •   Python version: 3.13
    •   protobuf version: 4.21.6
    •   google-ai-generativelanguage version: 0.6.10
    •   langchain_google_genai version: 2.0.1
    •   Podcastfy version: Latest stable release
•   Potential Cause:
    •   There appears to be a conflict in the way google-ai-generativelanguage handles protobuf files, specifically duplicate symbols. This occurs even in clean environments and across different versions of protobuf and google-ai-generativelanguage.
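
One possible lead, offered as an assumption rather than a diagnosis confirmed anywhere in this thread: starting with Python 3.13 (the version listed above), class namespaces gain a `__firstlineno__` attribute, which matches the last component of the duplicate symbol in the error. The sketch below only checks whether the running interpreter adds that attribute. The helper name `class_has_firstlineno` is made up for illustration.

```python
import sys


def class_has_firstlineno() -> bool:
    """Return True if this interpreter adds __firstlineno__ to class namespaces."""

    class Probe:
        pass

    # vars() inspects the class's own namespace, not inherited attributes.
    return "__firstlineno__" in vars(Probe)


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("classes define __firstlineno__:", class_has_firstlineno())
```

If this reports True, repeating the reproduction steps under Python 3.12 might help tell whether the interpreter version plays a role.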

Troubleshooting Steps Taken:

1.  Tried downgrading google-ai-generativelanguage to earlier versions (0.5.x, 0.4.x) without success.
2.  Tried clearing protobuf caches and reinstalling.
3.  Created fresh virtual environments and installed only the necessary dependencies.

Would appreciate guidance on resolving the protobuf conflicts or patching the issue within the Google Generative AI library. Thank you!
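
Since the advice above is to file a report with a detailed environment setup, here is a minimal sketch that collects the versions listed under "Environment" using only the standard library. The distribution names in `PACKAGES` are assumed to match the PyPI names of the packages mentioned in this report, and `environment_report` is a hypothetical helper, not part of Podcastfy.

```python
import platform
import sys
from importlib import metadata

# Assumed PyPI distribution names for the packages named in this report.
PACKAGES = [
    "podcastfy",
    "protobuf",
    "google-ai-generativelanguage",
    "langchain-google-genai",
]


def installed_version(name: str) -> str:
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(packages=PACKAGES) -> str:
    """Build a plain-text summary suitable for pasting into an issue."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Python: {sys.version.split()[0]}",
    ]
    lines += [f"{name}: {installed_version(name)}" for name in packages]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Running it inside the same virtual environment that produces the error keeps the reported versions consistent with the traceback.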

raffaalmeida commented 1 week ago

I'm having the exact same error. Did you manage to fix it?

souzatharsis commented 1 week ago

No, can't replicate it on my Linux machine.

I will publish a Docker image to sort this out in the coming days.


RalphLosey commented 4 days ago

Still waiting for help. Anyone? No Mac users?

souzatharsis commented 4 days ago

Hi, would it be helpful if I published a Docker image to ensure reproducibility?

If yes, I can work on it right away.

Best Regards,

-- Thársis http://linkedin.com/in/tharsissouza


souzatharsis commented 3 days ago

@RalphLosey Docker image has been created. Please see below and let me know if this issue can be closed.

https://github.com/souzatharsis/podcastfy/blob/main/usage/docker.md

RalphLosey commented 3 days ago

After hours of work, I finally got it to work. I learned a lot in the process, so I'm not complaining. I had never worked with Python or any of this and only got through it with ChatGPT4o. But perhaps I did something wrong, because the generated MP3 sounded very robotic, with almost no emotion. It is not in the same league as NotebookLM and nothing like your demos. Hear for yourself. What did I do wrong?


souzatharsis commented 3 days ago

Hi,

  1. Could you please share what you did? It could be useful to macOS users.

  2. The output audio is different because podcastfy now uses Microsoft's Edge model by default, which is free and does not require an API key. You can use ElevenLabs' or OpenAI's model by setting the TTS model.

CLI:

python -m podcastfy.client --url https://example.com/article1 --url https://example.com/article2 --tts-model elevenlabs

Package:

from podcastfy.client import generate_podcast
audio_file_multi = generate_podcast(urls=["https://example.com/article1", "https://example.com/article2"], tts_model="elevenlabs")
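
The CLI call above can also be assembled from a script. The following is a minimal sketch that uses only the `--url` and `--tts-model` flags shown in this comment; the helper names `build_podcastfy_command` and `run_podcastfy` are hypothetical, and it assumes podcastfy is installed in the interpreter that runs the script.

```python
import subprocess
import sys


def build_podcastfy_command(urls, tts_model=None):
    """Build the argv list for the podcastfy CLI invocation shown above."""
    command = [sys.executable, "-m", "podcastfy.client"]
    for url in urls:
        command += ["--url", url]
    if tts_model is not None:
        # Omitting the flag leaves the choice to podcastfy's default.
        command += ["--tts-model", tts_model]
    return command


def run_podcastfy(urls, tts_model=None):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_podcastfy_command(urls, tts_model), check=True)


if __name__ == "__main__":
    print(build_podcastfy_command(["https://example.com/article1"], tts_model="elevenlabs"))
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain characters such as `&` or `?`.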


RalphLosey commented 2 days ago

So you are saying your demos use ElevenLabs and your app won't produce the same quality without it? The Microsoft product, Microsoft Edge TTS, which is apparently your default, is very low quality, as my output shows. Not even close to NotebookLM.


RalphLosey commented 2 days ago

There should be a way to integrate your app with Google Cloud TTS. Correct? Any instructions on that? I expected that would happen with your app and am confused by Microsoft becoming part of the setup. I suppose it is part of trying to use a Linux environment on a Mac.


souzatharsis commented 2 days ago

Hi,

I do not understand the issue. Could you please clarify?

  1. ElevenLabs can still be used. Simply pass "elevenlabs" to the tts-model param as mentioned above.
  2. We changed the default model to edge because it is free. Anyone can try it without needing a key. Plus, from a dev perspective, we can test the product in an automated fashion (CI/CD) without incurring costs. Again, you can still use openai or elevenlabs by simply passing a param.
  3. There is a PR to add Google Cloud TTS. We will merge it soon. It is also not free, but it is a potential value-add since the quality seems to be good.

Please let me know if that makes sense. I am happy to adjust the implementation if the current setup is causing issues.
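
The trade-off described in this comment (a free default, with paid models available on request) could be automated by choosing the model from whichever API key is present. This is only a sketch: the model names come from this thread, while the environment variable names and the helper `pick_tts_model` are assumptions for illustration and may not match how podcastfy reads its keys.

```python
import os

# Assumed environment variable names; check your own configuration.
KEY_FOR_MODEL = {
    "elevenlabs": "ELEVENLABS_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def pick_tts_model(env=None, preferred=("elevenlabs", "openai")):
    """Return the first preferred model whose key is set, else the free default."""
    env = os.environ if env is None else env
    for model in preferred:
        # An empty string counts as "not set".
        if env.get(KEY_FOR_MODEL[model]):
            return model
    return "edge"


if __name__ == "__main__":
    print("TTS model to pass as tts_model / --tts-model:", pick_tts_model())
```

The returned string can then be passed as the TTS model in either the CLI or the package call quoted earlier in the thread.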


RalphLosey commented 2 days ago

IMHO, changing to Edge harms your work. The output is too low quality. I guess I just assumed it was a pass to Gemini, and thus the instructions for the Google API. I don't care about the modest costs. I just want more control over NotebookLM than Google now provides. I want a better AI podcaster, not a lower-grade one. Make sense?


souzatharsis commented 2 days ago

You can still select elevenlabs or openai TTS model by simply passing a param. Hence, there is no regression in capabilities. Would you agree?

(Thank you so much for the discussion; it is truly helpful for improving the product.)


RalphLosey commented 2 days ago

OpenAI only has text, not voice yet. So I would need another service, ElevenLabs. Should I? Does your app not have the capacity to use Gemini-linked NotebookLM? I thought it did. So now I'm confused. All I know for sure is that the MP3 audio it created with Microsoft had very poor, robotic voices, totally unlike the demos and not nearly as good as the free NotebookLM podcaster.
