pncnmnp / phoenix10.1

Creates personalized radio stations with your own radio jockey!
MIT License
113 stars, 6 forks

Customizing "ads" #1

Open etsea117 opened 1 year ago

etsea117 commented 1 year ago

It would be wonderful to be able to customize the fictional ads. I could see this being linked to a to-do list or reminders list, allowing someone to advertise important things to themselves. Alternatively, it could advertise recently learned facts for greater knowledge absorption.

pncnmnp commented 1 year ago

Hi @etsea117! I love the idea of advertising recently learned facts. Also, imagine if this were done in a short, podcast-like fashion with two voices. I have a feeling that p335 from coqui-ai's VITS model might be a great voice, although we should decrease her speed, similar to what we are currently doing with p267's voice.

There is a way to extract facts from a document using Textacy. We need to pass an (entity, cue, fragment) triple for this.

Two years back, I wrote some code to do this, but it is not in good shape, lol.

Also, about adding to-do lists or reminders, I have a question for everyone: do we want to take this project in a direction that is more like Google Assistant, or should we keep it close to radio-like behavior?

Honestly, I am not sure what the right direction should be.

However, I have received some feedback about fictional ads - and it seems like putting them inside the schema.json would be a good idea. That way, anyone who does not want to listen to them can turn them off.
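As a rough sketch of what "turning them off" could mean, assuming the schema is a JSON list of [action, argument] pairs (the shape of the music-genre entries quoted in this thread). The "ads" action name and the `drop_action` helper are hypothetical and not part of the project:

```python
import json

# Hypothetical schema text. Only the "music-genre" shape comes from this
# thread; the "ads" action name is an assumption made for illustration.
SCHEMA_TEXT = """
[
    ["music-genre", [["religious", 2], ["god", 1]]],
    ["ads", 1],
    ["music-genre", [["science", 1]]]
]
"""


def drop_action(schema, action):
    """Return a copy of the schema without any entries for `action`."""
    return [entry for entry in schema if entry[0] != action]


schema = json.loads(SCHEMA_TEXT)
without_ads = drop_action(schema, "ads")

# Print the remaining action names in broadcast order.
print([entry[0] for entry in without_ads])
```

In practice a listener would simply delete the ads line from their own schema.json; the helper only illustrates that nothing else in the schema needs to change.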

Would love to hear your opinions on this.

bornagainpenguin commented 1 year ago

Another suggestion to consider would be the ability to use scriptures between songs for a religious-themed broadcast. That could be something people would like to have and take advantage of.

pncnmnp commented 1 year ago

@bornagainpenguin

With the recent commit (535a1836), you can set the music-genre to something like religious or god to get a religious-themed broadcast.

For instance, something like the following in your ./data/schema.json:

["music-genre", [["religious", 2], ["god", 1]]],

Tested it locally:

>>> import radio
>>> radio.Recommend().playlist_by_genre("religious")
[('Mahalia Jackson', 'City Called Heaven'), ('Hypatia Lake', 'Joseph and the Divine Intervention of the Recreational Center'), ('Collin Raye', 'When You Say Your Prayers')]
>>> radio.Recommend().playlist_by_genre("religious")
[('Collin Raye', 'When You Say Your Prayers'), ('Sister Rosetta Tharpe', 'The Devil Has Thrown Him Down'), ('Joy Askew', 'Little Darling')]
>>> radio.Recommend().playlist_by_genre("god")
[('The Bronx Casket Co.', 'Savior'), ('Pattern is Movement', 'Elephant'), ('Film School', 'Dear Me')]

Edit: Here is a list of genres this software supports.

ology commented 1 year ago

@bornagainpenguin I can imagine this and other related DJ-spoken subjects with an extra ["topic", "religion"], or ["topic", "science"], that you could (theoretically) add to data/schema.json. Then it would grab sentences from data/gpt/topics.json. 🤔

pncnmnp commented 1 year ago

@ology Aren't these topics currently supported by ./data/genres.csv?

For instance,

["music-genre", [["religion", 2], ["science", 1]]],

in schema.json should insert two songs with the tag religion and one song with the tag science. I am trying to understand the need for a topic action here.
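To make the counts concrete, here is a small sketch of how a weighted music-genre entry could be read as a sequence of song slots. The `expand_genres` helper is hypothetical; it only illustrates the "two songs, then one song" interpretation and is not the project's actual implementation:

```python
def expand_genres(weighted_genres):
    """Turn [[genre, count], ...] into a flat list with one item per song slot."""
    slots = []
    for genre, count in weighted_genres:
        slots.extend([genre] * count)
    return slots


# The entry from the comment above, as it would appear after JSON parsing.
entry = ["music-genre", [["religion", 2], ["science", 1]]]
action, weighted_genres = entry

print(expand_genres(weighted_genres))  # ['religion', 'religion', 'science']
```

Each slot could then be filled by a genre lookup such as the `playlist_by_genre` call shown earlier in the thread.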

ology commented 1 year ago

Oops. I thought that "music-genre" meant "music only" - not talking. I will experiment before speculating this time!

pncnmnp commented 1 year ago

Oh, wait! My mistake. Are you saying that something like ["topic", "religion"] would recommend a relevant podcast clip (religious-themed) and not a religious song?

ology commented 1 year ago

Yes! Spoken word, not music. Podcast would be cool.

pncnmnp commented 1 year ago

I see! The music part is currently supported.

To be honest, recommending podcasts based on genres seems like a research problem.

Here's how we could possibly achieve it:

  1. Start with something like the Spotify Podcast Dataset (this is not a public dataset; we need to request access).
  2. If access is granted, we can use metadata such as show_name, show_description, episode_name, and episode_description to predict the genres of each podcast episode. Probably the dumbest way to do this would be to use GPT-3.5-turbo. If we assume 750 words per execution, this can cost somewhere around $200 for 100k podcast episodes. So, we might have to explore something else here.
  3. Then, we need to map the genres to the RSS feeds. We can then use this to recommend podcasts based on the genres.
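
The $200 figure in step 2 can be reproduced with a quick back-of-the-envelope calculation. The conversion rate (about 750 words per 1,000 tokens) and the price ($0.002 per 1,000 tokens) are assumptions chosen to match that estimate; they are not stated in the thread and current pricing may differ:

```python
# Assumptions (not stated in the thread): roughly 750 English words per
# 1,000 tokens, and a price of $0.002 per 1,000 tokens.
WORDS_PER_EPISODE = 750
WORDS_PER_1K_TOKENS = 750
PRICE_PER_1K_TOKENS_USD = 0.002
EPISODES = 100_000


def estimate_cost_usd(episodes, words_per_episode):
    """Estimate the total API cost for one call per episode."""
    thousands_of_tokens = episodes * words_per_episode / WORDS_PER_1K_TOKENS
    return thousands_of_tokens * PRICE_PER_1K_TOKENS_USD


print(f"${estimate_cost_usd(EPISODES, WORDS_PER_EPISODE):,.2f}")  # $200.00
```

This counts input text only; any tokens in the model's reply would add to the total.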

Maybe we should look at some other datasets as well, like 13,000+ Podcasts. It even includes genres and podcast descriptions. However, we would need to figure out how to get the RSS feeds from it.

Let me think about this problem for a while. I'm sure there's an efficient way to do this.