The key differences are: adding topic_embedding, moving overall_score to a separate relationship table, making the topic model non-channel-specific with channel-specific mappings, and giving each keyword a weight.
Here's the revised schema, including weights for topic_keywords. To keep topics non-channel-specific while still recording which channels they appear in, you can design your schema as follows:
Database Format for Topics and Embeddings
Topics Table
topic_id (Primary Key): Unique identifier for each topic.
topic_name: Descriptive name or label of the topic (e.g., "AI and Machine Learning").
topic_keywords: Array of representative keywords with their associated weights (e.g., [{"term": "AI", "weight": 0.35}, {"term": "neural networks", "weight": 0.28}, {"term": "deep learning", "weight": 0.22}]).
topic_embedding: Numeric vector stored as an array of floats (e.g., [0.12, -0.34, 0.56, ..., 0.78]).
created_at: Timestamp indicating when the topic was created.
updated_at: Timestamp indicating the last update to the topic.
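As a rough illustration of the Topics table above, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite, the lowercase table name topics, storing topic_keywords and topic_embedding as JSON text, and the 4-value toy embedding are all assumptions made for the example; a production setup might use a native array or vector column type instead.

```python
import json
import sqlite3
from datetime import datetime, timezone

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE topics (
        topic_id        INTEGER PRIMARY KEY,
        topic_name      TEXT NOT NULL,
        topic_keywords  TEXT NOT NULL,  -- JSON array of {"term", "weight"} objects
        topic_embedding TEXT NOT NULL,  -- JSON array of floats
        created_at      TEXT NOT NULL,
        updated_at      TEXT NOT NULL
    )
    """
)

now = datetime.now(timezone.utc).isoformat()
keywords = [
    {"term": "AI", "weight": 0.35},
    {"term": "neural networks", "weight": 0.28},
    {"term": "deep learning", "weight": 0.22},
]
# Toy vector (assumption): real embeddings have many more dimensions.
embedding = [0.12, -0.34, 0.56, 0.78]

conn.execute(
    "INSERT INTO topics "
    "(topic_name, topic_keywords, topic_embedding, created_at, updated_at) "
    "VALUES (?, ?, ?, ?, ?)",
    ("AI and Machine Learning", json.dumps(keywords), json.dumps(embedding), now, now),
)

# Read the record back and decode the JSON columns.
row = conn.execute(
    "SELECT topic_id, topic_name, topic_keywords, topic_embedding FROM topics"
).fetchone()
topic = {
    "topic_id": row[0],
    "topic_name": row[1],
    "topic_keywords": json.loads(row[2]),
    "topic_embedding": json.loads(row[3]),
}
print(topic)
```

Note that the record carries no channel information at all; that is what keeps the topic non-channel-specific.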
Channel-Topic Relationships Table
relationship_id (Primary Key): Unique identifier for the relationship.
channel_id: Identifier for the channel (foreign key referencing a Channels table).
topic_id: Identifier for the topic (foreign key referencing the Topics table).
topic_score: Overall score indicating the strength of the relationship between the topic and the channel.
last_updated: Timestamp for when the score or relationship was last updated.
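The relationship table can be sketched the same way. This is one possible layout, not a definitive one: the table names, the channel_name column, the sample channels and scores, and the UNIQUE (channel_id, topic_id) constraint are all hypothetical additions for the example. The Topics table is reduced to two columns to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE channels (
        channel_id   INTEGER PRIMARY KEY,
        channel_name TEXT NOT NULL          -- hypothetical column
    );
    CREATE TABLE topics (
        topic_id   INTEGER PRIMARY KEY,
        topic_name TEXT NOT NULL
    );
    CREATE TABLE channel_topic_relationships (
        relationship_id INTEGER PRIMARY KEY,
        channel_id   INTEGER NOT NULL REFERENCES channels(channel_id),
        topic_id     INTEGER NOT NULL REFERENCES topics(topic_id),
        topic_score  REAL NOT NULL,
        last_updated TEXT NOT NULL,
        UNIQUE (channel_id, topic_id)       -- assumption: one row per pair
    );
    """
)

# Sample data (made up for illustration).
conn.executemany(
    "INSERT INTO channels (channel_id, channel_name) VALUES (?, ?)",
    [(1, "tech-talk"), (2, "general")],
)
conn.execute(
    "INSERT INTO topics (topic_id, topic_name) VALUES (?, ?)",
    (1, "AI and Machine Learning"),
)
conn.executemany(
    "INSERT INTO channel_topic_relationships "
    "(channel_id, topic_id, topic_score, last_updated) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 0.82, "2024-01-01T00:00:00+00:00"),
        (2, 1, 0.31, "2024-01-01T00:00:00+00:00"),
    ],
)

# One shared topic, scored separately per channel.
rows = conn.execute(
    """
    SELECT c.channel_name, r.topic_score
    FROM channel_topic_relationships AS r
    JOIN channels AS c ON c.channel_id = r.channel_id
    WHERE r.topic_id = ?
    ORDER BY r.topic_score DESC
    """,
    (1,),
).fetchall()
print(rows)
```

The same topic row is referenced by both channels, and only topic_score differs between them.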
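Explanation:
Each topic's embedding is stored in the topic_embedding field within the Topics table.
topic_keywords now include both the keyword and its corresponding weight to indicate the term's importance within the topic.
The channel-specific score is stored in the Channel-Topic Relationships table as topic_score.
Benefits: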