Closed shubham1172 closed 4 years ago
@shubham1172 Thanks for coming up with the proposal and apologies for the delayed reply. The database diagram looks nice and simple. The modified_on
field was something that was missing in the last database diagram.
One point to note regarding the platform_metadata
table is that there can be multiple posts for an event in each platform so the name of the table should actually be post_metadata
. Also we need to differentiate between a post and a comment, so we need to introduce a type
field. An example would be listing a meetup event post (which can be updated), so that if there is one post and one comment for the meetup event we can clearly differentiate what type is it.
On thinking a bit more, we really don't need the platform_metadata
table if we are using the json data approach since when we will display it in the template we can write the deserializer in such a way that it shows a field only if that key exists. We can just move the platform_id
field to the posts table.
There can be multiple posts for each platform but the structure for each platform remains the same. platform_metadata
describes how the data for each platform is structured. post_metadata
would hint that the metadata is about each post, which is not really what we are doing. I guess the markdown isn't very clear in itself.
Each entry in platform_metadata
describes how a post for each platform is structured. Based on this, the values are serialized/deserialized and stored in the posts
table. Each post belongs to a platform (many-to-one) and is indicated by the FK platform_id
.
To further differentiate a type, we know that all the platforms don't support comments. So we can have parent pointers in the supported types (say Meetup, Facebook, etc.) to show that they're comments. The parent pointer would be null for posts.
We can also have the fields created_by
and modified_by
in our tables for auditing reasons.
A metadata table is needed to define a structure to our data. Let's say a case, we want to create a new tweet. How do we know what fields are we expecting? How do we sanitize if a user gives an extra key in the JSON or misses a required key? Let's not rely on the user to provide sanitized data all the time, instead let's create a concrete structure for each data type, which is platform dependent.
@shubham1172 Thanks for the explanation, that makes it clear now. So the platform_metadata
table acts like the validator for data stored in each post of a particular platform. It would be good to document this in the proposal.
Yes we should add created_by
and modified_by
in posts and event table.
For the parent child relationship between post and comment do you propose a new table for comments ?
I think we can do with a single table, since rest of the content type will be almost the same. The parent_id will differentiate between a comment and a post.
I think we can do with a single table, since rest of the content type will be almost the same. The parent_id will differentiate between a comment and a post.
Sounds good. Can you update the database diagrams with the created_by, modified_by and parent_id fields ?
@shubham1172 Shouldn't parent_id
be in posts table rather than in data ? I think it will be easier to filter posts directly from database that way instead of iterating over all posts.
Also please update the Entity Relationship diagram image with the updates.
Since parent_id
a platform-dependent attribute, it should ideally be a part of platform-data
. For filtering, it'll be definitely efficient if the parent_id is in the posts
table, but then we'd have to maintain NULL values even for the platforms which do not support it. What say?
I know it'll be a little costly to query platform data, but Postgres supports native querying/filtering of JSON data. You can see the documentation here https://www.postgresql.org/docs/9.5/datatype-json.html
EDIT: One more advantage of having parent_id
in the post table is that it'll support a FK. It's not possible in the jsonb type AFAIK.
I'll update the Entity-Relationship diagram once we figure this out.
This sounds like an experiment we need to do. For querying fields in the app we will be using the Django ORM so it has to support multiple DBs i.e. sqlite for dev and Postgres for prod. I read the Django documentation and found it has support for Postgres JSONfield. For Sqlite we may have to install django-mysql
module which supports JSONField for mysql database.
@shubham1172 Let's go with the present idea you proposed. Go ahead and Update the database image.
Nice this looks good. This is a well designed doc @shubham1172 . I have added reviewers to have another pair of eyes have a look.
Thanks @shubham1172 :)
Please go to the branch and see the markdown file - model.md. It shows how we can model our events and social media posts.