chingu-voyage3 / bears-17

Prepare for Your Job Interview
2 stars 0 forks source link

Database Schema #40

Open zsoltime opened 6 years ago

zsoltime commented 6 years ago

How will the database schema look like? We're using Mongo, so we can embed documents (like the author in the questions collection) to reduce the number of queries.

I think the documents below are good starting points.

Users

[{
  "_id": "5a2e541bfc13ae28eb000072",
  "name": "Emalee Hinzer",
  "email": "ehinzer9@go.com",
  "avatar": "https://robohash.org/quisquamculpavoluptatem.jpg",
  "country": "Germany",
  "description": "Copper mug yr 3 wolf moon keytar cronut asymmetrical umami echo park kinfolk green juice bespoke vinyl disrupt. Flexitarian you probably have not heard of them chartreuse jean shorts farm-to-table literally.",
  "member_since": "2017-12-11T09:49:43.125Z"
}]

I think the user document is pretty self-explanatory. Maybe we could also add top_questions and top_answers fields with an array of 2-3 questions/answers to fetch everything in one single query and display these on the user's profile.

Questions

[{
  "_id": "5a2e573dfc13ae29fd000456",
  "submitted_at": "2017-12-11T10:09:48.245Z",
  "title": "You need a little dummy text for your mockup?",
  "body": "Try-hard tofu pop-up, meh deep v tilde hashtag pitchfork ramps normcore. Bushwick try-hard gentrify, single-origin coffee jean shorts man braid offal pop-up banh mi blog banjo.",
  "votes": 5,
  "author": {
    "_id": "5a2e541bfc13ae28eb000072",
    "name": "Emalee Hinzer",
    "avatar": "https://robohash.org/quisquamculpavoluptatem.jpg"
  },
  "answers": [{
    "$id": "5a2e573dfc13ae29fd000457",
    "submitted_at": "2017-12-11T10:25:11.114Z",
    "answer": "Hella semiotics trust fund roof party paleo marfa before they sold out tofu gastropub affogato taxidermy sriracha. Fashion axe lomo sustainable glossier cliche flexitarian post-ironic kickstarter hashtag taiyaki schlitz bushwick lumbersexual stumptown man braid",
    "votes": 21,
    "author": {
      "_id": "5a2e573dfc13ae29fd000458",
      "name": "King Reavey",
      "avatar": "https://robohash.org/etvoluptatemenim.jpg"
    }
  }]
}]

The questions documents are a bit more complex. I embedded the author, so we can fetch their name and avatar in one query. Not sure how the profile links will look like, but we can start with /profile/{user-mongo-id} and add slugs later (/profile/{user-shortid}). In that case we'll add a slug field :)

Answers are also embedded at the moment. We could also just create an answers collection with all the answers and only embed the top or latest answers to each question, but not sure if it has any performance gains :)

faityworld commented 6 years ago

What about to use author's id to fetch his/her information in questions and answers? So we'll have an authors collection and posts collection which will contain authors id's. If we follow it, we will not write a new document for users who are already posted something.

Heyjp commented 6 years ago

Well it might be looking too far in the future but what about extra resources that we were thinking about. Will we just be having text answers or supporting content?

Is this a better format than say having a separate collection just for answers, and then linking them by questionId. But I'd be happy to have a schema like this. Do we need total votes so we can track top contributors or is that also something for a later date?

zsoltime commented 6 years ago

@Heyjp I hope that's not too far :) I think we could stick to text content (with markdown support, maybe?) first.

I'm also thinking about adding an answers collection now but I'd still keep the author's id, name and avatar to reduce the number of queries. I'll open another PR where we'll have this answers collection.

I'm not sure how we should store the votes as we need to track who voted already.

@OStefani MongoDB is designed to store denormalised data and doesn't support JOINs like SQL does. If you need data from more than one collection, you need to perform more queries. Embedding the most frequently read user data fields in the questions and answers collections will save us extra queries.