hackmcgill / hackerAPI

🐓 API for registration, live-site
https://api.mchacks.ca
MIT License
22 stars 8 forks source link

Discussion On Database/Schema Design #776

Closed brarsanmol closed 1 month ago

brarsanmol commented 2 years ago

Hi folks,

As suggested, we will move the discussion on database design for the re-factor here.

Nov 30th 2021: I currently do not have the time to write every issue with staying and going away from a schemaless design, I will update tomorrow.

cc: @krubenok

krubenok commented 2 years ago

@brarsanmol, can you write up what your proposed set of tables and columns would be in a SQL world? Before we discuss pro's and con's I think it be good to agree on the alternative.

pierreTklein commented 2 years ago

@krubenok spitballing here, but it'll be some expansion of the current schema (if not a copy of it). I'm pretty sure that the existing schema can be ported to SQL without much change.

That being said, maybe there's a better DB format that could give us dividends for performance.

krubenok commented 2 years ago

Well considering the current implementation is basically a SQL in a noSQL database, that would make sense, but I still think it'd be valuable to see them both side by side before we make a decision.

Personally, I still think the data that needs to be stored is better suited to noSQL style (done properly without all the artificial keys and different tables we have now). I'm pretty sure we could consolidate sponsors, hackers, and accounts into a single table and maybe a few more while keeping it very human readable and probably easier on the implementation and troubleshooting side (no hackerID/accountID/etc... mapping to maintain, just nest one inside the other).

brarsanmol commented 2 years ago

@brarsanmol, can you write up what your proposed set of tables and columns would be in a SQL world? Before we discuss pro's and con's I think it be good to agree on the alternative.

Hi,

Please accept my apologies for the delay on this.

Here is a ERD, I can give the ASCII table layout if needed aswell.

database

brarsanmol commented 2 years ago

Well considering the current implementation is basically a SQL in a noSQL database, that would make sense, but I still think it'd be valuable to see them both side by side before we make a decision.

Personally, I still think the data that needs to be stored is better suited to noSQL style (done properly without all the artificial keys and different tables we have now). I'm pretty sure we could consolidate sponsors, hackers, and accounts into a single table and maybe a few more while keeping it very human readable and probably easier on the implementation and troubleshooting side (no hackerID/accountID/etc... mapping to maintain, just nest one inside the other).

We definitely can consolidate it all into one table using Single Table Inheritence, but this has its disadvantage where every field has to be nullable in the database itself.

I have already done some work consolidating all the fields that don't carry extra data into one (see that the Staff, and Volunteer models have been deleted). They will just be checked through the account role.

There remains some work in ensuring that user's cannot set an account type of higher elevation through the regular post route so i'll create some controller that take the proper permissions for this.

Frankly my only concern about schema migrations that occur over time, so we have to ensure we have at least one person that has a strong understanding of SQL on the team. TypeORM has the capability to generate and handle migrations but it's always good to play on the safe side.

Edit: The current application schema is not currently 100% correct, the original layout was meant to support multiple applications using a one to many relationship in the hacker model but due to some bugs I had to temporarily keep the current system. This was meant to solve #650.

brarsanmol commented 2 years ago

Just curious, is the createdAt timestamp for the the password reset schema necessary? We use JWT's with expiry dates and I would presume that the verification process would handle an expired token and in our old codebase from what I understand the value was never used either.

pierreTklein commented 2 years ago

You would have to look at git blame to potentially figure out why we had the createdAt field.

Maybe it was to invalidate duplicate (older) reset tokens?

If you don't see any references to that field being used in the code, then feel free to get rid of it.

brarsanmol commented 2 years ago

You would have to look at git blame to potentially figure out why we had the createdAt field.

Maybe it was to invalidate duplicate (older) reset tokens?

If you don't see any references to that field being used in the code, then feel free to get rid of it.

Doesn't seem like it is being used for anything. I'll delete the field.