Open BlaiseItUp opened 3 years ago
My 2NF and 3NF led me to have a TCustID table which honeslty didn't seem very effective because it only contained custID, which was then used as a foreign key in the other tables. I feel like I should've keep CustID, FName, and LName in the same table DataModeling HW.xlsx
Data Modeling
Cheatsheet:
First Normal Form (1NF)
A table is supposed to be in first normal form if:
Table 1 - Shirt Info (Primary Key: Shirt_ID)
First Normal Form Tables
Table 1a - Design Info (Primary Key: Shirt_ID)
Table 1b - Size Info (Primary Key: Shirt_ID, Size)
Second Normal Form (2NF)
A table is supposed to be in second normal form if:
Table 2 - Customer Info (Primary Key: Cust_ID, Store_ID)
Second Normal Form Tables
Table 2a - Customer Data (Primary Key: Cust_ID)
Table 2b - Store Data (Primary Key: Store_ID)
Third Normal Form (3NF)
A table is supposed to be in third normal form if:
Table 3 - Movie Info (Primary Key: MID)
Second Normal Form Tables
Table 3a - Movie Rating (Primary Key: MID)
Table 3b - Genre Info (Primary Key: Store_ID)
Exercise
Duplication of data within a relational database should raise all sorts of red flags unless it was done purposefully and logically. For example, if the database for an application that sells merchandise saves the retail price in multiple tables, it is extremely easy for them to get out of synch when one location changes, but the others remain unaltered. Not only does this introduce instability, it wastes storage space as well. Additionally, the application code will become less maintainable as it must now know about multiple places to pull price information. Storing data in multiple places also increases the chance that we will miss something and not apply proper protection. This situation can be resolved by using normalization, a technique employed to reduce redundancy and inconsistency. When we organize data in this way, we follow certain rules, with each rule referred to as a normal form. There are three primary rules, or normalization rules that we can follow. If we follow the first rule, the database is said to be in the first normal form, abbreviated as 1NF. If both the first and second rules are followed simultaneously, it is said to be in the second form of normalization, or 2NF, 3NF follows the same pattern. There are 4NF and 5NF,
Table 1 - CustomersAndSalesRepsAndSales
Here, you can see that each record in the CustomersAndSalesRepsAndSales table contains a single unique customer, with each product purchased represented by a single column. As a customer purchases a new product, we must put the product ID into a new column. Since we only have two product columns, this means that a customer will be allowed to purchase at most two products before we have to modify the customer table. Obviously, this is not very scalable, and we will be missing tons of opportunities to sell more of our products to repeat customers. The rule for First Normal Form, or 1NF, states that there are no repeatable fields or groups of fields within a table and is sometimes called the ‘No Repeating Groups’ rule. To have a 1NF database in this example, we will need to create a new record for each purchase, instead of having to create a new column for each purchase. New columns require the table structure to be modified, which is a HUGE deal, while adding new rows is just part of everyday business and easy.
Second Normal Form, or 2NF, mandates that repeated data must be removed. In this case, we need to address the fact that the name of a customer appears in multiple rows. If we need to change the customer’s name, we would need to locate all rows in this table and update them. This is unnecessarily complex, so we need to implement the 2NF rule called ‘Eliminate Redundant Data’. The way to do this is to create a new table representing the relationship between the duplicated data and whatever is unique about each row in the original table. In our case, we need to extract all information that is the same for all purchases made by the same customer. This means the customer name and sales rep ID is moved to a new table. We are assuming that each customer has only one sales rep.
Third normal form, or 3NF, requires that any non-key value in a table that is not dependent on the primary key should be moved to its own table. In our example, the CustomersAndSalesReps table lists the sales rep ID for each customer, but the value in this column will be duplicated if a sales rep can handle more than one customer. This is true in our pretend company, so the ‘Eliminate Non-Key-Dependent Duplicate Data’ rule, or 3NF, requires us to create yet another new table representing sales reps.