Open matthewpalmer opened 9 years ago
Another value for Item might have to be num_of_units (per purchase) to deal with woolworths annoying "2 for 1" specials (I would like to ignore them, but realistically that's where we'll get a lot of savings)
Item
Category
Matches ← will be one of the most active tables
Store
With indexes like
Item store, title, matchesWith, belongsTo
Category title
matches item_one_id, item_two_id
store name
Also might be a problem: I think django has a built-in ORM for mysql. We can either map our data importer to whatever the django ORM generates, or map django to our schema. I slightly prefer the latter, because it'll be more loosely coupled and means we can write our bulk update/inserter regardless of the application layer. It might mean there are a couple of extra hoops in app development though. Thoughts?
This is untested, but hopefully provides a more concrete schema
This is tested. Haven't actually added any data yet, but the db and tables are set up.
--
-- Create tables
--
-- Store
CREATE TABLE `store` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`name` ENUM('coles', 'woolworths') NOT NULL,
`base_url` VARCHAR(500),
`created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
);
-- Category
CREATE TABLE `category` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`store` INT UNSIGNED NOT NULL,
`title` VARCHAR(200) NOT NULL,
`number_of_items` INT UNSIGNED NOT NULL,
`created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
FOREIGN KEY(`store`) REFERENCES `store`(`id`)
);
-- Match
CREATE TABLE `match`(
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`item_one` INT UNSIGNED,
`item_two` INT UNSIGNED,
`rating` FLOAT,
`created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
);
-- Item
CREATE TABLE `item` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`store` INT UNSIGNED NOT NULL,
`title` VARCHAR(500) NOT NULL,
`price_per_unit` FLOAT,
`price_per_kg` FLOAT,
`unit_size` INT UNSIGNED,
`units_per_perchase` INT UNSIGNED,
`url` VARCHAR(500),
`category` INT UNSIGNED,
`created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
FOREIGN KEY(`store`) REFERENCES `store`(`id`),
FOREIGN KEY(`category`) REFERENCES `category`(`id`)
);
--
-- Create indexes
--
CREATE INDEX item_in_store
ON`item`(`store`);
CREATE INDEX item_with_title
ON `item`(`title`);
CREATE INDEX item_in_category
ON `item`(`category`);
CREATE UNIQUE INDEX category_with_title
ON `category`(`title`);
CREATE INDEX category_in_store
ON `category`(`store`);
CREATE INDEX match_item_one
ON `match`(`item_one`);
CREATE INDEX match_item_two
ON `match`(`item_two`);
CREATE UNIQUE INDEX match_both_unique
ON `match`(`item_one`, `item_two`);
CREATE UNIQUE INDEX store_with_name
ON `store`(`name`);
I agree with the latter, and that looks good.
Thinking mysql or postgres, but I don't really mind.
In terms of design, possibly something like (probably forgetting stuff here)
Item
Category
Matches ← will be one of the most active tables
Store
Notes