Closed mplewis closed 10 years ago
Psst. This closes #143.
Big Big Question: now that we are moving to a DB, are we still going to cache on our side? I think if we're the only client of the DB then it's okay, but we would need to be upfront about THIS WEBSITE IS HOSTED FROM ONE APP ONLY OKAY??.
I bet we'd see a performance boost (aka stay good), but we should be sure it doesn't make consistency ugly.
I think caching is good. How do you feel about a cron job to recache every 60 seconds? Or 300, etc.
Like a timer inside the app? Personally, I'd rather hit it every time or guarantee in-memory and in-db consistency
How about: Since we're using in-app editors for DB data now, just hit an internal recache link whenever data is updated?
Bingo
I honestly still haven't been through this yet. Update to the control panel, etc, coming soon?
Wait, this isn't the control panel PR is it? Just the rest of the data and config stuff (which might need to become config.json, I think we mentioned?)
Cool. I'll maybe test soon if I need a break from Fido.
Correct on all counts!
This code and the dump I have are out of sync to a point that's annoying to fix. Can I get a recent dump?
Added some commits. Please review and :sushi: or don't.
Tested and works on my machine.
Remind me of the current state of `config.py`, please?
Current (as of this second) Postgres dump is available at http://z.kesdev.com/code/1C1547083147
I like everything but the stuttery `id`s! Not a fan of `leader_id` in the leader table, can just be `id`.
Do you also have the blank database generator, your plain text hand written db setup SQL, pushed somewhere?
Here's the current create SQL script: http://z.kesdev.com/code/2X070u2E381J I don't have this pushed somewhere yet. Suggestions as to where I should push this?
Re: stutter IDs: This is actually a good thing for semantics IMO.
Check out `twnet-create.sql` above:
```sql
CREATE TABLE project_needs (
    project_need_id serial,
    project_id integer
        NOT NULL
        REFERENCES projects
        ON DELETE CASCADE,
    need_text text,
    display_order integer,
    PRIMARY KEY (project_need_id)
);
```
See the `REFERENCES projects` line? Because we omit the column list, Postgres points the foreign key at the primary key of `projects`, which is the same `project_id` name that `project_needs` uses. So we can shorten the command from the following:

```sql
REFERENCES projects (project_id)
```

(or, if we were using just `id`)

```sql
REFERENCES projects (id)
```

to:

```sql
REFERENCES projects
```

I also like it because there's only one `project_id` name in the database, not a couple of `project_id`s and some `id`s that go with them. So the naming is DB-consistent.
Further reading: [Postgres docs](http://www.postgresql.org/docs/8.2/static/ddl-constraints.html#DDL-CONSTRAINTS-FK)
> Here's the current create SQL script: http://z.kesdev.com/code/2X070u2E381J I don't have this pushed somewhere yet. Suggestions as to where I should push this?

Let's make it part of the repo!
> Re: stutter IDs: This is actually a good thing for semantics IMO.

Yeah, I'm on board with this now too.
I'll more thoroughly review now!
> Let's make it part of the repo!

Done.
I'm getting missing images for Ultralight, ODI. I'm guessing this is because we haven't merged in master and you got the data from the deployed site, right?
We've still got some JSON lurking around: can we move the redirects to the db too?
A few ideas I want to know if you're open to:
```python
PROJECTS_KEY_ORDER = ['id', 'name', 'slug', 'description', 'photo_url', 'past_project']
GET_PROJECTS_QUERY = 'SELECT project_id, name, slug, description, photo_url, past_project FROM projects ORDER BY display_order;'
```

becomes

```python
PROJECTS_KEY_ORDER = ['project_id', 'name', 'slug', 'description', 'photo_url', 'past_project']
GET_PROJECTS_QUERY = 'SELECT %s FROM projects ORDER BY display_order;' % ', '.join(PROJECTS_KEY_ORDER)
```
I think it makes things a little DRYer and easier to add columns to.
This happens twice:

```python
for attr_num in xrange(len(project_raw)):
    key_name = PROJECTS_KEY_ORDER[attr_num]
    project_data[key_name] = project_raw[attr_num]
```
And this is a little hard to understand at first (but cool):

```python
needs = [text_holder[0] for text_holder in cur.fetchall()]
```
We could have something like `object = make_object(obj_raw, key_order)` instead that generalizes to all those cases, and `objects = make_object_collection(objs_raw, key_order)` for the collection case. We could actually do this for projects, leaders, needs, AND photos. Do it for projects first, then add leaders and needs and photos to it after it's been `make_object_collection`ed first.
```python
projects_data = make_object_collection(projects, key_order)
for project in projects_data:
    project_id = project['project_id']
    leaders_raw = ...  # get the leaders by project id
    needs_raw = ...    # get the needs by project id
    photos_raw = ...   # get the photos by project id
    project['leaders'] = make_object_collection(leaders_raw, LEADERS_KEY_ORDER)
    project['needs'] = make_object_collection(needs_raw, NEEDS_KEY_ORDER)
    project['photos'] = make_object_collection(photos_raw, PHOTOS_KEY_ORDER)
```
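The two helpers themselves could be sketched like this (names and signatures as proposed; the sample rows are hypothetical, purely for illustration):

```python
def make_object(obj_raw, key_order):
    # Pair one SQL result row (a tuple) with its column names.
    return dict(zip(key_order, obj_raw))

def make_object_collection(objs_raw, key_order):
    # Apply make_object to every row in a result set.
    return [make_object(obj_raw, key_order) for obj_raw in objs_raw]

# Hypothetical example rows:
PROJECTS_KEY_ORDER = ['project_id', 'name', 'slug']
rows = [(1, 'Ultralight', 'ultralight'), (2, 'ODI', 'odi')]
projects = make_object_collection(rows, PROJECTS_KEY_ORDER)
```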
Can we drop `past_project` then and test for the existence of the redirect URL instead?

Can we enforce that `display_order` columns have unique values for each containing object for, say, `project_photo` relationships? I.e. Robomural doesn't have [1, 1, 3]?

`create-tables.sql` is my new favorite file in this repo. Pretty.

Let's open the connection to the database and keep it open forever. Opening connections like that each time ain't good for performance.
I don't see writing to the database in this guy.
:bug:: If we don't have any past projects, we'll hit the db every time. An empty list is falsy, so the `if not self.past_projects` check passes on every request and we reload.
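The failure mode, reduced to a toy (the class and method names here are illustrative, not our actual code):

```python
class Site:
    def __init__(self):
        self.past_projects = None
        self.db_hits = 0  # counts DB round trips for demonstration

    def load_past_projects_from_db(self):
        self.db_hits += 1
        return []  # the site has no past projects yet

    def get_past_projects(self):
        # Bug: an empty list is falsy, so when there are no past
        # projects this branch runs on EVERY request and the cache
        # never "sticks". Checking `self.past_projects is None`
        # would cache the empty result correctly.
        if not self.past_projects:
            self.past_projects = self.load_past_projects_from_db()
        return self.past_projects
```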
:snake:: What if we didn't do caching? We are making the assumption we are the only client on this, which is true, but it's kinda bad architecture. We're basically using the db as a backup of RAM.
It also solves our bug, and with an open connection I don't think the performance would be too bad! We could A/B of course.
> Object Key Name <==> Table Column Name
> Generalizing object mapping from SQL rows
> Generalizing collection mapping from SQL rows
:+1: These are all great ideas. I'm loving DRY. What's the best way to document esoteric utility/generalizing functions like this?
:bug: No Past Projects
Handled in commit 4258475.
:snake: Caching
Check out this SO discussion.
Normally I'm in favor of caching but I don't think we'll actually run into enough speed issues to have to cache. If we cache wrong, we end up hurting our performance instead. How do you feel about putting off caching until we scale?
(Alternately, you could write a caching engine. I don't mind either way ;)
DB Writing
There isn't any yet :(
Is that still in the scope of this PR? I'm moving existing functionality to Postgres.
Redirect Slugs
I think this problem is complex enough to warrant its own PR. I'll iMessage before iCode.
> Can we enforce that the `display_order` columns have unique values for each containing object for, say, `project_photo` relationships?

I don't think it matters all that much. `ORDER BY` is fine with duplicates. If we write our editor right, we should never run into a situation in which we submit [1, 1, 3], etc.
I want to avoid constraints whenever not absolutely necessary. I don't like it as a dev when software tells me I can't do something because the last developer didn't think it made sense.
Re: opening and closing a connection: We have to make the object properly destroyable.
http://stackoverflow.com/questions/865115/how-do-i-correctly-clean-up-a-python-object
I'm going to start using this pattern.
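Applied to our long-lived connection, that pattern is a context manager rather than a `__del__`. A sketch (using sqlite3 as a stand-in so it's self-contained; in the app it would be `psycopg2.connect`, and the class name is hypothetical):

```python
import sqlite3  # stand-in for psycopg2 in this sketch

class Database:
    """Holds one long-lived connection and closes it deterministically."""

    def __init__(self, dsn):
        self.conn = sqlite3.connect(dsn)  # psycopg2.connect(dsn) in the app

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs even on exceptions, unlike relying on __del__.
        self.conn.close()
        return False
```

Usage:

```python
with Database(':memory:') as db:
    db.conn.execute('SELECT 1')
```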
Hey, I think making `GET_PROJECTS_QUERY = 'SELECT %s FROM projects ORDER BY display_order;'` may open us to injection. Can you think of a vector by which this might occur? I'm not saying it's possible, I'm just saying using string interpolation and concatenation inside query strings is a Bad Idea.
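For the record, the interpolated part here is a closed set: the column names come from a constant in our own source, never from request data. The rule of thumb would be that any runtime *value* must still go through the driver's bound parameters. A sketch of the distinction (the `WHERE slug` query is a hypothetical example, not code from this PR):

```python
# Column names: hard-coded constant, safe to interpolate.
PROJECTS_KEY_ORDER = ['project_id', 'name', 'slug', 'description',
                      'photo_url', 'past_project']
GET_PROJECTS_QUERY = 'SELECT %s FROM projects ORDER BY display_order;' % \
    ', '.join(PROJECTS_KEY_ORDER)

# Values: never interpolated. With psycopg2 a user-supplied slug would
# go through a bound parameter, e.g. (hypothetical):
#   cur.execute('SELECT name FROM projects WHERE slug = %s;', (slug,))
```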
You know, despite those suggestions making the code DRYer, I think they complicate the code. Here's why:
We have a couple of uses of these cases. 2 of each:
In addition, the attr array --> attr dict step is different in both cases:

- `for project_raw in projects:` sorts projects into a dict by slug keys
- `for leader_raw in leaders_raw:` puts leader dicts into an array, ordered by `display_order`
So, it's hard to DRY the attr array --> attr dict function because they're really two functions.
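To make the two-functions point concrete, here's a toy version of each shape (key orders trimmed and sample rows invented for illustration):

```python
PROJECTS_KEY_ORDER = ['project_id', 'name', 'slug']
LEADERS_KEY_ORDER = ['leader_id', 'name']

projects = [(1, 'Ultralight', 'ultralight'), (2, 'ODI', 'odi')]
leaders_raw = [(10, 'First Leader'), (11, 'Second Leader')]

# Shape 1: projects land in a dict keyed by slug.
projects_by_slug = {}
for project_raw in projects:
    project = dict(zip(PROJECTS_KEY_ORDER, project_raw))
    projects_by_slug[project['slug']] = project

# Shape 2: leaders land in a plain list; the ordering by
# display_order already happened in the SQL.
leaders = [dict(zip(LEADERS_KEY_ORDER, leader_raw))
           for leader_raw in leaders_raw]
```

Same zip-into-a-dict core, but different containers and keys on the way out.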
In addition, the single attr container array --> single attr array is only two lines, and there are only two of them. And my personal guideline for DRY is:
Whatchu think?
Ima keep plugging at this. I think there are breakthroughs to be made.
> Normally I'm in favor of caching but I don't think we'll actually run into enough speed issues to have to cache.

I agree; right now I think we are caching, though, and I want to nuke it! We "load". I think we should query the db each time! Otherwise, we're unintentionally caching. Whoops.
> Is that still in the scope of this PR?

Nope, nvm
I'm down for killing caching. I like querying each time.
Okay, #161 takes care of caching
Our green merge button is back! Also, you have to pull. Thanks.
After #158 and #163 I consider this ready to go. Good, good work! Thanks for getting it started. :beers:
here goes nothing. :key:
Hey @tylrtrmbl! This is halfway there.
Things we need to do:
config.py