hupili closed this pull request 10 years ago
@hupili Do you want to merge these? Feel free to go ahead and do so on master if it is ready to go, no need to issue a pull request.
A PR is a call for code review: not only for quality per se, but also so that everyone knows where we are. It's just a habit from previous practice, though we can omit it.
For this particular case, it's better to have someone else try running `data_preparation.py`, because it's a major refactoring.
Ok I'll take a look later tonight.
Agreed we should use pulls once we get stable. (Email reply to HU, Pili's comment of Jan 30, 2014, 3:11 PM: https://github.com/hxu/hk_census_explorer/pull/14#issuecomment-33665005)
@hupili at what point is the spreadsheet for raw to canonical name being pulled in? Are you planning on using that spreadsheet to update `translation_fix.py` and `table_meta_data.py` once it is finalized?
I've also reviewed the spreadsheet. Thanks @clacanzo for your help with that. Most of my changes were formatting (there were some characters that looked like spaces but weren't actually spaces).
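For anyone who wants to catch those look-alike spaces programmatically, here is a minimal sketch. It assumes the spreadsheet cells are available as Python strings; `normalize_spaces` is a hypothetical helper, not something that exists in this repo. Note that it drops every Unicode format character (category `Cf`), which may be too aggressive for some scripts.

```python
import unicodedata


def normalize_spaces(text):
    """Replace Unicode space look-alikes with ASCII spaces and collapse runs.

    Hypothetical helper for illustration only.
    """
    cleaned = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Zs" or ch in "\t\r\n":
            # "Zs" covers space separators such as U+00A0 (no-break space)
            # and U+3000 (ideographic space).
            cleaned.append(" ")
        elif category == "Cf":
            # Invisible format characters such as U+200B (zero-width space)
            # are dropped entirely.
            continue
        else:
            cleaned.append(ch)
    # split()/join() collapses repeated spaces and trims both ends.
    return " ".join("".join(cleaned).split())


example = normalize_spaces("Male\u00a0 population\u3000total")
```

Running the raw names through something like this before comparing them against the canonical names should make the invisible differences disappear.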
Some notes on the style I tried to implement:
I also added a column F and wrote "REMOVE ROW" if the row was an empty row that should not be included in the final results, usually caused by line breaks.
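A rough sketch of how the "REMOVE ROW" flag could be honored when the sheet is read in. It assumes the spreadsheet is exported as CSV and that the flag sits in column F (index 5); the sample rows and the name `drop_flagged_rows` are made up for illustration.

```python
import csv
import io

REMOVE_MARKER = "REMOVE ROW"
FLAG_COLUMN = 5  # column F, zero-based (assumption about the CSV export)


def drop_flagged_rows(rows):
    """Return only the rows whose column F is not marked for removal."""
    kept = []
    for row in rows:
        # Short rows simply have no flag.
        flag = row[FLAG_COLUMN].strip() if len(row) > FLAG_COLUMN else ""
        if flag.upper() == REMOVE_MARKER:
            continue
        kept.append(row)
    return kept


# Made-up sample data standing in for the exported spreadsheet.
sample = io.StringIO(
    "raw,canonical,c,d,e,flag\n"
    "Male,male,,,,\n"
    ",,,,,REMOVE ROW\n"
    "Female,female\n"
)
rows = list(csv.reader(sample))
cleaned = drop_flagged_rows(rows)
```

Keeping the marker in the sheet and filtering at load time means the empty rows stay visible to reviewers but never reach the final results.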
Still some items on the mappings that I am unsure of, so would be good to get another pair of eyes on them too, if anyone wants to review again.
@hupili back to you?
I added a sheet to the spreadsheet that lists special aggregate cells that I found, and estimates the number of data points that we should have in the final dataset.
Immediate problems are fixed. Longer term problems are redirected. Merge to master as extractor baseline
:+1:
This addresses multiple correlated issues. I'll ping back in followup comments. A major refactoring was done on this branch.
`data_preparation.py` is the current entry point. Take a look at the intermediate and final output: http://hupili.net/projects/hk_census/data/ I suppose this is what we need, except for some details. Two come to mind; I suggest following up on them in separate issues.