Closed nikisix closed 6 years ago
Thanks @nikisix ! A few comments / questions
Review status: 0 of 6 files reviewed at latest revision, 4 unresolved discussions.
doppelganger/allocation.py, line 146 at r1 (raw file):
allocation_inputs = [inputs.NUM_PEOPLE, inputs.NUM_VEHICLES] # Hard-coded for now # Prepend column name to bin name to prevent bin collision hh_columns = list()
hh_columns = []
is more pythonic
doppelganger/allocation.py, line 147 at r1 (raw file):
# Prepend column name to bin name to prevent bin collision hh_columns = list() for i in allocation_inputs:
i
usually means an integer index. If it's not an integer it should have a different name that explains what it is
doppelganger/allocation.py, line 196 at r1 (raw file):
def _format_data(households_data, persons_data): hh_size = pandas.get_dummies(households_data[inputs.NUM_PEOPLE.name]) # Prepend column name to bin name to prevent bin collision
Can you create a helper for this so that you don't have to have the same code here and above?
doppelganger/inputs.py, line 86 at r1 (raw file):
def num_people_discrete(num_people): if is_blank(num_people) or int(num_people) == 0: # match acs (can't have 0-person household) return '0' # UNKNOWN causing problems in the bayes-net segmenter
Can you explain what problems this causes?
Comments from Reviewable
Review status: 0 of 6 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.
doppelganger/allocation.py, line 146 at r1 (raw file):
`hh_columns = []` is more pythonic
Done.
doppelganger/allocation.py, line 147 at r1 (raw file):
`i` usually means an integer index. If it's not an integer it should have a different name that explains what it is
It's true.
if you_like(a_input): Done.
doppelganger/allocation.py, line 196 at r1 (raw file):
Can you create a helper for this so that you don't have to have the same code here and above?
Done.
doppelganger/inputs.py, line 86 at r1 (raw file):
Can you explain what problems this causes?
The bin-values of the variables we're segmenting on become json keys and are run through a sorter before being written out. The None type throws a sorting error.
Comments from Reviewable
Maybe we can give coveralls another shot today so I can figure out what needs tests around it.
Review status: 0 of 7 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.
Comments from Reviewable
Review status: 0 of 7 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.
doppelganger/allocation.py, line 148 at r3 (raw file):
hh_columns = [] for a_input in allocation_inputs: hh_columns += ['_'.join([a_input.name, val]) for val in list(a_input.possible_values)]
can you use _str_broadcast
here too?
doppelganger/inputs.py, line 86 at r1 (raw file):
The bin-values of the variables we're segmenting on become json keys and are run through a sorter before being written out. The None type throws a sorting error.
So the problem is in BayesianNetworkModel.to_json
? Can you please file an issue with an exception trace?
test/test_allocation.py, line 32 at r3 (raw file):
households.to_csv.assert_called_once_with('households_file') def test__str_broadcast(self):
Thanks for adding a test! We generally don't test private methods directly. Can you rework this test to use the public-facing API?
Comments from Reviewable
Review status: 0 of 7 files reviewed at latest revision, 6 unresolved discussions.
test/test_allocation.py, line 32 at r3 (raw file):
Thanks for adding a test! We generally don't test private methods directly. Can you rework this test to use the public-facing API?
Wish I would've seen this. Yea I'll switch it up.
Comments from Reviewable
Review status: 0 of 7 files reviewed at latest revision, 6 unresolved discussions.
doppelganger/allocation.py, line 148 at r3 (raw file):
can you use `_str_broadcast` here too?
Lol yea, meant to do that.
Comments from Reviewable
Review status: 0 of 7 files reviewed at latest revision, 6 unresolved discussions, some commit checks failed.
doppelganger/inputs.py, line 86 at r1 (raw file):
So the problem is in `BayesianNetworkModel.to_json`? Can you please file an issue with an exception trace?
Filed. https://github.com/sidewalklabs/doppelganger/issues/42
Comments from Reviewable
Great, good to go once the test is fixed but I have a question below
Review status: 0 of 7 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.
doppelganger/inputs.py, line 86 at r1 (raw file):
Filed. https://github.com/sidewalklabs/doppelganger/issues/42
Thank you!
doppelganger/inputs.py, line 86 at r5 (raw file):
def num_people_discrete(num_people): if is_blank(num_people): return UNKNOWN
Do you need to change this for it to work? If it needs to be the other way for now you can leave it like that but add a link to the issue with a TODO
Comments from Reviewable
Review status: 0 of 7 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.
doppelganger/inputs.py, line 86 at r5 (raw file):
Do you need to change this for it to work? If it needs to be the other way for now you can leave it like that but add a link to the issue with a TODO
I reverted it for now, so it won't break on a puma that contains null household numbers. You'll see it with the test push coming up.
Comments from Reviewable
Just one suggestion for future
Review status: 0 of 7 files reviewed at latest revision, 2 unresolved discussions.
doppelganger/inputs.py, line 86 at r5 (raw file):
I reverted it for now, so it won't break on a puma that contains null household numbers. You'll see it with the test push coming up.
Can you link to the issue with and add a TODO to fix?
test/test_allocation.py, line 49 at r7 (raw file):
def _mock_tract_data(self): return pandas.DataFrame([
In the future, we should try to make our test cases the minimal they need to be for the case we're testing. This looks a bit large and we probably could have done several smaller test cases for different edge cases instead
Comments from Reviewable
Review status: 0 of 7 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.
test/test_allocation.py, line 49 at r7 (raw file):
In the future, we should try to make our test cases the minimal they need to be for the case we're testing. This looks a bit large and we probably could have done several smaller test cases for different edge cases instead
Definitely, the tricky part here is that most of the major machinery in allocation.py are private methods (_allocate_households and _format_data), and our goal was to test these methods using the public api if I recall.
Comments from Reviewable
Use the number of vehicles a household has as an additional marginal control to the solver for increased accuracy around transport variables.
This change is![Reviewable](https://reviewable.io/review_button.svg)