Closed cvanoli closed 5 years ago
Great! Looks to me like this should work. I'm tagging @federicofernandez and @mxndrwgrdnr for more feedback.
Here are some things we should take care of before merging this:
git checkout master
git pull
git checkout data_type_conversion_for_net
git merge master
git push
We should add a unit test confirming it's possible to calculate an aggregation from integer data now. It could go right after this one: test_pandana.py#L84.
Update docstrings to explain that both ints and floats are allowed, but data will be cast to float before the aggregations are calculated: network.py#L198.
Very good suggestions.
On Mon, Apr 15, 2019 at 9:36 PM Sam Maurer notifications@github.com wrote:
Great! Looks to me like this should work. I'm tagging @federicofernandez https://github.com/federicofernandez and @mxndrwgrdnr https://github.com/mxndrwgrdnr for more feedback.
Here are some things we should take care of before merging this:
- To get the current tests passing, we'll have to merge the change from PR #102 https://github.com/UDST/pandana/pull/102 into this branch. This should do it:
git checkout master git pull git checkout data_type_conversion_for_net git merge master git push
1.
We should add a unit test confirming it's possible to calculate an aggregation from integer data now. It could go right after this one: test_pandana.py#L84 https://github.com/UDST/pandana/blob/master/pandana/tests/test_pandana.py#L84 . 2.
Update docstrings to explain that both ints and floats are allowed, but data will be cast to float before the aggregations are calculated: network.py#L198 https://github.com/UDST/pandana/blob/master/pandana/network.py#L198.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/UDST/pandana/pull/108#issuecomment-483467667, or mute the thread https://github.com/notifications/unsubscribe-auth/ABx17fQjmAIHhELazJaq5EMRIT_x7co6ks5vhRr6gaJpZM4cwtws .
@cvanoli Now that the fix from the master branch is merged in, the tests are getting further -- but it turns out that 'float32' doesn't work as the dtype. It looks like the C++ code wants doubles.
Earlier in the same file when the network is initialized, the edge weights are converted using .astype('double')
, so that will probably work for the node values too. network.py#L81-L87
Not sure why the values didn't need to be cast previously -- probably the Pandas default dtype for floats is something that maps automatically to doubles, but 'float32' doesn't.
https://www.numpy.org/devdocs/user/basics.types.html
Also, here's another place in the docstrings where we should specify that ints are now acceptable: network.py#L38
See https://github.com/UDST/pandana/issues/105#issuecomment-484227344 for more perspective on this.
Ok, final summary of what's included here:
This PR automatically casts node values to floats (technically, float64/double), in order to support aggregations of integer values. Edge weights are already cast this way.
This restores behavior from Pandana v0.3. I've adjusted the docstrings to officially support node values and edge weights with any numeric type.
Existing tests are passing now. I'm going to merge this after adding a test where node values are non-float.
Related to #105 the
df[name]
column passed toinitialize_access_var
is forced to be float type because currently, Pandana can only perform an accessibility calculation on a network using a table where the variable being aggregated is type Float.