UDST / pandana

Pandas Network Analysis by UrbanSim: fast accessibility metrics and shortest paths, using contraction hierarchies :world_map:
http://udst.github.io/pandana
GNU Affero General Public License v3.0
387 stars 83 forks source link

Fixes for #106 #116

Closed alephcero closed 5 years ago

alephcero commented 5 years ago

Fixes #106 https://github.com/UDST/pandana/issues/106

coveralls commented 5 years ago

Coverage Status

Coverage increased (+0.6%) to 91.667% when pulling 73d9d5ce0c35c00b509aeb1660392362c480fd89 on is_106 into 961a7ef8d3b0144b190cb60bbd61845fca6fb314 on master.

smmaurer commented 5 years ago

As i'm reviewing, i realized there's one code migration case that's kind of tricky to handle. @sablanchard @alephcero

In Pandana 0.3, we had: set_pois(str category, pd.Series x_col, pd.Series y_col)

In Pandana 0.4, we have: set_pois(str category, int maxdist, int maxitems, pd.Series x_col, pd.Series y_col)

Apologies for my ad-hoc type annotations, but you get the idea. As of this PR, maxdist and maxitems are taken automatically from init_pois() to allow old code to continue running in both versions. But if the old code is passing values by position without including the parameter name, Pandana 0.4 will still crash at set_pois() when it tries to use a series as an integer. Two options:

  1. We can check if maxdist is a pd.Series and if so, use that value for x_col instead. And same for maxitems and y_col. All parameters are required, so we don't need to worry about anything being missing.

  2. Or we can check if maxdist is a pd.Series and if so, raise an error telling people to update their code. The update would be easy and cross-compatible because all they need to do is pass the arguments by name.

I guess i'd lean toward option 1 because it's so seamless for users. It seems kind of risky to switch parameters around like that, but i can't think of any actual problems it could cause! What do you guys think?

federicofernandez commented 5 years ago
  1. We can check if maxdist is a pd.Series and if so, use that value for x_col instead. And same for maxitems and y_col. All parameters are required, so we don't need to worry about anything being missing.
  2. Or we can check if maxdist is a pd.Series and if so, raise an error telling people to update their code. The update would be easy and cross-compatible because all they need to do is pass the arguments by name.

I guess i'd lean toward option 1 because it's so seamless for users. It seems kind of risky to switch parameters around like that, but i can't think of any actual problems it could cause! What do you guys think?

What about doing (1) but also showing a deprecation warning?

smmaurer commented 5 years ago

What about doing (1) but also showing a deprecation warning?

We'll raise a deprecation warning when init_pois runs, so it will show up either way.

DeprecationWarning: Method init_pois() is no longer needed in Pandana 0.4+ and will be removed in a future version; maxdist and maxitems should now be passed to set_pois()

federicofernandez commented 5 years ago

What about doing (1) but also showing a deprecation warning?

We'll raise a deprecation warning when init_pois runs, so it will show up either way.

DeprecationWarning: Method init_pois() is no longer needed in Pandana 0.4+ and will be removed in a future version; maxdist and maxitems should now be passed to set_pois()

I was referring to explicitly say to the user that we are re-interpreting the parameters, that is the scary part from my point of view. I know that it'll work but if then a user looks the documentation and the code might get confused. But of course, we can also very explicit about this in the documentation too.