Open knaaptime opened 4 years ago
Hi @knaaptime, thanks for the issue. Yes, a full notebook that reproduces the whole workflow with the data you are using would be helpful, so we can try to replicate the issue.
Hi @knaaptime, could you share a link to the GTFS zip file? I think we need that to run the notebook.
Your diagnosis sounds right to me. I guess the next step is to confirm that Pandana can create the network if the floats are cast to ints, and then track down why this is happening on the UrbanAccess side.
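A minimal sketch of the "cast the floats to ints" check suggested above, using only pandas. The helper name `cast_ids_to_int` and the toy edges table are hypothetical and not part of UrbanAccess or Pandana; the column names `from_int` and `to_int` are taken from this thread. The cast is refused if any id is missing, since a NaN cannot be converted to an integer.

```python
import pandas as pd


def cast_ids_to_int(edges, cols=("from_int", "to_int")):
    """Return a copy of `edges` with the id columns cast to int64.

    Hypothetical helper: raises ValueError if any id is missing,
    because NaN cannot be represented in an int64 column.
    """
    out = edges.copy()
    for col in cols:
        n_missing = int(out[col].isna().sum())
        if n_missing:
            raise ValueError(f"{col} has {n_missing} missing value(s); cannot cast to int")
        out[col] = out[col].astype("int64")
    return out


# Toy edges table with float ids but no missing values (illustrative data only)
edges = pd.DataFrame(
    {"from_int": [1.0, 2.0], "to_int": [2.0, 3.0], "weight": [1.5, 2.5]}
)
fixed = cast_ids_to_int(edges)
print(fixed.dtypes)
```

If the helper raises, the floats are a symptom of missing ids rather than a formatting problem, and casting alone will not be enough.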
oops, sorry about that. links updated
ok, so the issue was that i was using pandana's Network.from_hdf5 method to read from an existing file, then passing network.nodes_df to the urbanaccess osm network creator. The problem was that id was still set as the nodes_df index rather than being available as a column on the dataframe, generating nan's during the call to integrate_network and upcasting to/from_int to floats.
I can go ahead and close this, though it was tough to track this down, so I could do a PR to add an explicit check for the required cols or something?
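To illustrate the upcasting described above, here is a generic pandas sketch. It is not the UrbanAccess code path; the tables and the `id_int` column are made up for illustration. It only shows that when an id lookup fails to match, the resulting NaN turns an integer column into floats.

```python
import pandas as pd

# Illustrative edges referencing node ids 101, 102 and 103
edges = pd.DataFrame({"from": [101, 102], "to": [102, 103]})

# Illustrative lookup from node id to a new integer id; node 103 has no entry
lookup = pd.DataFrame({"id": [101, 102], "id_int": [1, 2]})

# A left join keeps every edge; edges with no match get NaN in id_int
merged = edges.merge(lookup, left_on="to", right_on="id", how="left")
to_int = merged["id_int"]

print(to_int.dtype)         # float64, because NaN cannot live in an int64 column
print(to_int.isna().sum())  # number of edges whose 'to' node was not found
```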
maybe i spoke too soon. I think this line needs to have a .reset_index(). i think id needs to be both the index and a column on the nodes_df
I wish this stuff was better documented in Pandana. I might have a helpful code example, though.
Last week I made a new Pandana demo notebook, and in Section 1 there's an example of what a typical network.nodes_df looks like, and then how to build a new network directly from its columns. Maybe we can compare this to what's happening in UrbanAccess.
https://github.com/UDST/pandana/blob/master/examples/Pandana-demo.ipynb
Hi, finally had a chance to look at this more closely.
To summarize what's happening, running ua.integrate_network() with these data files appears to work but actually creates an edges table where from_int and to_int are incorrect and sometimes missing. The missing values mean they end up as floats rather than ints, which causes a Pandana error when trying to load the integrated network.
I think we need @sablanchard's eyes on this. @knaaptime reports that it might have to do with the id format in this OSM data -- but comparing that data to the Pandana demo material, it looks completely standard.
@sablanchard, here's a single zip file with everything you need (notebook, which i've updated a bit, plus the data files). Environment info is at the top of the notebook. urbanaccess-issue.zip
I'm thinking we should move ahead with the release as-is, rather than waiting for a fix here. It will be no problem to put out subsequent updates.
thanks for looking into this. I think it comes down to the way that integrate_network expects the input dfs to be formatted. I can fix the issue by inserting this line before cell 9 in the linked gist:
osm_network.nodes_df['id'] = osm_network.nodes_df.index
it's not enough to reset the index (thus making the 'id' column available on the df). Instead both the index and id variable need to be identical and formatted as ints. osmnet returns data in this format, but if you have a network you've already used with pandana sitting around (as in my case) you may have the index but not the column. I think the easiest way to ensure this would be to check for the necessary columns/index on the input to integrate_network, and I could add that check if it's of interest
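A rough sketch of what such a check might look like, assuming the requirement is exactly as described above (an integer 'id' column that matches an integer index). The function name `check_nodes_id` is hypothetical and does not exist in UrbanAccess; the node table is toy data.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype


def check_nodes_id(nodes_df, id_col="id"):
    """Hypothetical pre-check: the id must be both the index and a column, as ints."""
    if id_col not in nodes_df.columns:
        raise ValueError(f"nodes table has no '{id_col}' column")
    if not is_integer_dtype(nodes_df[id_col].dtype):
        raise ValueError(f"'{id_col}' column must be an integer dtype")
    if not is_integer_dtype(nodes_df.index.dtype):
        raise ValueError("nodes index must be an integer dtype")
    if not (nodes_df.index.to_numpy() == nodes_df[id_col].to_numpy()).all():
        raise ValueError(f"nodes index and '{id_col}' column do not match")


# Toy node table with the id only in the index, as when reusing an existing network
index_only = pd.DataFrame(
    {"x": [-122.27, -122.26], "y": [37.80, 37.81]},
    index=pd.Index([101, 102], name="id"),
)

# Same table after the one-line fix from this thread
with_col = index_only.copy()
with_col["id"] = with_col.index

check_nodes_id(with_col)  # passes silently
```

Calling `check_nodes_id(index_only)` raises a ValueError naming the missing column, which is the kind of early, explicit failure that would have made this easier to track down.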
btw, thanks for the recent dev pushes and the new example notebook. The new shortest path stuff is fantastic and makes it really easy to 1) create a pysal spatial weights object based on network distance and 2) integrate the udst stack with pysal's new access module (which goes a long way toward addressing this and this). I have some new demos that are just about ready to share so i'll ping you when i post them
I'm unable to create a pandana network object after processing OSM and GTFS data with urbanaccess. I'm using this osm network and this gtfs feed. If I use either of those data sources directly, I can successfully instantiate a pandana.Network. Once I try to create a multimodal network with `integrate_network`, it completes successfully:
however, if I try to create a pdna.Network from the integrated data, I get
looking closer at the ua_network.net_edges object, I can see that the two columns to_int and from_int are actually floats, though looking at the code I can't see why that would be the case. I'm guessing it's the underlying reason I can't build a network, since the docs seem to indicate pandana needs integers in the from/to cols (though it also seems to work ok with strings if I try and build a network exclusively from the GTFS data), but was curious if you had any insight. I could post the whole notebook if it's useful
as_matrix issue)