kuanb / peartree

peartree: A library for converting transit data into a directed graph for sketch network analysis.
MIT License
201 stars 23 forks source link

For double subsetting, use correct dataframe #170

Closed kuanb closed 3 years ago

kuanb commented 3 years ago

Issue w/ modification from PR #166: Retrieving a subset of dataframe "A" and then applying a mask based off of the a different filter on "A" to the original output "B" risks error if masks from "A" does not line up with "B".

Solution, perform full copy from subset and then filter on that subset to create a second mask. cc @bryanculbertson

bryanculbertson commented 3 years ago

Does this mean in the context of #166 to use a copy() when creating stop_times_by_stop ?

kuanb commented 3 years ago

@bryanculbertson whoops - you're modification actually addressed this correctly. I was looking at an older version of the code in error when I just sat down to review the changes. I think we're good to go here.

It was, in fact, the original code that failed to safely handle the double subset case if/when there was a GTFS feed with a case where the 2nd mask operation had a different length/index than the first:

This one:

constraint_2 = (trips_and_stop_times.direction_id == direction)

Nonetheless, I will still use https://github.com/kuanb/peartree/pull/171 to add a test to just make sure I'm clearly checking this case. Sorry for the noise.