remix / partridge

A fast, forgiving GTFS reader built on pandas DataFrames
https://partridge.readthedocs.io
MIT License

Busiest day of the service is not exhaustive #71

Closed praneethd7 closed 2 years ago

praneethd7 commented 2 years ago

What I Did

Example:

import partridge as ptg
ptg.read_busiest_date('gtfs_Portland_2022_feb1.zip')

Output: (datetime.date(2022, 1, 24), frozenset({'A.613', 'D.613', 'Q.613', 'W.613'}))

invisiblefunnel commented 2 years ago

> The following service_ids are missing from the output: [B.613, C.613, F.613, E.613, U.613, S.613]

Thanks for sharing your findings @praneethd7. I took a look at the linked GTFS file and it matches the results from partridge.

% curl -L -o trimet.zip https://transitfeeds.com/p/trimet/43/20220201/download
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 18.1M  100 18.1M    0     0  16.6M      0  0:00:01  0:00:01 --:--:-- 31.9M
% unzip trimet.zip calendar.txt calendar_dates.txt 
Archive:  trimet.zip
  inflating: calendar.txt            
  inflating: calendar_dates.txt  
% cat calendar_dates.txt | grep 20220124
A.613,20220124,1
D.613,20220124,1
W.613,20220124,1
Q.613,20220124,1
% cat calendar.txt 
service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
B.613,0,0,0,0,0,0,0,20220123,20220205
B.614,0,0,0,0,0,0,0,20220206,20220514
C.613,0,0,0,0,0,0,0,20220123,20220205
C.614,0,0,0,0,0,0,0,20220206,20220514
A.613,0,0,0,0,0,0,0,20220123,20220205
A.614,0,0,0,0,0,0,0,20220206,20220514
F.613,0,0,0,0,0,0,0,20220123,20220205
F.614,0,0,0,0,0,0,0,20220206,20220514
E.613,0,0,0,0,0,0,0,20220123,20220205
E.614,0,0,0,0,0,0,0,20220206,20220514
D.613,0,0,0,0,0,0,0,20220123,20220205
D.614,0,0,0,0,0,0,0,20220206,20220514
W.613,0,0,0,0,0,0,0,20220123,20220205
W.614,0,0,0,0,0,0,0,20220206,20220514
Q.613,0,0,0,0,0,0,0,20220123,20220205
Q.614,0,0,0,0,0,0,0,20220206,20220514
U.613,0,0,0,0,0,0,0,20220123,20220205
U.614,0,0,0,0,0,0,0,20220206,20220514
S.613,0,0,0,0,0,0,0,20220123,20220205
S.614,0,0,0,0,0,0,0,20220206,20220514

> The missing service_ids include both Light Rail and Bus route_types. For example, B.613 (Light Rail) includes the route 'MAX Red Line', which operates Monday-Friday and on weekends. This is perhaps the busiest line, as it connects to Portland Airport. Also, U.613 (Bus) consists of 48 routes. All the routes in this service_id can be seen [here](http://gtfs.transitq.com/TriMet_20220201_20220201/serviceids/U.613).

Take a look at http://gtfs.transitq.com/TriMet_20220201_20220201/serviceids/A.613 and this gist showing that on 20220124 trips for the MAX Red Line are covered by service_id A.613.
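For readers following along, here is a minimal sketch of how the two files combine under the GTFS rules: a service_id is active on a date if calendar.txt says so (date in range and the weekday flag is 1), after which calendar_dates.txt adds (exception_type 1) or removes (exception_type 2) service for specific dates. The function name `active_service_ids` is hypothetical, the rows are only an excerpt of the feed above, and the calendar_dates.txt header is assumed to be the standard GTFS one since the grep output does not show it. This is not partridge's implementation.

```python
import csv
import io
from datetime import date, datetime

# Excerpt of calendar.txt from the feed above: every weekday flag is 0.
CALENDAR_TXT = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
A.613,0,0,0,0,0,0,0,20220123,20220205
B.613,0,0,0,0,0,0,0,20220123,20220205
D.613,0,0,0,0,0,0,0,20220123,20220205
W.613,0,0,0,0,0,0,0,20220123,20220205
Q.613,0,0,0,0,0,0,0,20220123,20220205
U.613,0,0,0,0,0,0,0,20220123,20220205
"""

# The four rows matched by grep above; the header line is an assumption
# (standard GTFS column order).
CALENDAR_DATES_TXT = """service_id,date,exception_type
A.613,20220124,1
D.613,20220124,1
W.613,20220124,1
Q.613,20220124,1
"""

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def parse_gtfs_date(value):
    """Parse a GTFS YYYYMMDD string into a datetime.date."""
    return datetime.strptime(value, "%Y%m%d").date()


def active_service_ids(calendar_txt, calendar_dates_txt, day):
    """Return the service_ids active on `day` (hypothetical helper)."""
    active = set()
    weekday = WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0

    # Step 1: regular service from calendar.txt.
    for row in csv.DictReader(io.StringIO(calendar_txt)):
        start = parse_gtfs_date(row["start_date"])
        end = parse_gtfs_date(row["end_date"])
        if start <= day <= end and row[weekday] == "1":
            active.add(row["service_id"])

    # Step 2: exceptions from calendar_dates.txt.
    for row in csv.DictReader(io.StringIO(calendar_dates_txt)):
        if parse_gtfs_date(row["date"]) != day:
            continue
        if row["exception_type"] == "1":    # service added on this date
            active.add(row["service_id"])
        elif row["exception_type"] == "2":  # service removed on this date
            active.discard(row["service_id"])

    return frozenset(active)


result = active_service_ids(CALENDAR_TXT, CALENDAR_DATES_TXT, date(2022, 1, 24))
print(sorted(result))
```

Because every weekday flag in calendar.txt is 0, step 1 contributes nothing for this feed, and only the service_ids explicitly added in calendar_dates.txt end up active on 20220124.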

praneethd7 commented 2 years ago

Thank you @invisiblefunnel for the quick response and the gist. I had incorrectly assumed that service_ids in calendar.txt must be operational on every day between start_date and end_date. I was also under the impression that no two service_ids share routes. After your response, I realized that although the service_ids [B.613, C.613, F.613, E.613, U.613, S.613] are not active on 24 January 2022, their routes are covered by ['A.613', 'D.613', 'Q.613', 'W.613'] (as reported by partridge). However, I am still missing how the busiest day is actually reported. Is it the day with the maximum number of service_id in the output of _service_ids_by_date()?

invisiblefunnel commented 2 years ago

> However, I am still missing how the busiest day is actually reported. Is it the day with the maximum number of service_id in the output of _service_ids_by_date()?

Partridge uses the number of trips to approximate busyness. The earliest date is returned if multiple dates have the same number of trips.

https://github.com/remix/partridge/blob/df3167ea65742f4d1f4b6a15bbe41b01367fad99/partridge/readers.py#L57-L60

https://github.com/remix/partridge/blob/df3167ea65742f4d1f4b6a15bbe41b01367fad99/partridge/readers.py#L117-L128

https://github.com/remix/partridge/blob/df3167ea65742f4d1f4b6a15bbe41b01367fad99/partridge/readers.py#L222-L229
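The rule described above (most trips wins, earliest date breaks ties) can be sketched roughly as follows. This is an illustration only, not the code behind the links: the function name `pick_busiest_date`, the service_ids, and the trip counts are all made up for the example.

```python
from datetime import date


def pick_busiest_date(trip_counts_by_service, service_ids_by_date):
    """Return (date, service_ids) for the date with the most trips.

    Hypothetical helper: ties are broken by choosing the earliest date.
    """
    # Total trips per date = sum of trips over the services active that day.
    totals = {
        day: sum(trip_counts_by_service.get(sid, 0) for sid in service_ids)
        for day, service_ids in service_ids_by_date.items()
    }
    # Sort key: highest trip count first, then earliest date.
    best = min(totals, key=lambda day: (-totals[day], day))
    return best, service_ids_by_date[best]


# Invented data: number of trips per service_id.
trip_counts_by_service = {"weekday": 100, "saturday": 50, "extra": 30}

# Invented data: which service_ids run on which date.
service_ids_by_date = {
    date(2022, 1, 25): frozenset({"weekday"}),            # 100 trips
    date(2022, 1, 24): frozenset({"weekday"}),            # 100 trips (tie)
    date(2022, 1, 29): frozenset({"saturday", "extra"}),  # 80 trips
    date(2022, 1, 30): frozenset({"saturday"}),           # 50 trips
}

busiest = pick_busiest_date(trip_counts_by_service, service_ids_by_date)
print(busiest)
```

In this made-up data, 2022-01-29 has the most service_ids but fewer trips, so it loses to the weekday dates, and the tie between 2022-01-24 and 2022-01-25 goes to the earlier one.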

praneethd7 commented 2 years ago

That makes sense! Thank you so much @invisiblefunnel!