dhersz closed this issue 3 years ago
Actually I think it's worth implementing this now. A good way to implement `crop_gtfs()` is to create two datasets, one filtered by `shape_id` and another by `stop_id`, and then merge them. But we need a way of making sure that the merged result doesn't have duplicated entries, and this is where `remove_duplicates()` comes in.
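To make the merge step concrete, here is a minimal sketch in plain Python. It is not the package's implementation. The toy `trips` and `stop_times` rows are invented for illustration, and the `remove_duplicates()` helper is a hypothetical stand-in that only borrows the name mentioned above. It shows why the merge needs a deduplication pass: a trip that matches both filters shows up twice after concatenation.

```python
def remove_duplicates(rows):
    """Drop rows identical to an earlier row, keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Invented toy tables (assumption: every value is a string).
trips = [
    {"trip_id": "t1", "shape_id": "s1"},
    {"trip_id": "t2", "shape_id": "s2"},
    {"trip_id": "t3", "shape_id": "s3"},
]
stop_times = [
    {"trip_id": "t1", "stop_id": "a"},
    {"trip_id": "t2", "stop_id": "b"},
    {"trip_id": "t3", "stop_id": "c"},
]

# Dataset 1: trips kept because of their shape_id.
by_shape = [t for t in trips if t["shape_id"] in {"s1", "s2"}]

# Dataset 2: trips kept because they serve one of the wanted stops.
wanted_trips = {st["trip_id"] for st in stop_times if st["stop_id"] in {"b", "c"}}
by_stop = [t for t in trips if t["trip_id"] in wanted_trips]

# Trip t2 matches both filters, so plain concatenation lists it twice.
merged = remove_duplicates(by_shape + by_stop)
```

Comparing whole rows is the simplest possible rule. A real implementation might instead deduplicate on each table's key columns.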
Many GTFS files have duplicated entries in them (for example, `spo_gtfs` has an `agency.txt` with 2 identical rows, and many other duplications spread throughout its tables). `prune_gtfs()` will basically remove duplicate entries from the file. It can be used in conjunction with the `filter_***` functions as well, calling it either at the beginning or the end of the filtering process.
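As a rough illustration of removing duplicates across a whole feed, the sketch below treats a feed as a dictionary mapping table names to lists of rows. This representation, the function name `drop_duplicate_rows()`, and the sample rows are all assumptions made for the example, not the package's actual data structures or API.

```python
def drop_duplicate_rows(feed):
    """Return (cleaned_feed, removed_counts) for a dict of table -> list of row dicts."""
    cleaned = {}
    removed = {}
    for name, rows in feed.items():
        # dict.fromkeys keeps the first occurrence of each fingerprint, in order.
        unique = list(dict.fromkeys(tuple(sorted(r.items())) for r in rows))
        cleaned[name] = [dict(r) for r in unique]
        removed[name] = len(rows) - len(unique)
    return cleaned, removed


# Invented sample feed: the agency table carries two identical rows.
feed = {
    "agency": [
        {"agency_id": "1", "agency_name": "Example Agency"},
        {"agency_id": "1", "agency_name": "Example Agency"},
    ],
    "stops": [
        {"stop_id": "a", "stop_name": "First"},
        {"stop_id": "b", "stop_name": "Second"},
        {"stop_id": "a", "stop_name": "First"},
    ],
}

cleaned, removed = drop_duplicate_rows(feed)
```

Because the function returns a new feed and leaves its input untouched, it can be applied before filtering, after filtering, or both.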