Closed: swangcs closed this issue 4 years ago
I have uploaded my scripts for trip segmentation:
filter_gtfs.py -> filter_gps.py -> plot_verify.py
siri.20121121.csv
@Ruixinhua @MrTornado24 please feel free to test on other bus lines and other days, and let me know how it goes.
I tested on bus lines "46", "145" and "15" and found a small bug on line "15": when small_groups are detected, dDistance is not calculated for each group. I fixed this bug in commits 521bf05 and 6149f01, and all the code now works fine on my computer.
@Ruixinhua Well spotted! Good job! On a related issue, in commit bfa0e8072de8d45284af18cb99f10ac4a0cb7f72 I reset the "dDistance" value of the first row to "0" when multiple trips occur in one group (split by time_threshhold). Before this reset, the "dDistance" of the first row kept its old value (usually large, as it spans a gap beyond the time threshold), which led to an inaccurate travelled-distance calculation. This travelled distance is then used as a filtering condition.
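For anyone testing other lines, the reset described above can be sketched roughly as follows. This is only a minimal illustration with pandas, not the code from the commit: the column names "timestamp" and "trip_id", the function names, and the default threshold of 600 seconds are all assumptions made for the example; only "dDistance" comes from the scripts.

```python
import pandas as pd


def reset_first_ddistance(df, time_threshold=600):
    """Split one group into trips at large time gaps and zero the
    first dDistance of every trip (hypothetical helper)."""
    df = df.sort_values("timestamp").reset_index(drop=True)
    # A new trip starts wherever the gap to the previous row
    # exceeds the threshold (assumed to be in seconds).
    new_trip = df["timestamp"].diff() > time_threshold
    df["trip_id"] = new_trip.cumsum()
    # The first row of a trip has no predecessor within that trip,
    # so its distance increment must not carry the old value.
    first_rows = ~df["trip_id"].duplicated()
    df.loc[first_rows, "dDistance"] = 0.0
    return df


def travelled_distance(df):
    """Total travelled distance per trip."""
    return df.groupby("trip_id")["dDistance"].sum()


# Made-up sample: the fourth row follows a long gap and still carries
# a large stale dDistance from the previous trip.
sample = pd.DataFrame({
    "timestamp": [0, 20, 40, 5000, 5020],
    "dDistance": [0.0, 100.0, 120.0, 9000.0, 80.0],
})
totals = travelled_distance(reset_first_ddistance(sample))
```

Without the reset, the second trip in this made-up sample would appear to have travelled 9080 m instead of 80 m, which is the kind of error that would distort any distance-based filter.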
@Ruixinhua @MrTornado24 please briefly check the latest commit up to 7a62c52f782f043c296471fe9d7934f1fa111384
@swangcs When converting the GPS points to meters, I found that there are too many stopping points around the start location, which increases the cumulative time and will affect the prediction result. filter_gps.py should determine where the bus really starts moving and remove the stopping points around the start position.
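One possible way to do the trimming suggested here, as a rough sketch only: drop the leading points that stay within a small radius of the first fix, keeping the last of them as the departure point. The function name, the (t, x, y) tuple layout in meters, and the 30 m radius are assumptions for illustration and are not taken from filter_gps.py.

```python
import math


def trim_leading_stops(points, min_move=30.0):
    """points: list of (t, x, y) with x, y in meters.
    Remove leading points that stay within min_move meters of the
    first fix, keeping the last such point as the departure."""
    if not points:
        return []
    _, x0, y0 = points[0]
    for i, (_, x, y) in enumerate(points):
        if math.hypot(x - x0, y - y0) > min_move:
            # Keep the point just before the bus leaves the radius.
            return points[max(i - 1, 0):]
    # The bus never left the start radius: keep only the last fix.
    return points[-1:]


# Made-up track: three fixes idling near the origin, then movement.
track = [(0, 0.0, 0.0), (20, 2.0, 1.0), (40, 1.0, 3.0),
         (60, 50.0, 0.0), (80, 120.0, 0.0)]
trimmed = trim_leading_stops(track)
```

With this made-up track, the first two idle fixes are dropped, so the cumulative time would start at t = 40 rather than t = 0.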
I tend to think this is normal. We only focus on arrival times at bus stops, so determining the exact departure time is not important.
Overall, data preprocessing looks fine now. One last comment, from @Ruixinhua's experiment: the epoch time of departure alone cannot uniquely identify a bus trip for a given bus line.
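A quick way to check for such collisions on other lines or days could look like the sketch below. It assumes each record carries a line id, a departure epoch and some vehicle identifier; that third field and all names here are hypothetical, and which extra field actually disambiguates trips in the SIRI data is not settled in this thread.

```python
from collections import defaultdict


def ambiguous_departures(records):
    """records: iterable of (line_id, departure_epoch, vehicle_id).
    Return the (line_id, departure_epoch) keys that are shared by
    more than one vehicle, with the vehicles involved."""
    vehicles = defaultdict(set)
    for line_id, departure_epoch, vehicle_id in records:
        vehicles[(line_id, departure_epoch)].add(vehicle_id)
    return {key: sorted(ids) for key, ids in vehicles.items()
            if len(ids) > 1}


# Made-up records: two vehicles on line "15" share a departure epoch.
records = [
    ("15", 1353481200, "A"),
    ("15", 1353481200, "B"),
    ("15", 1353481800, "A"),
]
clashes = ambiguous_departures(records)
```

An empty result would mean the departure epoch is sufficient as a trip key for that data; any entry shows where an additional field is needed.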
The branch is merged, so I close this issue.
Data pre-processing for one day of data on the no. 15 bus line has achieved good results so far. To improve the quality of your work before closing this part:
pull request to merge your branch data-preprocessing into our master branch
git pull to make your master branch up-to-date, then make a new branch to build your prediction models.