GradeIT has been documented to occasionally produce extremely high elevation/grade values when appending grade to a dataset (both filtered and unfiltered). This is most likely due to GPS errors, but these high grade/elevation values can have a negative effect on our fastsim runs and currently need to be filtered out of our training data.
One such dataset that has been affected by these grade values is the WM1 training data, which has required the application of grade filters. I recently ran a script to sweep through the map-matched WM1 dataset to check the number of points, links, and trips that have grade values above certain thresholds and found the following percentages:
The script calculates these percentages by taking all point-level WM1 data and first filtering out all the points that failed to map-match. Then it counts the points, links (specifically unique link passes, meaning that two occurrences of the same road_id on different trips are counted as two different link passes), and trips that contain grade values above each threshold (in this case, 0.10, 0.15, and 0.25 grade). After processing all of the WM1 days, the percentages are calculated from each of these counts (I will attach the CSV for those who are interested).
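For reference, here is a minimal sketch of that threshold sweep. The column names (`trip_id`, `road_id`, `grade`, `map_match_failed`) are assumptions and not the actual WM1 schema, and the days are assumed to be concatenated into a single DataFrame rather than counted day by day as in the real script.

```python
import pandas as pd

# Grade thresholds used in the sweep.
GRADE_THRESHOLDS = [0.10, 0.15, 0.25]

def grade_stats(points: pd.DataFrame) -> pd.DataFrame:
    """Percent of points, unique link passes, and trips with grade above each threshold.

    Assumes point-level columns trip_id, road_id, grade, and a boolean
    map_match_failed flag (hypothetical names).
    """
    # Drop points that failed to map-match.
    matched = points[~points["map_match_failed"]]

    # A link pass is a (trip_id, road_id) pair, so the same road_id seen on
    # two different trips counts as two separate passes.
    total_points = len(matched)
    total_link_passes = matched.groupby(["trip_id", "road_id"]).ngroups
    total_trips = matched["trip_id"].nunique()

    rows = []
    for threshold in GRADE_THRESHOLDS:
        high = matched[matched["grade"].abs() > threshold]
        rows.append({
            "threshold": threshold,
            "pct_points": 100 * len(high) / total_points,
            "pct_link_passes": 100 * high.groupby(["trip_id", "road_id"]).ngroups / total_link_passes,
            "pct_trips": 100 * high["trip_id"].nunique() / total_trips,
        })
    return pd.DataFrame(rows)
```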
After our meeting, we came to a few conclusions:
- Because of the computation/AU cost of reprocessing the training data upstream at the fastsim step, we have chosen to filter out links with high grade values during the routee training step (a rough sketch follows this list). Going forward, we should make sure grade filters are added to the fastsim pipeline for future runs.
- It might be best to fix the high grade/elevation values in GradeIT directly, which would eliminate the need for filters in our training data pipelines.
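As a rough illustration of the first point, the link-level filter could look something like the sketch below. The `grade` column name and the 0.15 cutoff are placeholders, not values we have settled on.

```python
import pandas as pd

def drop_high_grade_links(links: pd.DataFrame, max_abs_grade: float = 0.15) -> pd.DataFrame:
    """Drop link passes whose grade magnitude exceeds max_abs_grade.

    Intended to run on the link-level training data just before routee
    training; the column name and default cutoff are placeholders.
    """
    return links[links["grade"].abs() <= max_abs_grade].copy()
```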
I'm opening this issue for us to discuss possible solutions moving forward.
counts by day: wm1_grade_stats_by_day.csv