CityofToronto / bdit_king_pilot_dashboard

Dashboard for King St Pilot
GNU General Public License v3.0

Methodology Explainer #114

Open ReedRodgers opened 6 years ago

ReedRodgers commented 6 years ago

Write a high-level explanation of the data manipulation process to include in the internal dashboard.

radumas commented 6 years ago

For the presentation of it, see https://stackoverflow.com/questions/19064987/html-css-popup-div-on-text-click

ReedRodgers commented 6 years ago

@aharpalaniTO, @radumas, Any critiques?

image

Also available in full on Murmering Waters

q-schen commented 6 years ago

Maybe align the text blocks with each other? It would also be nice to have the X stand out a bit more. Is it possible to adjust the transparency of the box?

radumas commented 6 years ago

Explainer located in https://github.com/CityofToronto/bdit_king_pilot_dashboard/blob/data_pipeline/bluetooth/README.md and https://github.com/CityofToronto/bdit_king_pilot_dashboard/blob/data_pipeline/bluetooth/data_summary.md

radumas commented 6 years ago

The third type of plot was put together to analyze the impact of removing a date from a given baseline. This plot showed the new baseline overlaid on the old baseline to demonstrate the effect of removing the outlier. It was determined that removing dates with outliers from the baseline could have an impact on the quality of the data.

"Could have an impact" is an empty statement. Did we do anything following these graphs?

Finally, for each baseline with notable outliers, a scatter plot was produced for the weeks the outliers were found. The percentile band plots were shown for reference, now with the 100th percentile shown as x's, and the last band showing up to the 90th percentile.

Lastly, the baseline comparison graphs were plotted with the outlying dates removed from the new baseline. Each of these sets of figures was analyzed to see if the outlier's impact on the baseline was great enough to warrant its removal.

Are these two separate graphs?
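
For reference, the percentile bands in the plots quoted above could be computed along these lines. This is only a minimal sketch, not the pipeline's actual code: the function names, the `(time_bin, travel_time)` input shape, and the nearest-rank percentile definition are all assumptions made for illustration.

```python
from collections import defaultdict
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    if pct <= 0:
        return ordered[0]
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def percentile_bands(observations, lower=10, upper=90):
    """Summarise travel times per time bin.

    observations: iterable of (time_bin, travel_time) pairs (hypothetical shape).
    Returns {time_bin: {"lower", "median", "upper", "max"}}, where "max" is the
    100th percentile that the scatter version of the plot draws as x's.
    """
    by_bin = defaultdict(list)
    for time_bin, travel_time in observations:
        by_bin[time_bin].append(travel_time)
    return {
        time_bin: {
            "lower": percentile(times, lower),
            "median": percentile(times, 50),
            "upper": percentile(times, upper),
            "max": max(times),
        }
        for time_bin, times in sorted(by_bin.items())
    }
```

A point that sits far above the `upper` value for its time bin is the kind of outlier the scatter plots are meant to expose.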

Not sure what this following paragraph adds that couldn't be in the numbered list

When looking at the travel time scatterplot for Queen Street University to Yonge, a major change in travel times was noticed at midnight on Saturday, September 30th. The baseline for Saturday was examined using the percentile band plot, and it looked like the event significantly impacted the baseline, pulling it beyond the 10-90 percentile band, and forming a slight upwards trend where no such trend is reflected in the bulk of the data. Because of this, the original baseline was compared to a new baseline with September 30th and October 1st removed, and the new weekend baseline was significantly lower during early morning and midnight. Finally, it was learned that the event occurred during Nuit Blanche, and the Bluetooth readers likely picked up pedestrian phones as there were no cars on the street. Even though this didn't affect the data during peak hours, its impact on the baseline was so large it was excluded from the baseline data.
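
The old-versus-new baseline comparison described in the paragraph above can be sketched roughly as follows. The explainer does not say how the baseline is aggregated, so the per-bin median used here is an assumption, as are the function names and the `(date, time_bin, travel_time)` input shape.

```python
from collections import defaultdict
from statistics import median


def baseline(observations, excluded_dates=frozenset()):
    """Median travel time per time bin, skipping any excluded dates.

    observations: iterable of (date, time_bin, travel_time) tuples
    (hypothetical shape). The median is an assumed aggregate.
    """
    by_bin = defaultdict(list)
    for obs_date, time_bin, travel_time in observations:
        if obs_date not in excluded_dates:
            by_bin[time_bin].append(travel_time)
    return {time_bin: median(times) for time_bin, times in by_bin.items()}


def baseline_shift(observations, excluded_dates):
    """Change in each time bin's baseline after removing the given dates."""
    old = baseline(observations)
    new = baseline(observations, excluded_dates)
    # Bins that lose all their data after exclusion are left out.
    return {time_bin: new[time_bin] - old[time_bin]
            for time_bin in old if time_bin in new}
```

A large negative shift in the overnight bins, with little change at peak hours, would match the pattern described for the Nuit Blanche dates.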

Don't fully understand the relationship between this paragraph and the list above it

Many outliers are single points, which are likely due to pedestrian phones being picked up during low traffic during the nighttime hours. The exceptions to this rule are Nuit Blanche, the three large peaks on Adelaide, and the Sunday on Dufferin, which seemed to have an abnormal slowdown event ongoing at the time of the outlier.
