shankari opened 1 month ago
to clarify, when I talk about "new PR for reversion", I expect the following:

    git revert b9b0c347a633a1f44bf697b4cd50e720a279ac09; commit
    git revert SHA2; commit
    git revert SHA1; commit

- `master` has been rolled back to Sep 2, before @TeachMeTW started working on the dashboard
- the old `master` is now called `future`

Once that is done, `master` is considered the "baseline". We aim to have the baseline on staging and tested tonight. If good, we'll promote to the 3 prod environments we are testing on, and I will perform the first access time @ 8am MT tomorrow.
If any delay, we will push this back to 8am MT Wednesday.
I now see the float error again; we need to cherry-pick that change
ERROR:app_sidebar_collapsible:Exception on /_dash-update-component [POST]
Traceback (most recent call last):
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 2529, in wsgi_app
response = self.full_dispatch_request()
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/dash/dash.py", line 1310, in dispatch
ctx.run(
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/dash/_callback.py", line 442, in add_context
output_value = func(*func_args, **func_kwargs) # %% callback invoked %%
File "/usr/src/app/app_sidebar_collapsible.py", line 318, in update_store_trips
df, user_input_cols = query_confirmed_trips(start_date, end_date, timezone)
File "/usr/src/app/utils/db_utils.py", line 183, in query_confirmed_trips
df["data.primary_ble_sensed_mode"] = df.ble_sensed_summary.apply(get_max_mode_from_summary)
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/series.py", line 4771, in apply
return SeriesApply(self, func, convert_dtype, args, kwargs).apply()
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py", line 1123, in apply
return self.apply_standard()
File "/root/miniconda-23.5.2/envs/emission/lib/python3.9/site-packages/pandas/core/apply.py", line 1174, in apply_standard
mapped = lib.map_infer(
File "pandas/_libs/lib.pyx", line 2924, in pandas._libs.lib.map_infer
File "/usr/src/app/utils/db_utils.py", line 179, in <lambda>
get_max_mode_from_summary = lambda md: max(md["distance"], key=md["distance"].get) if len(md["distance"]) > 0 else "INVALID"
TypeError: 'float' object is not subscriptable
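The traceback shows that `ble_sensed_summary` can be NaN (a float) for rows without BLE data, which is what makes the lambda blow up. A minimal defensive sketch of the helper (hypothetical rewrite; the actual fix is in the change we need to cherry-pick):

```python
def get_max_mode_from_summary(md):
    # Rows with no BLE summary come through pandas as NaN (a float),
    # which is what triggers "'float' object is not subscriptable"
    if not isinstance(md, dict):
        return "INVALID"
    distances = md.get("distance", {})
    if len(distances) == 0:
        return "INVALID"
    # Otherwise return the mode with the largest sensed distance
    return max(distances, key=distances.get)
```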
Re-opening since this is not actually done.
While collecting data at 4pm MT (3pm PT) today, I ran into some blank tables, likely due to timeouts.
e.g.
Object { message: "Callback error updating tabs-content.children", html: "<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n<body>\r\n<center><h1>504 Gateway Time-out</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n" }
[dash_renderer.v2_15_0m1729614367.min.js:2:95440](https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js)
_o https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
ui https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
r https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
r https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
p https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
nt https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
zi https://openpath-stage.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
Not sure if we are recording that anywhere, or whether we even can, given that it is an error in the plotly framework.
Object { message: "Callback error updating card-active-users.children", html: "<html>\r\n<head><title>504 Gateway Time-out</title></head>\r\n<body>\r\n<center><h1>504 Gateway Time-out</h1></center>\r\n<hr><center>nginx</center>\r\n</body>\r\n</html>\r\n" }
[dash_renderer.v2_15_0m1729614367.min.js:2:95440](https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js)
_o https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
ui https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
r https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
r https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
p https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
nt https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
zi https://ccebikes-openpath.nrel.gov/admin/_dash-component-suites/dash/dash-renderer/build/dash_renderer.v2_15_0m1729614367.min.js:2
Unlike @shankari's findings, STM, Ebike, Smart Commute, and Stage all loaded fine for me.
Tested: 4 PM MT - 4:47 PM MT
Load order/speeds for [STM, Smart Commute, Stage, ccebikes]:
- Initial Load
- From sidebar click "Data"
- Click "Trips" tab
- Click "Demographics" tab
- Click "Trajectories" tab
- From sidebar click "Map"
- Select "Density Heatmap"
- Select "Unlabeled"
Everything functioned normally for me
As I reported on Teams this morning:
On ccebikes, UUIDs and Trajectories both timed out after several minutes and failed to display
This held true for the 12pm and 4pm accesses as well. No timeouts on smart commute or stm community.
@JGreenlee what about on stage? Are you testing that periodically as well?
I tested Stage just now. UUIDs eventually loaded. Trips, Trajectories, and Heatmap were empty – I don't think it's due to a timeout, I think it is due to a lack of data in the last week
Starting tomorrow I'll access Stage along with the others.
8 PM MT:
12 AM MT:
> On ccebikes, UUIDs and Trajectories both timed out after several minutes and failed to display
> This held true for the 12pm and 4pm accesses as well. No timeouts on smart commute or stm community.

> I tested Stage just now. UUIDs eventually loaded. Trips, Trajectories, and Heatmap were empty – I don't think it's due to a timeout, I think it is due to a lack of data in the last week
> Starting tomorrow I'll access Stage along with the others.
8am MT was consistent with my findings from yesterday.
The active users box is indeed missing on ccebikes; it was likely missing yesterday as well but I didn't notice.
I'm going to make sure my staging phones are on and connected to Wi-Fi today so that we have at least a couple trips on stage
to populate Trips, Trajectories and the map
Tested: 4 PM MT
@shankari I discussed the analysis with @JGreenlee; he suggested this workflow. What are your thoughts?
I recommend performing exploratory analysis + viz before removing outliers, so that you know the data you're working with and can see and identify the outliers.
You will have dozens of fine-grained measures for execution time of small tasks. I think you should try to extract a couple higher-level measures that are specifically related to the changes you made.
I suggest something like i) total time to load all figures on the overview and ii) time to load the UUIDs table
Then generate descriptives + charts for each high-level measure ("basic stuff" like mean execution time by week and dataset size).
You could generate these for fine-grained measures too, for exploratory purposes, but they probably won't make it into the paper.
If I remember stats, I think the main analysis you want to perform is probably a two-way repeated measures ANOVA. We are mostly interested in 2 IVs:
- The week (week 1 = baseline; week 2 = first batch of improvements; week 3 = both batches of improvements)
- Dataset size (small = stm community, medium = smart commute, large = ccebikes)

with the DV being execution time for the high-level tasks.
This will allow you to test for:
- Main effect of week (did performance change significantly as we rolled out improvements?)
- Main effect of dataset size (does performance significantly degrade with the size of the dataset?)
- Interaction effect (does performance change differently depending on the size of the dataset? Highlight this for scalability.)
I don't see what a correlation matrix would be useful for.
I don't see a need for modeling or ML unless (i) you think the time of day has a significant effect that you want to model or (ii) you want a regression model to predict how the system would scale to even larger programs (larger than ccebikes). But I suggest trying to keep your analysis fairly small-scale.
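For (i) and (ii), the high-level measures could be derived from the fine-grained `data.name,data.reading` rows by summing the relevant `total_time` entries. A sketch, with readings and name prefixes assumed from the CSV dumps later in this thread:

```python
import pandas as pd

# Hypothetical fine-grained readings in the data.name,data.reading shape
df = pd.DataFrame({
    "data.name": [
        "admin/home/generate_plot_trips_trend/total_time",
        "admin/home/generate_plot_sign_up_trend/total_time",
        "admin/data/render_content/handle_uuids_tab",
    ],
    "data.reading": [0.21, 0.23, 350.33],
})

# (i) total time to load the overview figures: sum the per-plot total_time rows
overview_total = df.loc[
    df["data.name"].str.match(r"admin/home/generate_plot_.*/total_time"),
    "data.reading",
].sum()

# (ii) time to load the UUIDs table: a single named reading
uuid_time = df.loc[
    df["data.name"] == "admin/data/render_content/handle_uuids_tab",
    "data.reading",
].iloc[0]
```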
I was a little late starting today because I forgot (I should put a reminder in my calendar). I started closer to 4:30pm MT.
@TeachMeTW @JGreenlee is your mentor, so I would take his advice 😄 Seriously, if there is something that JGreenlee is unclear about and needs my help, I am happy to step in. But otherwise, I trust Jack's judgement, and am happy to review results once they are ready and provide feedback that way.
BTW, I completely agree on "no ML"; EDA != ML. I would suggest starting with basic data viz (e.g. box plots) even before using "fancy" stats like ANOVA. Ideally, we could show the difference visually between weeks/datasets.
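A sketch of that "basic stuff first" step, assuming a long-format table of timing measurements (the column names here are illustrative, not the real schema):

```python
import pandas as pd

# Hypothetical long-format timing data: one row per timed access
df = pd.DataFrame({
    "week": ["week1", "week1", "week2", "week2", "week3", "week3"],
    "dataset_size": ["small", "large"] * 3,
    "load_time_s": [2.1, 21.4, 1.8, 15.2, 1.5, 9.8],
})

# Mean load time per week x dataset size; the same frame feeds
# df.boxplot(column="load_time_s", by="week") for box plots
summary = df.pivot_table(values="load_time_s", index="week",
                         columns="dataset_size", aggfunc="mean")
```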
8 PM MT:
Still no trajectories or trips on stage
Interesting. I saw one trip and its associated trajectory on stage (presumably from Jack's phone)
12 AM MT:
I saw my trip on staging yesterday and this morning. Behavior is still fairly consistent on my end, except this morning I noticed that Trajectories on ccebikes loaded successfully instead of timing out.
> Seriously, if there is something that JGreenlee is unclear about and needs my help, I am happy to step in. But otherwise, I trust Jack's judgement, and am happy to review results once they are ready and provide feedback that way.
@shankari For the record, I had suggested that @TeachMeTW get your advice in addition to mine since most of my experience with study design + statistical analysis has been in a different domain
4 PM MT:
I forgot today too and am just loading now. @TeachMeTW did you do this at 4pm PT or 4pm MT?
@shankari I started at 3:15 PST and did not finish until 4 PST, mainly because ccebikes did not finish loading at all.
~ Fixing ~
@TeachMeTW did you collect data today? We started this week of data collection on Wednesday, so should continue collecting data with this version today and tomorrow.
@shankari No I did not; I will get to that now. I got confused and thought it was only until Friday.
9 PM MT:
12 AM MT:
3 MT:
Late comment but, 8 MT:
12 AM MT:
6 PM MT: Tested Update on Staging
I started the next phase of data collection this morning, repeating the same steps as before. I observed batch loading of UUIDs working. I waited for all UUIDs to finish loading before I proceeded to the "Trips" tab.
No 'active minutes' on ccebikes
On ccebikes, I found that the UUIDs tab still took a long time to initialize – even though it now shows only 10 users at first. Then it loads 10 users every batch at a constant rate, so it took a while for all 111 users to be shown.
@shankari @JGreenlee
Given the current setup where only 10 users are loaded initially and subsequent batches load at a steady rate (10 users per 24000 ms), would it be possible to improve testing speed by only loading a subset of batches? This would reduce overall wait time during testing, although we may lose some data points and miss potential issues that might appear in later batches. Do you think this trade-off would be acceptable, or is it important to retain the full load for thorough testing?
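The trade-off could be prototyped outside Dash first. A sketch of the capped batching logic (function and constant names are illustrative, not the real dashboard code):

```python
BATCH_SIZE = 10  # matches the 10-users-per-batch behavior described above

def load_batches(users, max_batches=None):
    """Yield users in fixed-size batches, optionally stopping early for tests."""
    for start in range(0, len(users), BATCH_SIZE):
        if max_batches is not None and start // BATCH_SIZE >= max_batches:
            break
        yield users[start:start + BATCH_SIZE]

# e.g. with 111 users, max_batches=3 shows only the first 30 during testing
test_batches = list(load_batches(list(range(111)), max_batches=3))
full_batches = list(load_batches(list(range(111))))
```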
4 PM MT:
Dynamic Interval Loading might be something to explore
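One possible shape for that: instead of a fixed 24000 ms interval, derive the next interval from how long the previous batch actually took (a sketch under assumed bounds; none of this exists in the dashboard yet):

```python
def next_interval(last_batch_ms, lo=2000, hi=24000):
    """Aim for ~1.5x the observed batch duration, clamped to [lo, hi] ms."""
    return max(lo, min(hi, int(last_batch_ms * 1.5)))
```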
9 PM MT:
4 AM MT:
@shankari @JGreenlee CCEbike Graphs -- for round 1 of changes (NOT BASELINE)
More graphs:
Results do seem good, but I would like some explanation and narrative around this. Are these results only for staging, or also for other environments? What is the explanation for them, etc.? I would suggest filling out the results section of your report with the charts and the associated narrative. The goal is not to make visually appealing charts; the goal is to make it easier for people to understand your results.
Methodology:
Implementation and results:
@shankari Until stage is re-uploaded, here are my insights:
data.name,data.reading
admin/data/update_dropdowns_trips/determine_hidden_columns_and_label,1.947862946675444e-06
admin/data/populate_datatable/check_dataframe_type,2.046613198896547e-06
admin/data/update_store_trajectories/prepare_store_data,2.060361875919625e-06
update_card_users/calculate_number_of_users,3.4012816479948702e-06
admin/home/update_card_trips/calculate_number_of_trips,3.9885911499605635e-06
admin/data/update_sub_tab/retrieve_and_process_data,3.998180697332525e-06
admin/home/generate_plot_trips_trend/convert_iso_to_date_only,4.4479934529036935e-06
admin/db_utils/clean_location_data/total_time,3.3919877825276354e-05
admin/db_utils/query_uuids/logging_and_query_initiation,8.000350707152393e-05
admin/data/render_content/handle_demographics_tab,0.00013015080367040355
admin/db_utils/query_demographics/organize_survey_keys,0.00015084072795919687
admin/home/generate_card/generate_card_layout,0.00016447972178438623
admin/home/find_last_get/convert_to_uuid_objects,0.00023495605481522425
admin/db_utils/query_trajectories/convert_date_range_to_timestamps,0.00023877198247646448
admin/db_utils/query_confirmed_trips/convert_date_range_to_timestamps,0.0003273542747809275
admin/home/generate_barplot/update_layout_with_title,0.00037205597217397103
admin/home/compute_trips_trend/extract_date,0.0005048959373316789
admin/home/get_number_of_active_users/calculate_active_users,0.0006949753012685548
admin/db_utils/query_confirmed_trips/rename_columns,0.0006985895775609866
admin/home/update_card_active_users/convert_to_dataframe,0.0007020320802453965
admin/home/generate_plot_sign_up_trend/convert_to_dataframe,0.0007046631796926748
admin/data/update_sub_tab/filter_columns,0.0007352904097691256
admin/home/compute_sign_up_trend/convert_to_datetime,0.0011078711812857723
admin/home/compute_trips_trend/group_by_and_calculate_counts,0.0019007130519995649
admin/data/update_sub_tab/convert_to_dataframe,0.001956743379448906
admin/db_utils/query_confirmed_trips/process_binary_and_named_columns,0.0026572991253141524
admin/db_utils/query_uuids/dataframe_processing,0.0028467976734427793
admin/db_utils/add_user_stats/retrieve_user_profile_data,0.003226581826238717
admin/home/compute_sign_up_trend/group_by_and_calculate_counts,0.0032378398798039703
admin/home/generate_plot_trips_trend/convert_to_dataframe,0.0032821491783935294
admin/db_utils/df_to_filtered_records/filter_dataframe_if_needed,0.003408925289243042
admin/home/compute_trips_trend/convert_to_datetime,0.003680147491926935
admin/db_utils/query_confirmed_trips/process_coordinates_and_modes,0.005306827750419394
admin/db_utils/query_confirmed_trips/expand_and_filter_user_inputs,0.009418276858057305
admin/db_utils/query_uuids/convert_query_result_to_dataframe,0.010353045917858253
admin/home/generate_card/total_time,0.011176629554714
admin/data/update_dropdowns_trips/total_time,0.012006994434029496
admin/db_utils/query_demographics/create_dataframes,0.015817682580615968
admin/home/update_card_active_users/generate_active_users_card,0.01920611160023818
admin/db_utils/query_demographics/process_dataframes,0.019244120024051795
admin/data/render_content/handle_uuids_tab,0.019926290027797222
admin/db_utils/query_confirmed_trips/humanize_distance_and_duration,0.022384994969278132
admin/home/update_card_trips/generate_trips_card,0.022472899088667043
update_card_users/generate_user_card,0.025483410393239044
admin/home/compute_sign_up_trend/total_time,0.02933595221123246
admin/data/update_sub_tab/populate_datatable,0.0318839150521247
admin/data/render_content/total_time,0.03323825279854329
admin/home/compute_trips_trend/total_time,0.035378578287908684
admin/data/populate_datatable/create_datatable,0.03774038002808034
admin/home/generate_barplot/generate_barplot_with_data,0.039363626396305536
admin/home/generate_plot_sign_up_trend/compute_sign_up_trend,0.0400892436870786
admin/home/generate_plot_trips_trend/compute_trips_trend,0.044780456077680116
admin/home/update_card_trips/total_time,0.046317080631406105
admin/db_utils/df_to_filtered_records/convert_to_dict_of_records,0.04962066936440856
admin/db_utils/query_uuids/total_time,0.04976970487803555
admin/data/render_content/handle_trips_tab,0.0507602873720316
admin/home/generate_barplot/initialize_empty_barplot,0.051887884061879025
update_card_users/total_time,0.0561006364000381
admin/data/populate_datatable/total_time,0.05679313991567768
admin/db_utils/add_user_stats/get_last_trip_timestamp,0.06127944804549133
admin/db_utils/add_user_stats/get_first_trip_timestamp,0.06184159415772293
admin/db_utils/add_user_stats/count_labeled_trips,0.06342866654713583
admin/db_utils/df_to_filtered_records/total_time,0.07345906148523829
admin/data/update_sub_tab/total_time,0.0753064348988725
admin/db_utils/query_demographics/query_data,0.08909172358757392
admin/home/generate_plot_trips_trend/generate_barplot,0.12236394326804578
admin/home/generate_barplot/total_time,0.1288078651732151
admin/home/generate_plot_sign_up_trend/generate_barplot,0.1545202970718507
admin/db_utils/query_trajectories/add_mode_string,0.1633187119957438
admin/db_utils/query_demographics/total_time,0.17353401138483432
admin/data/render_content/prepare_final_dataframe_and_return,0.2047745715041297
admin/home/generate_plot_trips_trend/total_time,0.21455150090016262
admin/home/generate_plot_sign_up_trend/total_time,0.2335722325064364
admin/data/update_store_trajectories/filter_records,0.4412800445715402
admin/db_utils/query_trajectories/process_dataframe_columns,0.5254711637183209
admin/db_utils/add_user_stats/count_total_trips,0.8359979697176627
admin/db_utils/query_confirmed_trips/retrieve_aggregate_time_series,2.383731212208139
admin/db_utils/query_confirmed_trips/total_time,2.399663755555122
admin/db_utils/add_user_stats/process_user,2.4734303609223955
admin/db_utils/add_user_stats/get_last_server_call_timestamp,2.890578399274938
data.name,data.reading
admin/home/update_card_active_users/total_time,380.5705834423147
admin/home/update_card_active_users/calculate_active_users,380.51765034232466
admin/home/get_number_of_active_users/total_time,380.50897508538844
admin/home/get_number_of_active_users/find_last_get_entries,380.49126240247415
admin/home/find_last_get/total_time,380.48344891932277
admin/home/find_last_get/query_timeseries_db,380.4540818235763
admin/data/render_content/handle_trajectories_tab,364.7788547482385
admin/data/update_store_trajectories/total_time,364.5321865298467
admin/data/update_store_trajectories/query_trajectories,364.06441769682766
admin/db_utils/query_trajectories/total_time,364.0559167172793
admin/data/render_content/total_time,352.24918736368954
admin/data/render_content/handle_uuids_tab,350.33841586602665
admin/db_utils/add_user_stats/total_time,350.3015870457756
admin/db_utils/query_trajectories/retrieve_entries,213.97728912887717
admin/db_utils/query_trajectories/convert_entries_to_dataframe,149.3421039039905
admin/db_utils/add_user_stats/process_batches,44.96146263926428
admin/db_utils/add_user_stats/processing_loop_stage,44.95224149511222
admin/db_utils/add_user_stats/get_last_trip_timestamp,44.94357279768172
admin/db_utils/add_user_stats/get_last_server_call_timestamp,23.636255698262932
admin/db_utils/add_user_stats/process_user,20.8464001387978
admin/db_utils/query_confirmed_trips/retrieve_aggregate_time_series,8.48705214093778
admin/db_utils/query_confirmed_trips/total_time,8.38841827082712
admin/db_utils/add_user_stats/count_total_trips,6.84013815707495
function,data.reading
add_user_stats,350.3015870457756
clean_location_data,3.3919877825276354e-05
compute_sign_up_trend,0.02933595221123246
compute_trips_trend,0.035378578287908684
df_to_filtered_records,0.07345906148523829
generate_barplot,0.1288078651732151
generate_card,0.011176629554714
generate_plot_sign_up_trend,0.2335722325064364
generate_plot_trips_trend,0.21455150090016262
populate_datatable,0.05679313991567768
query_confirmed_trips,4.53850465386655
query_demographics,0.17353401138483432
query_uuids,0.04976970487803555
render_content,332.586646515266
update_card_trips,0.046317080631406105
update_dropdowns_trips,0.012006994434029496
update_sub_tab,0.0753064348988725
function,data.reading
find_last_get,380.48344891932277
get_number_of_active_users,380.50897508538844
query_trajectories,364.0559167172793
update_card_active_users,380.5705834423147
update_store_trajectories,364.5321865298467
@TeachMeTW I suggested removing the bottom 80% of stats for the pipeline changes, not the admin dashboard. The admin dashboard is only launched when a user logs in, so the additional data storage is minimal. The pipeline runs every hour, so the additional data storage is substantial.
I am not opposed to (eventually) dropping the bottom 80% of timing stats in the admin dashboard, but that is not blocking the push to production of the pipeline timing changes.
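For reference, one way the top-20%/bottom-80% split could be computed from the `data.name,data.reading` dumps (a sketch; the actual cutoff logic in the pipeline may differ):

```python
import pandas as pd

# A few hypothetical readings in the same shape as the dumps above
df = pd.DataFrame({
    "data.name": ["a/total_time", "b/total_time", "c/step", "d/step", "e/step"],
    "data.reading": [350.3, 8.4, 0.05, 0.002, 0.0001],
})

# Keep only readings at or above the 80th percentile (the "top 20%")
cutoff = df["data.reading"].quantile(0.80)
top20 = df[df["data.reading"] >= cutoff]
```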
@shankari please re-upload the staging data snapshot; the one from Tuesday became corrupted.
I also see `add_user_stats,350.3015870457756` in the bottom 80%, which looks wrong. What does the number there represent?
Plan discussed at meeting
We will use the following environments:
We will access the environments at the following times:
Steps to perform:
We will collect data for one week for the baseline and one week for each change. We will start with reverting the previous batching changes so that we can assess the impact before moving on.
Timeline: