asiripanich / emdash

An e-mission deployer's dashboard. See https://github.com/e-mission/e-mission-docs.
https://emdash.amarin.dev

Filter_loaded_trips #46

Closed allenmichael099 closed 3 years ago

allenmichael099 commented 3 years ago

There’s more to improve after I get back but I think this is good to go for the moment.

shankari commented 3 years ago

@asiripanich I am going to try to merge this to a branch for now so I can deploy it.

asiripanich commented 3 years ago

> @asiripanich I am going to try to merge this to a branch for now so I can deploy it.

Hi Shankari, sure, but please make sure it works on your local machine. Unfortunately, I don't have time to review the PRs this week.

shankari commented 3 years ago

While testing this PR on my local laptop with a dataset of ~32 trips from one user in the past month, the process was still killed:

Running: mod_load_data_server
Running: mod_load_data_server
About to load server calls
Finished query, about to tidy server calls
Finished tidying server calls
Finished loading server calls
About to load participants
merging
Killed

On retrying, it worked

Running: mod_load_data_server
Running: mod_load_data_server
About to load server calls
Finished query, about to tidy server calls
Finished tidying server calls
Finished loading server calls
About to load participants
merging
Finished loading participants
Window_width is 31
About to load trips
Finished query, about to clean trips
Finished cleaning trips
Finished loading trips
About to load locations
Finished query, about to clean locations
Finished cleaning locations
Finished loading locations
About to create trajectories within trips
About to generate trajectories
Trajectories created
Finished creating trajectories within trips
Warning in grSoftVersion() :
  unable to load shared object '/usr/local/lib/R/modules//R_X11.so':
  libXt.so.6: cannot open shared object file: No such file or directory
Warning: Removed 15 rows containing non-finite values (stat_count).

The map still didn't load because none of the entries had labels, so the mode_confirm column was not in the table. Not sure whether I should deploy this yet.
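One way to keep the map from failing when no trip has labels is to add the missing column before it is used. The following is only a minimal sketch under that assumption: `ensure_columns` is a hypothetical helper, not part of emdash, and the `trips` table here is made-up example data.

```r
# Hypothetical helper (not an emdash function): add any expected columns
# that are absent from the table, filled with NA, so downstream code can
# refer to them even when no trip has been labelled.
ensure_columns <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- rep(NA_character_, nrow(df))
  }
  df
}

# Made-up example: a trips table with no user labels at all.
trips <- data.frame(trip_id = 1:3)
trips <- ensure_columns(trips, "mode_confirm")
```

Whether an all-NA `mode_confirm` column is enough for the map to render would still need to be checked against the map module itself.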

shankari commented 3 years ago

@allenmichael099 how much data did you test this out on?

shankari commented 3 years ago

After switching to the month of January, it was killed again.

Window_width is 59
Window_width is 31
About to load trips
Finished query, about to clean trips
Finished cleaning trips
Finished loading trips
About to load locations
Finished query, about to clean locations
Killed

I'm going to hold off on deploying this fix.

shankari commented 3 years ago

Re-disabling the map view using the following patch for deployment since I still want to get the fix for #18

shankari commented 3 years ago

Still getting some kills. Looking at it a bit further - the process appears to be killed at this point:

About to load participants
merging
Killed

On a successful load, we see:

About to load participants
merging
Finished loading participants

The related code is

      message("About to load participants")
      data_r$participants <-
        tidy_participants(query_stage_profiles(cons), query_stage_uuids(cons)) %>%
        summarise_trips_without_trips(., cons) %>%
        summarise_server_calls(., data_r$server_calls)
      message("Finished loading participants")

The last merging message is from summarise_server_calls

  message("merging ")
  # merge(participants, usercache_get_summ, usercache_put_summ, by = "user_id", all.x = TRUE)
  merge(participants, usercache_get_summ, by = "user_id", all.x = TRUE) %>%
    merge(., usercache_put_summ, by = "user_id", all.x = TRUE) %>%
    merge(., diary_summ, by = "user_id", all.x = TRUE)

Why is merging with the summaries so data intensive? Each summary only has the first and last entry. One potential fix would be to filter the server calls also by the date range.
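The date-range filter suggested above could look roughly like the sketch below. It assumes the server calls table has a POSIXct timestamp column, called `ts` here; that column name, the helper `filter_calls_by_date`, and the example rows are all assumptions for illustration, not taken from emdash.

```r
# Hypothetical helper (not an emdash function): keep only the server calls
# whose timestamp falls inside the selected date range, inclusive.
# `ts_col` is an assumed column name holding POSIXct timestamps.
filter_calls_by_date <- function(server_calls, start_date, end_date, ts_col = "ts") {
  call_dates <- as.Date(server_calls[[ts_col]])
  keep <- !is.na(call_dates) & call_dates >= start_date & call_dates <= end_date
  server_calls[keep, , drop = FALSE]
}

# Made-up example data: three calls, two of them in January 2021.
server_calls <- data.frame(
  user_id = c("a", "a", "b"),
  ts = as.POSIXct(
    c("2021-01-05 10:00:00", "2021-03-01 09:00:00", "2021-01-20 18:30:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

january_calls <- filter_calls_by_date(
  server_calls,
  start_date = as.Date("2021-01-01"),
  end_date = as.Date("2021-01-31")
)
```

Filtering before the summaries are computed would shrink the input to the merge, but it does not by itself explain why the merge is expensive, so that question stays open.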

shankari commented 3 years ago

This also seems to fail on production, maybe because the range is too short?

Listening on http://0.0.0.0:80
2021-06-24
2021-06-16

No further log messages, no data visible in the dashboard

[Image: Screen Shot 2021-06-23 at 10 57 07 PM]

shankari commented 3 years ago

@allenmichael099 I have shared a mongodump of the data in the environment that generates this https://github.com/asiripanich/emdash/pull/46#issuecomment-867357184

This is 100% reproducible. We should also talk a bit about thinking through various scenarios and adding unit tests
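As a starting point for that discussion, here is a sketch of what scenario-based tests around the merge step might look like with testthat. `merge_summary` is a hypothetical stand-in that mirrors the `merge(..., by = "user_id", all.x = TRUE)` call quoted earlier; it is not the emdash implementation, and the data frames are made up.

```r
library(testthat)

# Hypothetical stand-in for one merge step (not the emdash implementation).
merge_summary <- function(participants, summ) {
  merge(participants, summ, by = "user_id", all.x = TRUE)
}

test_that("participants without server calls are kept with NA summaries", {
  participants <- data.frame(user_id = c("a", "b"), stringsAsFactors = FALSE)
  summ <- data.frame(
    user_id = "a",
    first_call = as.Date("2021-01-01"),
    stringsAsFactors = FALSE
  )
  out <- merge_summary(participants, summ)
  expect_equal(nrow(out), 2L)
  expect_true(is.na(out$first_call[out$user_id == "b"]))
})

test_that("duplicate user_id rows in a summary multiply the merged rows", {
  participants <- data.frame(user_id = c("a", "b"), stringsAsFactors = FALSE)
  summ <- data.frame(
    user_id = c("a", "a"),
    first_call = as.Date(c("2021-01-01", "2021-01-02")),
    stringsAsFactors = FALSE
  )
  out <- merge_summary(participants, summ)
  # Two participants become three rows because "a" matches twice.
  expect_equal(nrow(out), 3L)
})
```

The second test documents a general property of `merge`: non-unique keys in a summary inflate the result. Whether that happens in the real summaries is not established in this thread.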

shankari commented 3 years ago

Potential unit tests:

allenmichael099 commented 3 years ago

@shankari I'm having trouble replicating the 'killed' error. Using the January 2021 data with 36 participants the server calls merged fine.

shankari commented 3 years ago

OK, I will deploy this version and see if I run into the killed issue on production.