digital-wellbeing / pws-prepost


dockerize #6

Closed rpsychologist closed 1 year ago

rpsychologist commented 1 year ago

(Can't open real draft PRs without paying)

I've added the N_ITER, N_THREADS, N_CORES, and N_SUBSET environment variables to prepare for #3.
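
For readers following along, here is a minimal sketch of how such variables could be read in R. The helper name `env_num` and all default values are hypothetical illustrations, not taken from ms.qmd or the Docker setup in this PR.

```r
# Hypothetical helper (not from the repo): read a numeric environment
# variable, falling back to a default when it is unset or empty.
env_num <- function(name, default) {
  raw <- Sys.getenv(name, unset = NA)
  if (is.na(raw) || raw == "") {
    return(default)
  }
  # as.numeric("NaN") yields NaN, so N_SUBSET=NaN survives the conversion.
  suppressWarnings(as.numeric(raw))
}

# Defaults below are assumptions chosen for illustration only.
N_ITER    <- env_num("N_ITER", 2000)
N_THREADS <- env_num("N_THREADS", 1)
N_CORES   <- env_num("N_CORES", 1)
N_SUBSET  <- env_num("N_SUBSET", NaN)
```

With this shape, something like `N_SUBSET=50` in the container environment would request a subset, while leaving it unset (or setting it to NaN) would mean the full data.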

@mvuorre N_SUBSET controls how many pids to randomly filter on. However, it breaks one or more of the figures. Could you take a look? I don't wanna change too many things in ms.qmd.

Quitting from lines 401-610 [fig-data] (ms.qmd)
Error in `filter()`:
ℹ In argument: `pid %in% sample(unique(pid), 3)`.
Caused by error in `sample.int()`:
! cannot take a sample larger than the population when 'replace = FALSE'
Backtrace:
  1. dplyr::mutate(...)
 13. base::sample(unique(pid), 3)
 17. base::sample.int(length(x), size, replace, prob)
Execution halted

mvuorre commented 1 year ago

This looks great.

It breaks that plot because the figure filters on three people who must have enough data for it to make sense. So if you start with an N_SUBSET that doesn't include three people who meet those requirements, it fails.

I don't see a good workaround except making those requirements conditional: if N_SUBSET=NaN, apply them; otherwise just take any three random people. Happy for you to do some ad-hoc fix in ms.qmd if you don't mind?
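
A rough sketch of what that conditional selection could look like, in base R. The function name `pick_pids`, the `min_rows` threshold, and the cap at the number of available people are all assumptions for illustration; the actual requirements in the fig-data chunk of ms.qmd may differ.

```r
# Hypothetical helper (not from ms.qmd): choose up to `k` pids for the figure.
# `dat` is assumed to be a data frame with a `pid` column.
pick_pids <- function(dat, k = 3, n_subset = NaN, min_rows = 10) {
  pool <- unique(dat$pid)
  if (is.nan(n_subset)) {
    # Full analysis: keep only people with enough rows for the figure.
    rows_per_pid <- table(dat$pid)
    enough <- names(rows_per_pid)[rows_per_pid >= min_rows]
    pool <- pool[as.character(pool) %in% enough]
  }
  # Never ask for more people than exist; this avoids the
  # "cannot take a sample larger than the population" error.
  # Sampling indices also sidesteps sample()'s special case for a
  # single numeric value.
  pool[sample.int(length(pool), min(k, length(pool)))]
}
```

The cap means a very small subset would draw fewer than three panels rather than halt the render; whether that is acceptable for the figure is a judgment call for the authors.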

Also, maybe indicate somewhere that for a full analysis the user must set N_SUBSET=NaN (I don't see it now, but I might have missed it).

PS. The current model did not take even an hour with 2000 iterations and full data.