cmu-delphi / epiprocess

Tools for basic signal processing in epidemiology
https://cmu-delphi.github.io/epiprocess/

Draft and discuss naming schemes for `epix_slide` parameters, output #163

Open brookslogan opened 2 years ago

brookslogan commented 2 years ago

See the "`time_value` vs. `version`" bullet point in https://github.com/cmu-delphi/epiprocess/issues/146#issuecomment-1192785302.

brookslogan commented 2 years ago

"Discuss" will eventually mean running sketches of use by some potential users / other developers (e.g., Evan, Jacob).

brookslogan commented 2 years ago

As the discussion on #170 and #171 has brought up, another naming option for the relevant output column(s), besides `time_value`, `version`, and both, is `ref_time_value`.

brookslogan commented 2 years ago

Some remaining discussion points from #146:

  • Separate discussion: should we rename the `ref_time_values` parameter and the `time_value` output column to something involving `version`, or keep the former and have the latter turn into two duplicate columns, one under each name? Should we output an `epi_archive`?

[Or should we call the time/version output column `ref_time_value`?]

  • Separate discussion: should we rename the `max_version` parameter of `epix_slide` to `version`?

[Since we're moving toward more consistent use of an "implicit versioning" scheme, where last-version-of-each-observation-carried-forward is assumed everywhere in archives, this may make sense. However, we might then need to think about the naming or discussion of the `$DT$version` column.]
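To make the "implicit versioning" idea above concrete, here is a minimal, language-agnostic sketch in Python rather than R. It is not epiprocess code: the function name `snapshot_as_of` and the tuple layout `(geo_value, time_value, version, value)` are assumptions made for illustration. It shows what "last version of each observation carried forward" means when taking a snapshot as of some maximum version.

```python
# Hypothetical illustration only; names and data layout are assumptions,
# not the epiprocess API.

def snapshot_as_of(updates, max_version):
    """Return {(geo_value, time_value): value} as of `max_version`.

    `updates` is an iterable of (geo_value, time_value, version, value)
    tuples. Versions are ISO date strings, so they compare correctly as
    plain strings. For each observation, the most recent update with
    version <= max_version wins (last version carried forward).
    """
    latest = {}
    # Process updates in increasing version order so that later versions
    # overwrite earlier ones for the same (geo_value, time_value) key.
    for geo, time_value, version, value in sorted(updates, key=lambda r: r[2]):
        if version <= max_version:
            latest[(geo, time_value)] = value
    return latest


updates = [
    ("ca", "2022-01-01", "2022-01-02", 10),  # first report
    ("ca", "2022-01-01", "2022-01-04", 12),  # later revision
    ("ca", "2022-01-02", "2022-01-03", 20),
]

early = snapshot_as_of(updates, "2022-01-03")
late = snapshot_as_of(updates, "2022-01-05")
```

In this sketch, `early` still holds the first report (10) for 2022-01-01, while `late` picks up the revision (12). The observation for 2022-01-02 has no update after version 2022-01-03, so its value is carried forward unchanged into `late`.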

Some other existing mismatches between slide operations that we might want to think about:

Advanced usage:

Compactify compatibility

Alternative to implicit versioning interface: explicit versioning interface

brookslogan commented 1 year ago

Another idea to consider here: guess what label to use for the `ref_time_values` based on the user's output: if they provide a data frame or `epi_df` with a `time_value` column, then use `version`; else use `time_value`.
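The guessing rule in this comment can be sketched in a few lines. This is a language-agnostic illustration in Python, not a proposed implementation for the package: the helper name `guess_ref_label` and the idea of passing the output's column names as a list are assumptions made for the example.

```python
# Hypothetical helper; not part of epiprocess.

def guess_ref_label(output_columns):
    """Guess the label for the reference time column of a slide result.

    `output_columns` lists the column names of the user's per-slide output
    (an empty list stands in for a non-data-frame output such as a scalar).
    If the output already has a `time_value` column, label the reference
    times `version` to avoid a name clash; otherwise use `time_value`.
    """
    return "version" if "time_value" in output_columns else "time_value"


label_for_df = guess_ref_label(["geo_value", "time_value", "value"])
label_for_scalar = guess_ref_label([])
```

One consequence of this rule, visible in the sketch, is that the output column name depends on what the user's computation returns, so two similar calls could yield differently named columns.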