cmu-delphi / epiprocess

Tools for basic signal processing in epidemiology
https://cmu-delphi.github.io/epiprocess/
Other
13 stars 9 forks source link

Clarify naming and fix defaults: `epix_slide` is over versions #146

Closed brookslogan closed 1 year ago

brookslogan commented 1 year ago
brookslogan commented 1 year ago

Another thought: do we want to rename epix_slide to epix_version_slide?

brookslogan commented 1 year ago

We should try to avoid using tsibble::guess_frequency if possible for now. Another issue with the full_seq suggestion is that it could be altered by compactify; we'd at least want to document this, by making sure compactify descriptions don't promise full compatibility and noting specifically for the ref_time_values/new-name documentation. This might also lead us to think of a different post-compactify format that records other versions that had updates reported. (This also hints at making epi_archive provide a different interpretation: the current setup is as if we continuously and perfectly monitored some data source for updates within some time range; an alternative setup is to say that we only observed a certain set of versions. It even more hassle for a user to provide this version set explicitly, but we could default to something like the default I was describing. This alternative interpretation would mean that as_of should not allow us to ask for versions that weren't measured or should return NAs. It's a bit more complex, but also more precise/realistic.)

ryantibs commented 1 year ago

Thanks for raising these! I'd like to talk some of these through before you guys get too far along to understand what the problems are.

I tried to design epix_slide() to be as close as possible to epi_slide(), thinking this was a feature, rather than a bug. Thus the way I see it is that the name of the function itself, and all its argument names, being close is a good thing.

Hence I like items 1 and 4 in your todo list, but I'm not sure I like the others which are suggesting we rename things (items 2, 3, 5).

Thinking back again about the output type for epix_slide() (item 6) is a good idea, though I'm not sure we'll arrive at any obvious solutions. I'd be happy to be wrong though.

Should we discuss on a call or here on GitHub? Call might be more fluid. We could combine it with discussing #64 #67 whenever you're ready.

brookslogan commented 1 year ago

We can discuss on call.

Recording a couple vague thoughts to hash out then:

kenmawer commented 1 year ago

How would item 6 even work though? I don't see this happening, unless we also the time_value feature along with the version feature.

Also, for item 1, ref_time_values should actually be ref_versions.

Finally, epix_slide won't grab the latest version; however, this is a bug beyond the scope of this update.

ryantibs commented 1 year ago

I'll leave it up to you Logan to schedule a call whenever you feel is the right time. We can discuss on the call since I think that's the most fluid, given that see some of these ideas differently (and maybe we'll clarify some confusion along the way, from either side).

brookslogan commented 1 year ago

Summary from call:

I don't think many of the tasks in this issue still apply. Closing this and opening other issues.

ryantibs commented 1 year ago

Thanks for the summary. That all sounds right except for one detail:

Default after should be 0 in epi_slide() and I think after shouldn't even be present as an arg in epix_slide().

(Subtleties for forecast data can be discussed and possible modifications made later, as part of the time vs version discussion)