tidyverts / tsibble

Tidy Temporal Data Frames and Tools
https://tsibble.tidyverts.org
GNU General Public License v3.0

interval_pull detects one hour as interval for POSIXct on quarter interval #238

Closed yogat3ch closed 3 years ago

yogat3ch commented 3 years ago

interval_pull appears to detect quarterly intervals as one-hour intervals. I would expect it to detect either quarter or 3-month intervals. Ideally, it would detect the quarter period and use yearquarter to construct the interval.

x <- structure(
  c(1561867200, 1569816000, 1577682000, 1585540800),
  tzone = "America/New_York",
  class = c("POSIXct",
            "POSIXt")
)
tsibble::interval_pull(x)

<interval[1]>
[1] 1h
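The 1h result is consistent with the interval being derived from the greatest common divisor of the gaps between timestamps, which is how the tsibble documentation describes interval detection. The base R sketch below reproduces that arithmetic for the same timestamps; `gcd2` is a helper written for this illustration, not a tsibble function. Because the series crosses daylight saving transitions in America/New_York, the gaps differ by an hour, and the only duration that divides all of them is 3600 seconds.

```r
x <- as.POSIXct(
  c(1561867200, 1569816000, 1577682000, 1585540800),
  origin = "1970-01-01", tz = "America/New_York"
)

# Euclid's algorithm for two non-negative numbers (illustrative helper)
gcd2 <- function(a, b) if (b == 0) a else gcd2(b, a %% b)

# Gaps between consecutive timestamps, in seconds
gaps_sec <- diff(as.numeric(x))      # 7948800 7866000 7858800

# Greatest common divisor of all gaps
common_sec <- Reduce(gcd2, gaps_sec) # 3600
common_sec / 3600                    # 1, i.e. one hour
```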

Something like this will detect quarters in POSIXct:

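is_quarter.POSIXt <- function(x) {
  # TRUE where consecutive timestamps do not advance to a later calendar year
  .years <- diff(lubridate::year(x)) < 1
  # Month gaps must be multiples of 3, and at least one gap must cross a year boundary
  all(diff(lubridate::month(x)) %% 3 == 0) && length(.years) > sum(.years)
}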

This will provide the multiple of quarter periods:

quarter_multiple.POSIXt <- function(x) {
   unique(diff(lubridate::month(x)) %% 3 + 1)
}
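Note that for any series spaced a whole number of quarters apart, `diff(month) %% 3` is 0, so the expression above evaluates to 1 whatever the spacing. A possible alternative, sketched here in base R under the assumption that the timestamps are calendar-aligned, counts elapsed months across year boundaries and divides by 3. `quarter_steps` is a hypothetical name, not part of tsibble or of #239.

```r
# Hypothetical helper (not part of tsibble): number of quarters between
# consecutive timestamps, counted in calendar months across year boundaries.
quarter_steps <- function(x) {
  lt <- as.POSIXlt(x)
  # Absolute month index, so that December -> March counts as 3 months
  months_elapsed <- diff((lt$year + 1900) * 12 + lt$mon)
  if (length(months_elapsed) == 0 || any(months_elapsed %% 3 != 0)) {
    return(NA_integer_) # not aligned to whole quarters
  }
  as.integer(unique(months_elapsed %/% 3))
}

x <- as.POSIXct(
  c(1561867200, 1569816000, 1577682000, 1585540800),
  origin = "1970-01-01", tz = "America/New_York"
)
quarter_steps(x) # 1: one quarter between observations

half_yearly <- as.POSIXct(
  c("2019-01-01", "2019-07-01", "2020-01-01"),
  tz = "America/New_York"
)
quarter_steps(half_yearly) # 2: two quarters between observations
```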

There's probably a better way to do it, but #239 is what I came up with.

yogat3ch commented 3 years ago

I've just submitted #241, which does what I would expect (and I think other users might as well) when a POSIXct/Date is provided. It reliably detects the interval despite some irregularities in the time series and uses inform to tell the user what was guessed. I've made this the default behavior when as_tsibble(x, regular = FALSE) is called. I'm not sure this is the best approach, as I'm not fully aware of the package structure or the potential downstream consequences, but I imagine that someone more knowledgeable about tsibble than I am could implement the irregular_interval function in a way that does not interfere with downstream functionality.

Curious to hear the devs' thoughts.
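For readers who want a feel for what guessing an interval from a slightly irregular series can look like, here is a minimal base R sketch that takes the most frequent gap between timestamps. This is an assumption made for illustration only: it is not the code in #241 and not tsibble's behavior, and `guess_interval_secs` is a hypothetical name.

```r
# Hypothetical helper, not tsibble's or #241's implementation:
# guess the interval as the most frequent positive gap between sorted timestamps.
guess_interval_secs <- function(x) {
  gaps <- diff(sort(as.numeric(x)))
  gaps <- gaps[gaps > 0]
  if (length(gaps) == 0) return(NA_real_)
  counts <- table(gaps)
  as.numeric(names(counts)[which.max(counts)])
}

# Mostly 5-minute data with one observation arriving a minute late
t5 <- as.POSIXct(
  c(0, 300, 600, 900, 1260, 1560, 1860),
  origin = "1970-01-01", tz = "UTC"
)
guess_interval_secs(t5) # 300, whereas the GCD of the gaps would be 60
```

The trade-off is that a modal gap tolerates occasional odd spacings, while a greatest common divisor guarantees every observation sits on the resulting grid.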

Below is a reprex for testing the functionality in #241

test <- list(
  structure(list(time = structure(c(
    1585908240, 1585911600,
    1585911720, 1585912620, 1585912680, 1585912740, 1585913760, 1585913940,
    1585914000, 1585914060, 1585914420, 1585914540, 1585914780, 1585914960,
    1585915020, 1585915080, 1585915140, 1585915200, 1585915680, 1585915860,
    1585915920, 1585916160, 1585916280, 1585916760, 1585917240, 1585917420,
    1585917540, 1585917720, 1585917780, 1585917900, 1585918920, 1585919040,
    1585919280, 1585919400, 1585919460, 1585919580, 1585919760, 1585919940,
    1585920000, 1585920060, 1585920240, 1585920480, 1585920540, 1585920600,
    1585920660, 1585920720, 1585920780, 1585920840, 1585920900, 1585920960,
    1585921020, 1585921080, 1585921140, 1585921200, 1585921260, 1585921320,
    1585921380, 1585921440, 1585921500, 1585921560, 1585921620, 1585921680,
    1585921740, 1585921800, 1585921860, 1585921920, 1585921980, 1585922040,
    1585922100, 1585922160, 1585922220, 1585922280, 1585922340, 1585922400,
    1585922460, 1585922520, 1585922580, 1585922640, 1585922700, 1585922760,
    1585922820, 1585922880, 1585922940, 1585923000, 1585923060, 1585923120,
    1585923180, 1585923240, 1585923300, 1585923360, 1585923420, 1585923480,
    1585923540, 1585923600, 1585923660, 1585923720, 1585923780, 1585923840,
    1585923900, 1585923960, 1585924020, 1585924080, 1585924140, 1585924200,
    1585924260, 1585924320, 1585924380, 1585924440, 1585924500, 1585924560,
    1585924620, 1585924680, 1585924740, 1585924800, 1585924860, 1585924920,
    1585924980, 1585925040, 1585925100, 1585925160, 1585925220, 1585925280,
    1585925340, 1585925400, 1585925460, 1585925520, 1585925580, 1585925640,
    1585925700, 1585925760, 1585925820, 1585925880, 1585925940, 1585926000,
    1585926060, 1585926120, 1585926180, 1585926240, 1585926300, 1585926360,
    1585926420, 1585926480, 1585926540, 1585926600, 1585926660, 1585926720,
    1585926780, 1585926840, 1585926900, 1585926960, 1585927020, 1585927080,
    1585927140, 1585927200, 1585927260, 1585927320, 1585927380, 1585927440,
    1585927500, 1585927560, 1585927620, 1585927680, 1585927740, 1585927800,
    1585927860, 1585927920, 1585927980, 1585928040, 1585928100, 1585928160,
    1585928220, 1585928280, 1585928340, 1585928400, 1585928460, 1585928520,
    1585928580, 1585928640, 1585928700, 1585928760, 1585928820, 1585928880,
    1585928940, 1585929000, 1585929060, 1585929120, 1585929180, 1585929240,
    1585929300, 1585929360, 1585929420, 1585929480, 1585929540, 1585929600,
    1585929660, 1585929720, 1585929780, 1585929840, 1585929900, 1585929960,
    1585930020, 1585930080, 1585930140, 1585930200, 1585930260, 1585930320,
    1585930380, 1585930440, 1585930500, 1585930560, 1585930620, 1585930680,
    1585930740, 1585930800, 1585930860, 1585930920, 1585930980, 1585931040,
    1585931100, 1585931160, 1585931220, 1585931280, 1585931340, 1585931400,
    1585931460, 1585931520, 1585931580, 1585931640, 1585931700, 1585931760,
    1585931820, 1585931880, 1585931940, 1585932000, 1585932060, 1585932120,
    1585932180, 1585932240, 1585932300, 1585932360, 1585932420, 1585932480,
    1585932540, 1585932600, 1585932660, 1585932720, 1585932780, 1585932840,
    1585932900, 1585932960, 1585933020, 1585933080, 1585933140, 1585933200,
    1585933260, 1585933320, 1585933380, 1585933440, 1585933500, 1585933560,
    1585933620, 1585933680, 1585933740, 1585933800, 1585933860, 1585933920,
    1585933980, 1585934040, 1585934100, 1585934160, 1585934220, 1585934280,
    1585934340, 1585934400, 1585934460, 1585934520, 1585934580, 1585934640,
    1585934700, 1585934760, 1585934820, 1585934880, 1585934940, 1585935000,
    1585935060, 1585935120, 1585935180, 1585935240, 1585935300, 1585935360,
    1585935420, 1585935480, 1585935540, 1585935600, 1585935660, 1585935720,
    1585935780, 1585935840, 1585935900, 1585935960, 1585936020, 1585936080,
    1585936140, 1585936200, 1585936260, 1585936320, 1585936380, 1585936440,
    1585936500, 1585936560, 1585936620, 1585936680, 1585936740, 1585936800,
    1585936860, 1585936920, 1585936980, 1585937040, 1585937100, 1585937160,
    1585937220, 1585937280, 1585937340, 1585937400, 1585937460, 1585937520,
    1585937580, 1585937640, 1585937700, 1585937760, 1585937820, 1585937880,
    1585937940, 1585938000, 1585938060, 1585938120, 1585938180, 1585938240,
    1585938300, 1585938360, 1585938420, 1585938480, 1585938540, 1585938600,
    1585938660, 1585938720, 1585938780, 1585938840, 1585938900, 1585938960,
    1585939020, 1585939080, 1585939140, 1585939200, 1585939260, 1585939320,
    1585939380, 1585939440, 1585939500, 1585939560, 1585939620, 1585939680,
    1585939740, 1585939800, 1585939860, 1585939920, 1585939980, 1585940040,
    1585940100, 1585940160, 1585940220, 1585940280, 1585940340, 1585940400,
    1585940460, 1585940520, 1585940580, 1585940640, 1585940700, 1585940760,
    1585940820, 1585940880, 1585940940, 1585941000, 1585941060, 1585941120,
    1585941180, 1585941240, 1585941300, 1585941360, 1585941420, 1585941480,
    1585941540, 1585941600, 1585941660, 1585941720, 1585941780, 1585941840,
    1585941900, 1585941960, 1585942020, 1585942080, 1585942140, 1585942200,
    1585942260, 1585942320, 1585942380, 1585942440, 1585942500, 1585942560,
    1585942620, 1585942680, 1585942740, 1585942800, 1585942860, 1585942920,
    1585942980, 1585943040, 1585943100, 1585943160, 1585943220, 1585943280,
    1585943340, 1585943400, 1585943460, 1585943520, 1585943580, 1585943640,
    1585943700, 1585943760, 1585943820, 1585943880, 1585943940, 1585944000,
    1585944060, 1585944120, 1585944180, 1585944240, 1585944300, 1585944360,
    1585944420, 1585944480, 1585944540, 1585944600, 1585944780, 1585944840,
    1585945200, 1585945260, 1585945320, 1585945380, 1585945440, 1585945800,
    1585946100, 1585946160, 1585946280, 1585946340, 1585946760, 1585947000,
    1585947060, 1585947120, 1585947180, 1585947360, 1585947900, 1585947960,
    1585948920, 1585949100, 1585949640, 1585950360, 1585951380, 1585951440,
    1585951620, 1585951740, 1585951920, 1585951980, 1585953360, 1585954260,
    1585956360, 1585956540, 1585957080, 1585957620, 1585957740, 1585957980,
    1585958100, 1585958340
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(
    17L, 36L, 2L, 9L, 13L, 29L, 23L, 13L, 12L,
    13L, 16L, 3L, 9L, 28L, 14L, 16L, 35L, 51L, 15L, 41L, 11L, 33L,
    7L, 48L, 66L, 9L, 18L, 33L, 25L, 31L, 13L, 43L, 22L, 8L, 8L,
    9L, 15L, 47L, 95L, 14L, 18L, 55L, 31L, 761L, 915L, 507L, 461L,
    363L, 470L, 348L, 349L, 396L, 392L, 256L, 432L, 596L, 331L, 338L,
    317L, 231L, 374L, 256L, 264L, 401L, 300L, 277L, 259L, 290L, 331L,
    359L, 478L, 664L, 323L, 290L, 255L, 282L, 362L, 254L, 207L, 400L,
    260L, 190L, 357L, 355L, 322L, 301L, 250L, 258L, 405L, 571L, 250L,
    309L, 319L, 302L, 308L, 284L, 322L, 562L, 528L, 254L, 222L, 335L,
    194L, 291L, 303L, 189L, 162L, 269L, 284L, 194L, 340L, 208L, 249L,
    206L, 161L, 213L, 178L, 178L, 225L, 169L, 209L, 157L, 204L, 162L,
    366L, 165L, 172L, 459L, 164L, 214L, 191L, 204L, 378L, 231L, 221L,
    153L, 170L, 228L, 243L, 284L, 169L, 185L, 153L, 154L, 168L, 153L,
    160L, 333L, 295L, 252L, 153L, 137L, 339L, 229L, 206L, 307L, 345L,
    165L, 244L, 205L, 383L, 289L, 180L, 386L, 325L, 273L, 141L, 229L,
    203L, 274L, 255L, 170L, 195L, 212L, 172L, 271L, 214L, 1203L,
    174L, 240L, 138L, 212L, 261L, 244L, 166L, 225L, 438L, 228L, 239L,
    190L, 228L, 147L, 158L, 335L, 152L, 243L, 161L, 236L, 149L, 285L,
    336L, 236L, 179L, 262L, 165L, 295L, 224L, 220L, 208L, 244L, 454L,
    213L, 161L, 293L, 267L, 172L, 222L, 288L, 350L, 368L, 274L, 213L,
    222L, 377L, 169L, 145L, 160L, 231L, 300L, 139L, 179L, 155L, 189L,
    127L, 199L, 296L, 127L, 278L, 225L, 177L, 366L, 173L, 256L, 129L,
    191L, 247L, 180L, 149L, 186L, 189L, 175L, 220L, 209L, 199L, 120L,
    771L, 326L, 303L, 197L, 263L, 203L, 328L, 293L, 214L, 160L, 276L,
    303L, 173L, 190L, 178L, 132L, 189L, 195L, 231L, 158L, 169L, 235L,
    179L, 602L, 231L, 177L, 190L, 136L, 314L, 220L, 220L, 216L, 241L,
    224L, 157L, 307L, 273L, 360L, 326L, 234L, 144L, 201L, 263L, 271L,
    184L, 217L, 224L, 316L, 109L, 150L, 480L, 276L, 178L, 1419L,
    260L, 228L, 153L, 292L, 303L, 303L, 255L, 242L, 130L, 214L, 275L,
    202L, 217L, 134L, 363L, 167L, 298L, 227L, 265L, 229L, 254L, 283L,
    219L, 190L, 204L, 205L, 181L, 209L, 199L, 422L, 410L, 197L, 253L,
    259L, 188L, 229L, 218L, 243L, 315L, 153L, 259L, 205L, 214L, 250L,
    237L, 117L, 210L, 239L, 252L, 150L, 169L, 228L, 206L, 213L, 135L,
    259L, 377L, 185L, 182L, 199L, 183L, 351L, 258L, 195L, 221L, 452L,
    263L, 267L, 351L, 218L, 257L, 189L, 277L, 313L, 175L, 312L, 161L,
    227L, 255L, 195L, 239L, 354L, 344L, 276L, 257L, 212L, 376L, 178L,
    201L, 248L, 220L, 228L, 179L, 211L, 305L, 228L, 310L, 146L, 326L,
    346L, 304L, 281L, 325L, 373L, 262L, 322L, 549L, 419L, 338L, 363L,
    361L, 492L, 942L, 565L, 1118L, 574L, 512L, 577L, 598L, 620L,
    945L, 797L, 971L, 1548L, 55L, 16L, 58L, 25L, 50L, 18L, 24L, 30L,
    15L, 32L, 13L, 18L, 27L, 6L, 24L, 28L, 23L, 26L, 17L, 15L, 52L,
    16L, 67L, 4L, 19L, 8L, 15L, 15L, 8L, 7L, 5L, 13L, 4L, 14L, 8L,
    18L, 7L, 3L, 2L, 10L, 3L, 15L, 3L, 2L, 6L, 6L, 8L, 5L, 8L, 7L,
    20L
  )), class = "data.frame", row.names = c(NA, -484L)), structure(list(
    time = structure(c(
      1585908000, 1585911600, 1585912500, 1585913700,
      1585914000, 1585914300, 1585914600, 1585914900, 1585915200,
      1585915500, 1585915800, 1585916100, 1585916700, 1585917000,
      1585917300, 1585917600, 1585917900, 1585918800, 1585919100,
      1585919400, 1585919700, 1585920000, 1585920300, 1585920600,
      1585920900, 1585921200, 1585921500, 1585921800, 1585922100,
      1585922400, 1585922700, 1585923000, 1585923300, 1585923600,
      1585923900, 1585924200, 1585924500, 1585924800, 1585925100,
      1585925400, 1585925700, 1585926000, 1585926300, 1585926600,
      1585926900, 1585927200, 1585927500, 1585927800, 1585928100,
      1585928400, 1585928700, 1585929000, 1585929300, 1585929600,
      1585929900, 1585930200, 1585930500, 1585930800, 1585931100,
      1585931400, 1585931700, 1585932000, 1585932300, 1585932600,
      1585932900, 1585933200, 1585933500, 1585933800, 1585934100,
      1585934400, 1585934700, 1585935000, 1585935300, 1585935600,
      1585935900, 1585936200, 1585936500, 1585936800, 1585937100,
      1585937400, 1585937700, 1585938000, 1585938300, 1585938600,
      1585938900, 1585939200, 1585939500, 1585939800, 1585940100,
      1585940400, 1585940700, 1585941000, 1585941300, 1585941600,
      1585941900, 1585942200, 1585942500, 1585942800, 1585943100,
      1585943400, 1585943700, 1585944000, 1585944300, 1585944600,
      1585945200, 1585945800, 1585946100, 1585946700, 1585947000,
      1585947300, 1585947900, 1585948800, 1585949100, 1585949400,
      1585950300, 1585951200, 1585951500, 1585951800, 1585953300,
      1585954200, 1585956300, 1585956900, 1585957500, 1585957800,
      1585958100
    ), tzone = "America/New_York", class = c(
      "POSIXct",
      "POSIXt"
    )), n = c(
      17L, 38L, 51L, 36L, 25L, 19L, 9L, 93L,
      51L, 15L, 52L, 40L, 48L, 66L, 27L, 58L, 31L, 56L, 22L, 25L,
      62L, 127L, 86L, 3007L, 1955L, 1953L, 1442L, 1527L, 2155L,
      1443L, 1414L, 1486L, 1854L, 1778L, 1533L, 1214L, 1275L, 936L,
      964L, 1324L, 1151L, 1003L, 1034L, 968L, 1176L, 1252L, 1301L,
      1354L, 1097L, 2072L, 1025L, 1301L, 962L, 1127L, 1185L, 1166L,
      1280L, 1242L, 1427L, 1082L, 962L, 1027L, 1197L, 896L, 979L,
      1719L, 1284L, 1126L, 884L, 972L, 1336L, 1211L, 1321L, 1168L,
      1212L, 1193L, 2352L, 1233L, 1042L, 1320L, 1175L, 998L, 1541L,
      1193L, 1081L, 1055L, 966L, 1138L, 1186L, 1554L, 1254L, 1130L,
      1408L, 1224L, 1086L, 1315L, 1629L, 1890L, 2723L, 3379L, 4881L,
      204L, 119L, 58L, 107L, 17L, 150L, 4L, 57L, 8L, 12L, 13L,
      4L, 14L, 8L, 25L, 5L, 13L, 15L, 3L, 8L, 6L, 13L, 8L, 27L
    )
  ), class = "data.frame", row.names = c(
    NA,
    -125L
  )), structure(list(time = structure(c(
    1585908000, 1585911600,
    1585912500, 1585913400, 1585914300, 1585915200, 1585916100, 1585917000,
    1585917900, 1585918800, 1585919700, 1585920600, 1585921500, 1585922400,
    1585923300, 1585924200, 1585925100, 1585926000, 1585926900, 1585927800,
    1585928700, 1585929600, 1585930500, 1585931400, 1585932300, 1585933200,
    1585934100, 1585935000, 1585935900, 1585936800, 1585937700, 1585938600,
    1585939500, 1585940400, 1585941300, 1585942200, 1585943100, 1585944000,
    1585944900, 1585945800, 1585946700, 1585947600, 1585948500, 1585949400,
    1585950300, 1585951200, 1585953000, 1585953900, 1585955700, 1585956600,
    1585957500
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(
    17L, 38L, 51L, 61L, 121L, 118L, 88L, 151L,
    31L, 103L, 275L, 6915L, 5124L, 4343L, 5165L, 3425L, 3439L, 3005L,
    3729L, 4523L, 3288L, 3478L, 3949L, 3071L, 3072L, 4129L, 3192L,
    3700L, 4757L, 3595L, 3714L, 3329L, 3290L, 3938L, 3718L, 4834L,
    10983L, 381L, 107L, 167L, 69L, 12L, 17L, 14L, 8L, 43L, 15L, 3L,
    8L, 6L, 48L
  )), class = "data.frame", row.names = c(NA, -51L)),
  structure(list(time = structure(c(
    1585908000, 1585911600,
    1585915200, 1585918800, 1585922400, 1585926000, 1585929600,
    1585933200, 1585936800, 1585940400, 1585944000, 1585947600,
    1585951200, 1585954800
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(
    17L, 271L, 388L, 12417L, 16372L, 14545L,
    13570L, 15778L, 13928L, 23473L, 724L, 51L, 61L, 62L
  )), class = "data.frame", row.names = c(
    NA,
    -14L
  )), structure(list(time = structure(c(
    1585540800, 1585627200,
    1585713600, 1585800000, 1585886400
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(
    NA,
    -5L
  )), structure(list(time = structure(c(
    1583038800, 1583643600,
    1584244800, 1584849600, 1585454400, 1586059200
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(5L, 5L, 5L, 5L, 5L, 4L)), class = "data.frame", row.names = c(
    NA,
    -6L
  )), structure(list(time = structure(c(
    1580619600, 1581829200,
    1583038800, 1584244800, 1585454400
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(10L, 9L, 10L, 10L, 9L)), class = "data.frame", row.names = c(
    NA,
    -5L
  )), structure(list(time = structure(c(
    1575781200, 1578200400,
    1580619600, 1583038800, 1585454400
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(18L, 19L, 19L, 20L, 19L)), class = "data.frame", row.names = c(
    NA,
    -5L
  )), structure(list(time = structure(c(
    1577682000, 1580360400,
    1583038800, 1585713600
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(21L, 21L, 22L, 21L)), class = "data.frame", row.names = c(
    NA,
    -4L
  )), structure(list(time = structure(c(
    1567137600, 1572408000,
    1577682000, 1583038800
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(42L, 41L, 42L, 43L)), class = "data.frame", row.names = c(
    NA,
    -4L
  )), structure(list(time = structure(c(
    1561953600, 1569902400,
    1577854800
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(64L, 64L, 62L)), class = "data.frame", row.names = c(
    NA,
    -3L
  )), structure(list(time = structure(c(
    1569816000, 1577682000,
    1585540800
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(63L, 62L, 41L)), class = "data.frame", row.names = c(
    NA,
    -3L
  )), structure(list(time = structure(c(
    1451538000, 1483160400,
    1514696400, 1546232400, 1577768400
  ), tzone = "America/New_York", class = c(
    "POSIXct",
    "POSIXt"
  )), n = c(253L, 251L, 250L, 252L, 102L)), class = "data.frame", row.names = c(
    NA,
    -5L
  ))
)
purrr::map(test, tsibble::as_tsibble, index = time, regular = FALSE)

yogat3ch commented 3 years ago

This functionality could potentially be enabled by passing "guess" to the regular argument if we wanted to make the behavior explicit and separate it clearly in the documentation.

earowang commented 3 years ago

If it's quarterly data, we suggest using yearquarter to represent it instead of Date.
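Applied to the timestamps from the original report, that suggestion might look like the sketch below. It assumes the tsibble package is installed and uses only its exported `yearquarter()` and `interval_pull()`; after conversion the interval should be reported in quarters rather than as 1h.

```r
library(tsibble)

x <- as.POSIXct(
  c(1561867200, 1569816000, 1577682000, 1585540800),
  origin = "1970-01-01", tz = "America/New_York"
)

# Convert the timestamps to tsibble's yearquarter class before pulling the interval
yq <- yearquarter(x)
interval_pull(yq)
```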