Closed adhi-r closed 1 week ago
@robjhyndman you might know how to do this?
For this you would want to use a length 1 strike_price, although I see how this result could be surprising. We reached this behaviour after some discussion in #52; however, it could probably be revisited for further improvement.
library(fpp3)
#> -- Attaching packages ---------------------------------------------- fpp3 0.5 --
#> v tibble 3.2.1 v tsibble 1.1.4
#> v dplyr 1.1.4 v tsibbledata 0.4.1
#> v tidyr 1.3.1 v feasts 0.3.2
#> v lubridate 1.9.3 v fable 0.3.4
#> v ggplot2 3.5.1 v fabletools 0.4.2
#> -- Conflicts ------------------------------------------------- fpp3_conflicts --
#> x lubridate::date() masks base::date()
#> x dplyr::filter() masks stats::filter()
#> x tsibble::intersect() masks base::intersect()
#> x tsibble::interval() masks lubridate::interval()
#> x dplyr::lag() masks stats::lag()
#> x tsibble::setdiff() masks base::setdiff()
#> x tsibble::union() masks base::union()
google_stock <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2015) |>
  mutate(day = row_number()) |>
  update_tsibble(index = day, regular = TRUE)
# Filter the year of interest
google_2015 <- google_stock |> filter(year(Date) == 2015)
g_fcasts <- google_2015 |>
  model(NAIVE(Close)) |>
  forecast(h = 10)
# A length 1 strike price is applied to every forecast distribution
g_fcasts |>
  mutate(strike_price = 765,
         probability = distributional::cdf(Close, 765))
#> # A fable: 10 x 7 [1]
#> # Key: Symbol, .model [1]
#> Symbol .model day Close .mean strike_price probability
#> <chr> <chr> <dbl> <dist> <dbl> <dbl> <dbl>
#> 1 GOOG NAIVE(Close) 253 N(759, 125) 759. 765 0.708
#> 2 GOOG NAIVE(Close) 254 N(759, 250) 759. 765 0.651
#> 3 GOOG NAIVE(Close) 255 N(759, 376) 759. 765 0.624
#> 4 GOOG NAIVE(Close) 256 N(759, 501) 759. 765 0.608
#> 5 GOOG NAIVE(Close) 257 N(759, 626) 759. 765 0.597
#> 6 GOOG NAIVE(Close) 258 N(759, 751) 759. 765 0.588
#> 7 GOOG NAIVE(Close) 259 N(759, 876) 759. 765 0.582
#> 8 GOOG NAIVE(Close) 260 N(759, 1002) 759. 765 0.577
#> 9 GOOG NAIVE(Close) 261 N(759, 1127) 759. 765 0.572
#> 10 GOOG NAIVE(Close) 262 N(759, 1252) 759. 765 0.569
Created on 2024-06-26 with reprex v2.1.0
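As a sanity check on what the probability column means: for a normal forecast, cdf(Close, 765) is P(Close <= 765), which base R's pnorm() also computes. The sketch below uses the rounded means and variances printed in the fable above (distributional prints normals as N(mean, variance)), so it should land close to, but not exactly on, the probabilities shown.

```r
# Rounded parameters copied from the first three printed forecasts.
# The underlying mean is not exactly 759, so expect small differences.
mu <- 759
variance <- c(125, 250, 376)
strike <- 765

# P(Close <= strike) at each horizon; pnorm() takes a standard deviation,
# so the printed variances are square-rooted first.
approx_prob <- pnorm(strike, mean = mu, sd = sqrt(variance))
round(approx_prob, 2)
```

The probabilities shrink towards 0.5 as the horizon grows because the forecast variance increases while the mean stays fixed.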
Thanks for the response, Mitchell! A few hours ago I found that using rowwise() actually solves this.
I should also mention that my minimal example has only one strike price, so yes, you can just hardcode it like that. In reality, I have many "strikes" I want to compute a cdf for, across many different models and forecasts. rowwise() allows that!
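A minimal sketch of the rowwise() approach described above, not the poster's actual code. It assumes a plain tibble with a distribution column behaves like the fable in this thread; the forecasts object and its values are hypothetical stand-ins.

```r
library(dplyr)
library(distributional)

# Hypothetical stand-in for a fable: three normal forecasts, each paired
# with its own strike price (values are illustrative only).
forecasts <- tibble(
  day = 1:3,
  Close = dist_normal(mu = 759, sigma = sqrt(c(125, 250, 376))),
  strike_price = c(760, 765, 770)
)

# rowwise() evaluates cdf() once per row, so each call sees a single
# distribution and a single strike price.
rowwise_probs <- forecasts |>
  rowwise() |>
  mutate(probability = cdf(Close, strike_price)) |>
  ungroup()
```

The trade-off is one cdf() call per row, which may be slow on large forecast tables.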
Ah, I also remember from #52 that we allowed list-type inputs to vectorise across values. For example:
library(fpp3)
google_stock <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2015) |>
  mutate(day = row_number()) |>
  update_tsibble(index = day, regular = TRUE)
# Filter the year of interest
google_2015 <- google_stock |> filter(year(Date) == 2015)
g_fcasts <- google_2015 |>
  model(NAIVE(Close)) |>
  forecast(h = 10)
# Wrapping strike_price in a tibble pairs each strike with its row's distribution
g_fcasts |>
  mutate(strike_price = 765,
         probability = distributional::cdf(Close, tibble(strike_price)))
#> # A fable: 10 x 7 [1]
#> # Key: Symbol, .model [1]
#> Symbol .model day Close .mean strike_price probability$strike_p~1
#> <chr> <chr> <dbl> <dist> <dbl> <dbl> <dbl>
#> 1 GOOG NAIVE(Cl~ 253 N(759, 125) 759. 765 0.708
#> 2 GOOG NAIVE(Cl~ 254 N(759, 250) 759. 765 0.651
#> 3 GOOG NAIVE(Cl~ 255 N(759, 376) 759. 765 0.624
#> 4 GOOG NAIVE(Cl~ 256 N(759, 501) 759. 765 0.608
#> 5 GOOG NAIVE(Cl~ 257 N(759, 626) 759. 765 0.597
#> 6 GOOG NAIVE(Cl~ 258 N(759, 751) 759. 765 0.588
#> 7 GOOG NAIVE(Cl~ 259 N(759, 876) 759. 765 0.582
#> 8 GOOG NAIVE(Cl~ 260 N(759, 1002) 759. 765 0.577
#> 9 GOOG NAIVE(Cl~ 261 N(759, 1127) 759. 765 0.572
#> 10 GOOG NAIVE(Cl~ 262 N(759, 1252) 759. 765 0.569
#> # i abbreviated name: 1: probability$strike_price
Created on 2024-06-26 with reprex v2.1.0
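In the output above, probability is a nested data frame column (hence the probability$strike_price header). The sketch below assumes the data-frame input behaves as it does in that output, varies the strike by row, and pulls the result out as a plain numeric column. The forecasts tibble and its values are hypothetical stand-ins for a fable, and the behaviour may differ between distributional versions.

```r
library(dplyr)
library(distributional)

# Hypothetical stand-in for a fable, with a different strike on each row.
forecasts <- tibble(
  day = 1:3,
  Close = dist_normal(mu = 759, sigma = sqrt(c(125, 250, 376))),
  strike_price = c(760, 765, 770)
)

# Assumption: cdf() returns a data frame with one column per input column,
# so `$strike_price` extracts an ordinary numeric vector.
list_probs <- forecasts |>
  mutate(probability = cdf(Close, tibble(strike_price))$strike_price)
```

Unlike rowwise(), this keeps the computation in a single cdf() call.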
Hi, I'd like to use the distributional::cdf() function in a dplyr pipeline to mutate a column of probabilities. I'd think this is possible since it's supposed to be vectorized, but I get a colnames error that I've never seen before.
Can't mutate probabilities from the cdf in a data frame/dplyr pipeline.