AntoineSoetewey / statsandr

A blog on statistics and R aiming at helping academics and professionals working with data to grasp important concepts in statistics and to apply them in R. See www.statsandr.com
http://statsandr.com/

blog/covid-19-in-belgium/ #29

Closed utterances-bot closed 3 years ago

utterances-bot commented 3 years ago

COVID-19 in Belgium - Stats and R

This article presents an analysis of the novel COVID-19 coronavirus in Belgium using R. Feel free to apply it to your own country.

https://statsandr.com/blog/covid-19-in-belgium/

AntoineSoetewey commented 3 years ago

Comment written by Christian Soetewey on March 31, 2020 14:38:50:

Even if this is a statistical analysis, it is terrifying... Stay safe.
Thanks a lot for the analysis.

Comment written by Antoine Soetewey on March 31, 2020 15:00:42:

Thanks, stay safe too!

AntoineSoetewey commented 3 years ago

Comment written by Victor Martin on April 01, 2020 15:31:09:

Thank you very much !!!

I get an error today (yesterday was OK) when adding a date column and incidence data to the fitted cumulative data:

fitted_cumulative_incidence <- fitted_cumulative_incidence %>%
  mutate(
    Date = ymd(sir_start_date) + days(t - 1),
    Country = "Belgium",
    cumulative_incident_cases = Infected
  )

Error: Column cumulative_incident_cases must be length 57 (the number of rows) or one, not 56

Comment written by Antoine Soetewey on April 01, 2020 20:07:37:

I have updated the code, see here: https://github.com/AntoineSoetewey/statsandr/blob/master/content/blog/2020-03-31-covid-19-in-belgium.Rmd#L211.

This should work now, let me know if not.

Regards, Antoine
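
The error reported above is a length mismatch: the data frame built from `t` has 57 rows, while `Infected` holds 56 values. The actual fix is in the linked file and is not reproduced here. The following is only a minimal sketch of one way to avoid such a mismatch, assuming the time index can be derived from the observed series itself. The values in `Infected` and the start date are made up for illustration, and base R dates stand in for `ymd()` and `days()`.

```r
# Hypothetical observed series (made-up values, not real data)
Infected <- c(2, 5, 9, 14, 23)
sir_start_date <- as.Date("2020-03-01") # hypothetical start date

# Derive the time index from the data so both always have the same length,
# instead of computing it from a hard-coded end date
t <- seq_along(Infected)

fitted_cumulative_incidence <- data.frame(time = t)
fitted_cumulative_incidence$Date <- sir_start_date + (t - 1)
fitted_cumulative_incidence$cumulative_incident_cases <- Infected

# When predicting beyond the observed period, pad the observations with NA
t_long <- 1:10
observed_padded <- c(Infected, rep(NA, length(t_long) - length(Infected)))
```

Deriving `t` from the data keeps the two lengths in sync when a new day of data arrives, which is the situation described above ("yesterday was OK").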

Comment written by Victor Martin on April 01, 2020 20:54:30:

It works perfectly, thank you very much once more for everything.

Victor

AntoineSoetewey commented 3 years ago

Comment written by Michael Owusu on April 14, 2020 01:02:40:

Hi Antoine, 

I tried running your code and got the error below:

Error in checkInput(y, times, func, rtol, atol, jacfunc, tcrit, hmin,  :
  'hmax' must be a non-negative value
In addition: Warning message:
In max(abs(diff(times))) :
  Error in checkInput(y, times, func, rtol, atol, jacfunc, tcrit, hmin,  :
  'hmax' must be a non-negative value

Is there something I'm doing wrong? I'm using a MacBook Pro and RStudio.

Thanks

Comment written by Antoine Soetewey on April 15, 2020 15:30:45:

Dear Michael, 

Are you running the code for Belgium or for another country?
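
One plausible cause of this error (an assumption, since the thread does not confirm it) is that the vector passed as `times` contains a single time point, for example because the country and date filter returned only one row. The warning in the message points at `max(abs(diff(times)))`, and that expression is `-Inf` when there is only one time point. The base R sketch below reproduces the computation and adds a guard. `check_times()` is a hypothetical helper, not part of deSolve, and the values are made up.

```r
# With a single time point, diff() returns an empty vector and max() of an
# empty vector is -Inf (with a warning), which is not a valid step size
Day <- 1:1
hmax <- suppressWarnings(max(abs(diff(Day))))
hmax

# Hypothetical guard to run before calling ode(): fail early with a clearer message
check_times <- function(observed) {
  if (length(observed) < 2) {
    stop("Need at least two observations; check the country name and date range.")
  }
  seq_along(observed)
}

Infected <- c(3, 7, 12) # made-up values for illustration
Day <- check_times(Infected)
```

Inspecting `length(Infected)` right after the `subset()` call is usually enough to tell whether the filter matched the expected number of rows.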

AntoineSoetewey commented 3 years ago

Comment written by José Moniz Fernandes on April 16, 2020 01:01:45:

Hi, 

Another problem: why isn't the data updated?

I applied the code from the article to the Cabo Verde data:

https://statsandr.com/blog/covid-19-in-belgium/

SIR <- function(time, state, parameters) {
  par <- as.list(c(state, parameters))
  with(par, {
    dS <- -beta * I * S / N
    dI <- beta * I * S / N - gamma * I
    dR <- gamma * I
    list(c(dS, dI, dR))
  })
}

# to create a vector with the daily cumulative incidence for Belgium, from February 4 (when our daily incidence data starts)
# devtools::install_github("RamiKrispin/coronavirus")
library(coronavirus)
data(coronavirus)

`%>%` <- magrittr::`%>%`

# extract the cumulative incidence
df <- coronavirus %>%
  dplyr::filter(Country.Region == "Cabo Verde") %>%
  dplyr::group_by(date, type) %>%
  dplyr::summarise(total = sum(cases, na.rm = TRUE)) %>%
  tidyr::pivot_wider(
    names_from = type,
    values_from = total
  ) %>%
  dplyr::arrange(date) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(active = confirmed - death - recovered) %>%
  dplyr::mutate(
    confirmed_cum = cumsum(confirmed),
    death_cum = cumsum(death),
    recovered_cum = cumsum(recovered),
    active_cum = cumsum(active)
  )

# put the daily cumulative incidence numbers for Belgium from
# Feb 4 to March 30 into a vector called Infected
library(lubridate)
Infected <- subset(df, date >= ymd("2020-03-20") & date <= ymd("2020-04-12"))$active_cum

# Create an incrementing Day vector the same length as our
# cases vector
Day <- 1:(length(Infected))

# now specify initial values for N, S, I and R
N <- 556586
init <- c(
  S = N - Infected[1],
  I = Infected[1],
  R = 0
)

# define a function to calculate the residual sum of squares
# (RSS), passing in parameters beta and gamma that are to be
# optimised for the best fit to the incidence data
RSS <- function(parameters) {
  names(parameters) <- c("beta", "gamma")
  out <- ode(y = init, times = Day, func = SIR, parms = parameters)
  fit <- out[, 3]
  sum((Infected - fit)^2)
}

# now find the values of beta and gamma that give the
# smallest RSS, which represents the best fit to the data.
# Start with values of 0.5 for each, and constrain them to
# the interval 0 to 1.0
# install.packages("deSolve")
library(deSolve)
Opt <- optim(c(0.5, 0.5),
  RSS,
  method = "L-BFGS-B",
  lower = c(0, 0),
  upper = c(1, 1)
)

# check for convergence
Opt$message

Opt_par <- setNames(Opt$par, c("beta", "gamma"))
Opt_par

sir_start_date <- "2020-03-20"

# time in days for predictions
t <- 1:as.integer(ymd("2020-04-13") - ymd(sir_start_date))

# get the fitted values from our SIR model
fitted_cumulative_incidence <- data.frame(ode(
  y = init, times = t,
  func = SIR, parms = Opt_par
))

# add a Date column and the observed incidence data
library(dplyr)
fitted_cumulative_incidence <- fitted_cumulative_incidence %>%
  mutate(
    Date = ymd(sir_start_date) + days(t - 1),
    Country = "Cabo Verde",
    cumulative_incident_cases = Infected
  )

# plot the data
library(ggplot2)
fitted_cumulative_incidence %>%
  ggplot(aes(x = Date)) +
  geom_line(aes(y = I), colour = "red") +
  geom_point(aes(y = cumulative_incident_cases), colour = "blue") +
  labs(
    y = "Cumulative incidence",
    title = "COVID-19 fitted vs observed cumulative incidence, Belgium",
    subtitle = "(Red = fitted incidence from SIR model, blue = observed incidence)"
  ) +
  theme_minimal()

fitted_cumulative_incidence %>%
  ggplot(aes(x = Date)) +
  geom_line(aes(y = I), colour = "red") +
  geom_point(aes(y = cumulative_incident_cases), colour = "blue") +
  labs(
    y = "Cumulative incidence",
    title = "COVID-19 fitted vs observed cumulative incidence, Belgium",
    subtitle = "(Red = fitted incidence from SIR model, blue = observed incidence)"
  ) +
  theme_minimal() +
  scale_y_log10(labels = scales::comma)

# We can compute it in R:
Opt_par
R0 <- as.numeric(Opt_par[1] / Opt_par[2])
R0

# time in days for predictions
t <- 1:120

# get the fitted values from our SIR model
fitted_cumulative_incidence <- data.frame(ode(
  y = init, times = t,
  func = SIR, parms = Opt_par
))

# add a Date column and join the observed incidence data
fitted_cumulative_incidence <- fitted_cumulative_incidence %>%
  mutate(
    Date = ymd(sir_start_date) + days(t - 1),
    Country = "Cabo Verde",
    cumulative_incident_cases = c(Infected, rep(NA, length(t) - length(Infected)))
  )

# plot the data
fitted_cumulative_incidence %>%
  ggplot(aes(x = Date)) +
  geom_line(aes(y = I), colour = "red") +
  geom_line(aes(y = S), colour = "black") +
  geom_line(aes(y = R), colour = "green") +
  geom_point(aes(y = cumulative_incident_cases),
    colour = "blue"
  ) +
  scale_y_continuous(labels = scales::comma) +
  labs(y = "Persons", title = "COVID-19 fitted vs observed cumulative incidence, Belgium") +
  scale_colour_manual(name = "", values = c(
    red = "red", black = "black",
    green = "green", blue = "blue"
  ), labels = c(
    "Susceptible",
    "Recovered", "Observed incidence", "Infectious"
  )) +
  theme_minimal()

# plot the data
fitted_cumulative_incidence %>%
  ggplot(aes(x = Date)) +
  geom_line(aes(y = I, colour = "red")) +
  geom_line(aes(y = S, colour = "black")) +
  geom_line(aes(y = R, colour = "green")) +
  geom_point(aes(y = cumulative_incident_cases, colour = "blue")) +
  scale_y_log10(labels = scales::comma) +
  labs(
    y = "Persons",
    title = "COVID-19 fitted vs observed cumulative incidence, Belgium"
  ) +
  scale_colour_manual(
    name = "",
    values = c(red = "red", black = "black", green = "green", blue = "blue"),
    labels = c("Susceptible", "Observed incidence", "Recovered", "Infectious")
  ) +
  theme_minimal()

fit <- fitted_cumulative_incidence

# peak of pandemic
fit[fit$I == max(fit$I), c("Date", "I")]

# severe cases
max_infected <- max(fit$I)
max_infected / 5

# cases with need for intensive care
max_infected * 0.06

# deaths with supposed 0.7% fatality rate
max_infected * 0.007

Comment written by Antoine Soetewey on April 16, 2020 09:08:47:

Dear José,

To update the dataset, you need to use the update_datasets() function or reinstall the {coronavirus} package with devtools::install_github("RamiKrispin/coronavirus").  After doing this, check that you have the latest available data before running your code.

Hope this helps.

Regards, Antoine
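
A quick way to check whether the data are current is to look at the most recent date in the dataset. The sketch below uses a small made-up data frame in place of the `coronavirus` dataset (only the `date` column name is taken from the code above), and `days_behind()` is a hypothetical helper introduced for illustration.

```r
# Made-up stand-in for the coronavirus dataset (illustration only)
toy <- data.frame(
  date = as.Date(c("2020-04-10", "2020-04-11", "2020-04-12")),
  cases = c(1, 0, 2)
)

# Hypothetical helper: days between the latest record and a reference day
days_behind <- function(dates, today = Sys.Date()) {
  as.integer(today - max(dates))
}

max(toy$date)
days_behind(toy$date, today = as.Date("2020-04-16"))
```

With the real dataset loaded, the same check would be applied to its `date` column after updating or reinstalling the package.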

AntoineSoetewey commented 3 years ago

Comment written by Mohammad Abdullah on April 16, 2020 14:36:56:

Thank you, Antoine, for this remarkable work. I congratulate you on it.

I have some questions:

Q.1 In max_infected / 5, what exactly does the 5 represent? Is it 5 months?
Q.2 In the line for cases needing intensive care, max_infected * 0.06, what does the number 0.06 represent?
Q.3 How can we calculate the fatality rate?

Comment written by Antoine Soetewey on April 16, 2020 14:48:13:

Thanks for your feedback Mohammad.

Q.1 One out of every five cases is considered severe (this is equivalent to max_infected * 0.2).
Q.2 6% of the cases are expected to need intensive care.
Q.3 Again, we use a 0.7% fatality rate, so we multiply by 0.007.

These 3 figures were chosen to be in line with other resources I have found online. However, they may be different in your country (and may even vary for Belgium in the future) so if you believe that they are under or overestimated you can choose your own.

Hope this helps. 

Regards, Antoine
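
As a worked example of the three factors in the answer above, take a made-up peak of 100,000 simultaneous infections (a hypothetical figure, not a result of the model). This gives 20,000 severe cases, 6,000 cases needing intensive care, and 700 deaths:

```r
max_infected <- 100000 # hypothetical peak, for illustration only

severe <- max_infected * 0.2 # one in five cases, same as max_infected / 5
intensive_care <- max_infected * 0.06 # 6% of cases
deaths <- max_infected * 0.007 # 0.7% fatality rate

c(severe = severe, intensive_care = intensive_care, deaths = deaths)
```

Replacing 0.2, 0.06 and 0.007 with rates that suit another country only changes the multipliers; the structure of the calculation stays the same.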

Comment written by Mohammad Abdullah on April 16, 2020 18:30:50:

Hi again Antoine, thank you for answering my first questions.

Please look at the data for my country (Saudi Arabia): https://drive.google.com/drive/folders/16ZZIKnZ71Jsk2vloxSbRHg24l3vw_mio?usp=sharing

I have the following problems:

1) The dates are not sorted correctly, so after the accumulation step the cumulative values end up on the wrong dates. For example, the final cumulative number of confirmed cases (4462) should be on 12 April 2020, but it is written on 31 March 2020.
2) I have a problem with the date when I convert it to ymd format, so Infected is always empty, e.g.:

> Infected
numeric(0)

Here is the code I wrote:

DataSa <- as.data.frame(Saudi_Arabia)
#OR
DataSa <- as.data.frame(Saudi_Arabia_txt)
#DataSa$date<-as.Date(as.character(DataSa$date), format = "%d")
#df<-as.data.frame(DataSa)
#devtools::install_github("RamiKrispin/coronavirus")
#library(coronavirus)
#data(coronavirus)
#data(DataSa)

# extract the cumulative incidence
df1 <- DataSa %>%
  dplyr::filter(Province.State == "Saudi Arabia") %>%
  dplyr::group_by(date, type) %>%
  dplyr::summarise(total = sum(cases, na.rm = TRUE)) %>%
  tidyr::pivot_wider(
    names_from = type,
    values_from = total
  ) %>%
  dplyr::arrange(date) %>%
  dplyr::ungroup() %>%
  dplyr::mutate(active = confirmed - death - recovered) %>%
  dplyr::mutate(
    confirmed_cum = cumsum(confirmed),
    death_cum = cumsum(death),
    recovered_cum = cumsum(recovered),
    active_cum = cumsum(active)
  )

# put the daily cumulative incidence numbers for Belgium from
library(lubridate)
Infected <- subset(df1, date >= ymd("2020-03-02") & date <= ymd("2020-04-06"))$active_cum
#OR
#Infected <- subset(df1, date >= dym("02-03-2020") & date <= dym("06-04-2020"))$active_cum

Please advise.

Comment written by Mohammad Abdullah on April 16, 2020 18:33:39:

Hi Antoine,

Here is the link to my CSV and TXT files: https://drive.google.com/open?id=16ZZIKnZ71Jsk2vloxSbRHg24l3vw_mio
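
Both symptoms described above would be consistent with the `date` column being stored as text in day-month-year order. Text sorts alphabetically, so 31-03-2020 lands after 12-04-2020, and a filter against `ymd()` dates can then come back empty. This is an assumption, since the file contents are not shown in the thread. Here is a minimal base R sketch with made-up rows that parses the text into real dates before sorting and accumulating:

```r
# Made-up rows with dates stored as day-month-year text (illustration only)
raw <- data.frame(
  date = c("12-04-2020", "31-03-2020", "02-03-2020"),
  confirmed = c(5, 3, 1),
  stringsAsFactors = FALSE
)

# Sorting the text column puts 31 March last, which is the symptom described
sort(raw$date)

# Parse into Date objects first, then sort and accumulate
raw$date <- as.Date(raw$date, format = "%d-%m-%Y")
raw <- raw[order(raw$date), ]
raw$confirmed_cum <- cumsum(raw$confirmed)

# Date filters now behave as expected
window <- subset(raw, date >= as.Date("2020-03-02") & date <= as.Date("2020-04-06"))
```

In a pipeline like the one above, the parsing step would go before `group_by(date, type)`, so that `arrange(date)` orders the rows chronologically.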

Comment written by José Moniz Fernandes on April 17, 2020 00:59:20:

Thanks.

 # peak of pandemic    fit[fit$I == max(fit$I), c("Date", "I")]   # severe cases    max_infected <- max(fit$I)    max_infected / 5   # cases with need for intensive care    max_infected 0.06      # deaths with supposed 0.7% fatality rate    max_infected 0.007

Comment written by Antoine Soetewey on April 16, 2020 09:08:47: Dear José, To update the dataset, you need to use the update_datasets() function or reinstall the {coronavirus} package with devtools::install_github("RamiKrispin/coronavirus").  After doing this, check that you have the latest available data before running your code. Hope this helps. Regards, Antoine
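One way to check that you have the latest available data, as suggested above, is to look at the most recent date in the dataset. The sketch below is only an illustration: `latest_date()` is a hypothetical helper (not part of the {coronavirus} package) and `toy` is a made-up stand-in for the real data, assuming only that the dataset has a `date` column.

```r
# Hypothetical helper: most recent date found in a data frame
# that has a `date` column (as the coronavirus dataset does)
latest_date <- function(data) {
  max(as.Date(data$date), na.rm = TRUE)
}

# Toy stand-in for the real dataset, used only to show the idea
toy <- data.frame(
  date = as.Date(c("2020-04-10", "2020-04-12", "2020-04-11")),
  cases = c(1, 2, 3)
)
latest_date(toy)

# With the real package installed, the same check would be:
# library(coronavirus)
# data(coronavirus)
# latest_date(coronavirus)
```

If the date returned is older than expected, the dataset has not been refreshed yet.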

Comment written by José Moniz Fernandes on April 17, 2020 00:59:20:

Thanks.

Comment written by José Moniz Fernandes on April 17, 2020 01:12:08:

In "# severe cases  max_infected <- max(fit$I)  max_infected / 5", why five (5)?

AntoineSoetewey commented 3 years ago

Comment written by José Moniz Fernandes on April 17, 2020 01:14:21:

Why   # cases with need for intensive care  max_infected * 0.06  ??

AntoineSoetewey commented 3 years ago

Comment written by José Moniz Fernandes on April 17, 2020 01:14:58:

Why? # deaths with supposed 0.7% fatality rate: max_infected * 0.007 ???? Where do I find this number for my country?

AntoineSoetewey commented 3 years ago

Comment written by Antoine Soetewey on April 17, 2020 07:26:25:

I'll reply to all of your comments in one comment:

As written in the post, the 0.7% fatality rate comes from this source. For your country, simply take the same figure or do some research to estimate the fatality rate in your country.

Regarding severe cases: dividing by 5 is the same as multiplying by 0.2, and 20% means that 1 out of every 5 cases is considered a severe case. The 6% and 20% of infected cases for intensive care and severe cases (respectively) are estimates. As I already said in another comment, they may be different in your country. For more information on where I found these estimates, see the end of this article where the author applies the analyses to Germany.

As for the fatality rate, if you believe that these estimates are wrong (and it can be the case for countries other than Belgium), do some research and try to find better estimates for your own country.

PS: for your information, typing "????" does not add any value compared to "?", except that it looks aggressive.

Regards, Antoine
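The arithmetic described in this reply can be sketched with the three proportions written out as named values, so that they are easy to swap for country-specific estimates. The peak value of 100000 below is a made-up placeholder (in the original code it would be `max(fit$I)`), and the variable names `severe_rate`, `icu_rate` and `fatality_rate` are introduced here only for illustration.

```r
# Placeholder peak number of infectious people;
# in the original analysis this would be max(fit$I)
max_infected <- 100000

# Assumed proportions quoted in the thread; replace with
# estimates for your own country if you have better ones
severe_rate <- 0.20     # 1 case out of 5, i.e. max_infected / 5
icu_rate <- 0.06        # cases in need of intensive care
fatality_rate <- 0.007  # supposed 0.7% fatality rate

estimates <- c(
  severe = max_infected * severe_rate,
  intensive_care = max_infected * icu_rate,
  deaths = max_infected * fatality_rate
)
estimates
```

Keeping the rates in one place makes it clear that they are inputs to the calculation and not results of the SIR model.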

AntoineSoetewey commented 3 years ago

Comment written by Mohammad Abdullah on April 16, 2020 14:36:56:

Thank you Antoine for this remarkable work, I congratulate you on it. I have some questions:

Q.1 In max_infected / 5, what exactly does (5) represent? Is it (5) months?
Q.2 In "# cases with need for intensive care  max_infected * 0.06", what does the number (0.06) represent?
Q.3 How can we calculate the fatality rate?

Comment written by Antoine Soetewey on April 16, 2020 14:48:13:

Thanks for your feedback Mohammad.

Q.1 One out of every five cases is considered severe (it is equivalent to max_infected * 0.2).
Q.2 6% of the cases are expected to be in need of intensive care.
Q.3 Again, we use a 0.7% fatality rate so we multiply by 0.007.

These 3 figures were chosen to be in line with other resources I have found online. However, they may be different in your country (and may even vary for Belgium in the future), so if you believe that they are under- or overestimated you can choose your own.

Hope this helps.

Regards, Antoine

Comment written by Mohammad Abdullah on April 16, 2020 18:30:50:

Hi again Antoine, thank you for answering my first questions. Please look at the data for my country (Saudi Arabia): https://drive.google.com/drive/folders/16ZZIKnZ71Jsk2vloxSbRHg24l3vw_mio?usp=sharing

I have the following problems:

  1. The dates are not sorted correctly, so after the accumulation step the accumulated values end up on the wrong dates. E.g.: the final accumulated number of confirmed cases (4462) should be on 12-April-2020, but it is written on 31-March-2020.
  2. I have a problem with the date when I convert it to (ymd) format, so (Infected) is always empty, e.g.:

    > Infected
    numeric(0)

The code I wrote is here:

    DataSa<-as.data.frame(Saudi_Arabia)
    #OR  DataSa<-as.data.frame(Saudi_Arabia_txt)
    #DataSa$date<-as.Date(as.character(DataSa$date), format = "%d")
    #df<-as.data.frame(DataSa)
    #devtools::install_github("RamiKrispin/coronavirus")
    #library(coronavirus)
    #data(coronavirus)
    #data(DataSa)

    # extract the cumulative incidence
    df1 <- DataSa %>%
      dplyr::filter(Province.State == "Saudi Arabia") %>%
      dplyr::group_by(date, type) %>%
      dplyr::summarise(total = sum(cases, na.rm = TRUE)) %>%
      tidyr::pivot_wider(
        names_from = type,
        values_from = total
      ) %>%
      dplyr::arrange(date) %>%
      dplyr::ungroup() %>%
      dplyr::mutate(active = confirmed - death - recovered) %>%
      dplyr::mutate(
        confirmed_cum = cumsum(confirmed),
        death_cum = cumsum(death),
        recovered_cum = cumsum(recovered),
        active_cum = cumsum(active)
      )

    # put the daily cumulative incidence numbers for Belgium from
    library(lubridate)
    Infected <- subset(df1, date >= ymd("2020-03-02") & date <= ymd("2020-04-06"))$active_cum
    #OR
    #Infected <- subset(df1, date >= dym("02-03-2020") & date <= dym("06-04-2020"))$active_cum

Please advise.
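Both symptoms described above are what one would expect if the date column is stored as day-month-year text: text sorts alphabetically (so "31-03-2020" lands after "12-04-2020"), and comparing text with a Date rarely matches any row. The shared file is not shown in this thread, so this is only an assumption. The sketch below uses a small made-up data frame in place of the real CSV and base R only; the column names `date` and `confirmed` are illustrative.

```r
# Made-up stand-in for the CSV, with dates stored as
# day-month-year text (an assumption about the real file)
DataSa <- data.frame(
  date = c("12-04-2020", "31-03-2020", "01-04-2020"),
  confirmed = c(10, 5, 7),
  stringsAsFactors = FALSE
)

# Convert the text to real Date values; as.Date() gives NA
# when the format string does not match the text
DataSa$date <- as.Date(DataSa$date, format = "%d-%m-%Y")
stopifnot(!anyNA(DataSa$date))

# Sort chronologically BEFORE accumulating
DataSa <- DataSa[order(DataSa$date), ]
DataSa$confirmed_cum <- cumsum(DataSa$confirmed)

# Filtering by date now works because both sides are Dates
Infected <- subset(
  DataSa,
  date >= as.Date("2020-03-31") & date <= as.Date("2020-04-12")
)$confirmed_cum
Infected
```

With lubridate, `dmy()` parses day-month-year text, whereas `dym()` expects day-year-month and would not fit dates written like "02-03-2020".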

Comment written by Mohammad Abdullah on April 16, 2020 18:33:39:

Hi Antoine,

Here is the link to my CSV and TXT files: https://drive.google.com/open?id=16ZZIKnZ71Jsk2vloxSbRHg24l3vw_mio

Comment written by Mohammad Abdullah on April 17, 2020 12:01:57:

Hi Antoine, I just sent you a message; please reply to my email.
Thank you

AntoineSoetewey commented 3 years ago

Comment written by José Moniz Fernandes on April 16, 2020 01:01:45:

Hi,

Another problem: why isn't the data updated? I applied the code from https://statsandr.com/blog/covid-19-in-belgium/ to the Cabo Verde data:

    SIR <- function(time, state, parameters) {
      par <- as.list(c(state, parameters))
      with(par, {
        dS <- -beta * I * S / N
        dI <- beta * I * S / N - gamma * I
        dR <- gamma * I
        list(c(dS, dI, dR))
      })
    }

    # to create a vector with the daily cumulative incidence for Belgium, from February 4 (when our daily incidence data starts)
    # devtools::install_github("RamiKrispin/coronavirus")
    library(coronavirus)
    data(coronavirus)

    `%>%` <- magrittr::`%>%`

    # extract the cumulative incidence
    df <- coronavirus %>%
      dplyr::filter(Country.Region == "Cabo Verde") %>%
      dplyr::group_by(date, type) %>%
      dplyr::summarise(total = sum(cases, na.rm = TRUE)) %>%
      tidyr::pivot_wider(
        names_from = type,
        values_from = total
      ) %>%
      dplyr::arrange(date) %>%
      dplyr::ungroup() %>%
      dplyr::mutate(active = confirmed - death - recovered) %>%
      dplyr::mutate(
        confirmed_cum = cumsum(confirmed),
        death_cum = cumsum(death),
        recovered_cum = cumsum(recovered),
        active_cum = cumsum(active)
      )

    # put the daily cumulative incidence numbers for Belgium from
    # Feb 4 to March 30 into a vector called Infected
    library(lubridate)
    Infected <- subset(df, date >= ymd("2020-03-20") & date <= ymd("2020-04-12"))$active_cum

    # Create an incrementing Day vector the same length as our
    # cases vector
    Day <- 1:(length(Infected))

    # now specify initial values for N, S, I and R
    N <- 556586
    init <- c(
      S = N - Infected[1],
      I = Infected[1],
      R = 0
    )

    # define a function to calculate the residual sum of squares
    # (RSS), passing in parameters beta and gamma that are to be
    # optimised for the best fit to the incidence data
    RSS <- function(parameters) {
      names(parameters) <- c("beta", "gamma")
      out <- ode(y = init, times = Day, func = SIR, parms = parameters)
      fit <- out[, 3]
      sum((Infected - fit)^2)
    }

    # now find the values of beta and gamma that give the
    # smallest RSS, which represents the best fit to the data.
    # Start with values of 0.5 for each, and constrain them to
    # the interval 0 to 1.0

    # install.packages("deSolve")
    library(deSolve)
    Opt <- optim(c(0.5, 0.5),
      RSS,
      method = "L-BFGS-B",
      lower = c(0, 0),
      upper = c(1, 1)
    )

    # check for convergence
    Opt$message

    Opt_par <- setNames(Opt$par, c("beta", "gamma"))
    Opt_par

    sir_start_date <- "2020-03-20"

    # time in days for predictions
    t <- 1:as.integer(ymd("2020-04-13") - ymd(sir_start_date))

    # get the fitted values from our SIR model
    fitted_cumulative_incidence <- data.frame(ode(
      y = init, times = t,
      func = SIR, parms = Opt_par
    ))

    # add a Date column and the observed incidence data
    library(dplyr)
    fitted_cumulative_incidence <- fitted_cumulative_incidence %>%
      mutate(
        Date = ymd(sir_start_date) + days(t - 1),
        Country = "Cabo Verde",
        cumulative_incident_cases = Infected
      )

    # plot the data
    library(ggplot2)
    fitted_cumulative_incidence %>%
      ggplot(aes(x = Date)) +
      geom_line(aes(y = I), colour = "red") +
      geom_point(aes(y = cumulative_incident_cases), colour = "blue") +
      labs(
        y = "Cumulative incidence",
        title = "COVID-19 fitted vs observed cumulative incidence, Belgium",
        subtitle = "(Red = fitted incidence from SIR model, blue = observed incidence)"
      ) +
      theme_minimal()

    fitted_cumulative_incidence %>%
      ggplot(aes(x = Date)) +
      geom_line(aes(y = I), colour = "red") +
      geom_point(aes(y = cumulative_incident_cases), colour = "blue") +
      labs(
        y = "Cumulative incidence",
        title = "COVID-19 fitted vs observed cumulative incidence, Belgium",
        subtitle = "(Red = fitted incidence from SIR model, blue = observed incidence)"
      ) +
      theme_minimal() +
      scale_y_log10(labels = scales::comma)

    # We can compute it in R:
    Opt_par
    R0 <- as.numeric(Opt_par[1] / Opt_par[2])
    R0

    # time in days for predictions
    t <- 1:120

    # get the fitted values from our SIR model
    fitted_cumulative_incidence <- data.frame(ode(
      y = init, times = t,
      func = SIR, parms = Opt_par
    ))

    # add a Date column and join the observed incidence data
    fitted_cumulative_incidence <- fitted_cumulative_incidence %>%
      mutate(
        Date = ymd(sir_start_date) + days(t - 1),
        Country = "Cabo Verde",
        cumulative_incident_cases = c(Infected, rep(NA, length(t) - length(Infected)))
      )

    # plot the data
    fitted_cumulative_incidence %>%
      ggplot(aes(x = Date)) +
      geom_line(aes(y = I), colour = "red") +
      geom_line(aes(y = S), colour = "black") +
      geom_line(aes(y = R), colour = "green") +
      geom_point(aes(y = cumulative_incident_cases),
        colour = "blue"
      ) +
      scale_y_continuous(labels = scales::comma) +
      labs(y = "Persons", title = "COVID-19 fitted vs observed cumulative incidence, Belgium") +
      scale_colour_manual(name = "", values = c(
        red = "red", black = "black",
        green = "green", blue = "blue"
      ), labels = c(
        "Susceptible",
        "Recovered", "Observed incidence", "Infectious"
      )) +
      theme_minimal()

    # plot the data
    fitted_cumulative_incidence %>%
      ggplot(aes(x = Date)) +
      geom_line(aes(y = I, colour = "red")) +
      geom_line(aes(y = S, colour = "black")) +
      geom_line(aes(y = R, colour = "green")) +
      geom_point(aes(y = cumulative_incident_cases, colour = "blue")) +
      scale_y_log10(labels = scales::comma) +
      labs(
        y = "Persons",
        title = "COVID-19 fitted vs observed cumulative incidence, Belgium"
      ) +
      scale_colour_manual(
        name = "",
        values = c(red = "red", black = "black", green = "green", blue = "blue"),
        labels = c("Susceptible", "Observed incidence", "Recovered", "Infectious")
      ) +
      theme_minimal()

    fit <- fitted_cumulative_incidence

    # peak of pandemic
    fit[fit$I == max(fit$I), c("Date", "I")]

    # severe cases
    max_infected <- max(fit$I)
    max_infected / 5

    # cases with need for intensive care
    max_infected * 0.06

    # deaths with supposed 0.7% fatality rate
    max_infected * 0.007

Comment written by Antoine Soetewey on April 16, 2020 09:08:47: Dear José, To update the dataset, you need to use the update_datasets() function or reinstall the {coronavirus} package with devtools::install_github("RamiKrispin/coronavirus").  After doing this, check that you have the latest available data before running your code. Hope this helps. Regards, Antoine
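
One way to check that the data are current before refitting is to look at the most recent date in the dataset. The sketch below is only an illustration: it uses a small made-up data frame standing in for the `coronavirus` dataset, and the helper `latest_date()` is hypothetical (only the `date` column name comes from the code in this thread).

```r
# Hypothetical helper: most recent date in a data frame with a `date` column
latest_date <- function(df) {
  max(as.Date(df$date))
}

# Toy stand-in for the real dataset (values are illustrative only)
toy <- data.frame(
  date = as.Date(c("2020-04-10", "2020-04-12", "2020-04-11")),
  cases = c(1, 3, 2)
)

latest <- latest_date(toy)
latest

# With the package installed, the same check would be:
# library(coronavirus)
# data(coronavirus)
# latest_date(coronavirus)
```

If the date returned is older than expected, the dataset has probably not been refreshed yet.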

Comment written by José Moniz Fernandes on April 17, 2020 00:59:20: Thanks.

Comment written by José Moniz Fernandes on April 17, 2020 01:12:08: Why # severe cases  max_infected <- max(fit$I)  max_infected / 5 or why five (5)?

Comment written by José Moniz Fernandes on April 17, 2020 01:14:21: Why   # cases with need for intensive care  max_infected * 0.06  ??

Comment written by José Moniz Fernandes on April 17, 2020 01:14:58: Why?    # deaths with supposed 0.7% fatality rate    max_infected * 0.007 ????  Where do have this number for my country?

Comment written by Antoine Soetewey on April 17, 2020 07:26:25:

I'll reply to all of your comments in one comment:

As written in the post, the 0.7% fatality rate comes from this source. For your country, simply take the same figure or do some research to estimate the fatality rate in your country.

Regarding severe cases: dividing by 5 is the same as multiplying by 0.2, and 20% means that 1 out of every 5 cases is considered a severe case. The 6% and 20% of infected cases for intensive care and severe cases (respectively) are estimates. As I already said in another comment, they may be different in your country. For more information on where I found these estimates, see the end of this article where the author applies the analyses to Germany.

As for fatality rate, if you believe that these estimates are wrong (and it can be the case for countries other than Belgium), do some research and try to find better estimates for your own country.

PS: for your information, typing "????" does not add any value compared to "?", except that it looks aggressive.

Regards, Antoine
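
The three proportions discussed in this reply can be kept as named parameters so that they are easy to swap for country-specific estimates. This is a minimal sketch: `max_infected` is set to a made-up value here (in the thread it is `max(fit$I)`), and the variable names `severe_rate`, `icu_rate` and `fatality_rate` are hypothetical.

```r
# Illustrative peak value; in the thread this comes from max(fit$I)
max_infected <- 10000

# Assumed proportions from the reply above; replace with local estimates
severe_rate   <- 0.20   # 1 in 5 cases, same as dividing by 5
icu_rate      <- 0.06   # cases needing intensive care
fatality_rate <- 0.007  # 0.7% fatality rate

estimates <- c(
  severe = max_infected * severe_rate,
  intensive_care = max_infected * icu_rate,
  deaths = max_infected * fatality_rate
)
estimates
```

Changing one rate (for example the fatality rate) then updates the corresponding estimate without touching the rest of the code.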

Comment written by José Moniz Fernandes on April 17, 2020 14:27:12:

I understood.

In Cabo Verde is   # deaths with supposed 1.8% fatality rate  max_infected * 0.018.

Thanks. 

How do I control the first date shown in the first plot (summary of cases)? For example, the first confirmed case was on 20/03/20, so I don't need the data before March 15.

AntoineSoetewey commented 3 years ago

Comment written by Mohammad Abdullah on April 16, 2020 14:36:56: Thank you Antoine for this remarkable work, I congratulate you on it. I have some questions:
Q.1 max_infected / 5: what exactly does the 5 represent? Is it 5 months?
Q.2 # cases with need for intensive care  max_infected * 0.06: what does the number 0.06 represent?
Q.3 How can we calculate the fatality rate?

Comment written by Antoine Soetewey on April 16, 2020 14:48:13: Thanks for your feedback Mohammad.
Q.1 One out of every five cases is considered severe (it is equivalent to max_infected * 0.2).
Q.2 6% of the cases are expected to need intensive care.
Q.3 Again, we use a 0.7% fatality rate, so we multiply by 0.007.
These 3 figures were chosen to be in line with other resources I have found online. However, they may be different in your country (and may even vary for Belgium in the future), so if you believe that they are under- or overestimated you can choose your own. Hope this helps. Regards, Antoine

Comment written by Mohammad Abdullah on April 16, 2020 18:30:50: Hi again Antoine, thank you for answering my first questions. Please look at the data for my country (Saudi Arabia): https://drive.google.com/drive/folders/16ZZIKnZ71Jsk2vloxSbRHg24l3vw_mio?usp=sharing  I have the following problems:

  1. The dates are not sorted correctly, so after the accumulation step the cumulative values end up on the wrong dates. For example, the final cumulative number of confirmed cases (4462) should fall on 12-April-2020, but it is written on 31-March-2020.
  2. I have a problem when I convert the date to the ymd format, so Infected is always empty (printing Infected gives numeric(0)).

The code I wrote is:

    DataSa<-as.data.frame(Saudi_Arabia)
    #OR
    DataSa<-as.data.frame(Saudi_Arabia_txt)
    #DataSa$date<-as.Date(as.character(DataSa$date), format = "%d")
    #df<-as.data.frame(DataSa)
    #devtools::install_github("RamiKrispin/coronavirus")
    #library(coronavirus)
    #data(coronavirus)
    #data(DataSa)
    # extract the cumulative incidence
    df1 <- DataSa %>%
      dplyr::filter(Province.State == "Saudi Arabia") %>%
      dplyr::group_by(date, type) %>%
      dplyr::summarise(total = sum(cases, na.rm = TRUE)) %>%
      tidyr::pivot_wider(
        names_from = type,
        values_from = total
      ) %>%
      dplyr::arrange(date) %>%
      dplyr::ungroup() %>%
      dplyr::mutate(active = confirmed - death - recovered) %>%
      dplyr::mutate(
        confirmed_cum = cumsum(confirmed),
        death_cum = cumsum(death),
        recovered_cum = cumsum(recovered),
        active_cum = cumsum(active)
      )
    # put the daily cumulative incidence numbers for Belgium from
    library(lubridate)
    Infected <- subset(df1, date >= ymd("2020-03-02") & date <= ymd("2020-04-06"))$active_cum
    #OR
    #Infected <- subset(df1, date >= dym("02-03-2020") & date <= dym("06-04-2020"))$active_cum

Please advise.
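
A possible cause of the sorting symptom described above is that the date column was read as text in day-month-year order, in which case sorting compares characters instead of calendar dates. The sketch below illustrates this with made-up strings; the actual format in the shared file is an assumption, and base `as.Date()` is used here in place of the lubridate parsers.

```r
# Dates stored as day-month-year text (illustrative values only)
date_txt <- c("31-03-2020", "12-04-2020", "01-04-2020")

# Sorting text compares characters, so "31-03-2020" ends up last
sorted_txt <- sort(date_txt)
sorted_txt

# Parse with an explicit format, then sort as real dates
date_parsed <- as.Date(date_txt, format = "%d-%m-%Y")
sorted_dates <- sort(date_parsed)
sorted_dates

# Date comparisons used for subsetting now behave as expected
in_window <- date_parsed >= as.Date("2020-03-02") &
  date_parsed <= as.Date("2020-04-06")
in_window
```

Once the column holds real dates, `dplyr::arrange(date)` and the `subset()` call should order and filter by calendar date. If the file spells out month names (as in 12-April-2020), the format string would need `%B`, which depends on the locale.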

Comment written by Mohammad Abdullah on April 16, 2020 18:33:39: Hi Antoine, here is the link to my CSV and TXT files: https://drive.google.com/open?id=16ZZIKnZ71Jsk2vloxSbRHg24l3vw_mio

Comment written by Mohammad Abdullah on April 17, 2020 12:01:57:

Hi Antoine, I just sent you a message; please reply to my email. Thank you.

Comment written by Mohammad Abdullah on April 17, 2020 15:45:31:

Hi Antonie 

I got the following error:

    > fitted_cumulative_incidence <- fitted_cumulative_incidence %>%
    +   mutate(
    +     Date = ymd(sir_start_date) + days(t - 1),
    +     Country = "Saudi Arabia",
    +     cumulative_incident_cases = Infected
    +   )
    Error: Column cumulative_incident_cases must be length 27 (the number of rows) or one, not 56

Please advise.

Comment written by Antoine Soetewey on April 17, 2020 15:56:42:

The error is due to the fact that the length of the Infected vector (56) is not equal to the number of rows of the data frame fitted_cumulative_incidence (27), so it cannot be added as the column cumulative_incident_cases.
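
The length mismatch can be avoided by deriving the time vector from the observations, or by padding the observations with NA when predicting further ahead (as the long-range part of the code quoted in this thread does). A minimal sketch with made-up numbers; `t_fit`, `t_pred` and `padded` are hypothetical names, and the data frames stand in for the `ode()` output.

```r
# Made-up observed values standing in for the Infected vector
Infected <- c(1, 3, 6, 10, 15)

# Fitting period: build the time vector from the data, so lengths match
t_fit <- seq_along(Infected)
fitted <- data.frame(time = t_fit, cumulative_incident_cases = Infected)

# Prediction period longer than the data: pad observations with NA
t_pred <- 1:8
padded <- c(Infected, rep(NA, length(t_pred) - length(Infected)))
predicted <- data.frame(time = t_pred, cumulative_incident_cases = padded)

nrow(fitted)
nrow(predicted)
```

In the original code this amounts to making sure that the date window used to build Infected and the window implied by sir_start_date and t cover the same number of days.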

Comment written by Antoine Soetewey on April 17, 2020 16:00:14:

You can exclude the data before March 15 with date >= ymd("2020-03-15") and edit the sir_start_date.
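
The date filter mentioned above can be sketched as follows. The data frame is a toy example with made-up values, and base `as.Date()` is used in place of `ymd()` so that the snippet runs without extra packages; the column name `active_cum` follows the code in this thread.

```r
# Toy daily data (illustrative values only)
df <- data.frame(
  date = seq(as.Date("2020-03-10"), as.Date("2020-03-20"), by = "day"),
  active_cum = 0:10
)

# Use the same start date for the filter and for the SIR model
sir_start_date <- "2020-03-15"

kept <- subset(df, date >= as.Date(sir_start_date))
Infected <- kept$active_cum

range(kept$date)
Infected
```

Keeping the filter and sir_start_date tied to one value also keeps the fitted dates aligned with the observations.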

Comment written by Mohammad Abdullah on April 17, 2020 16:37:33:

OK, now it runs without any errors, but the results do not look logical. I sent them by email; kindly take a look and give me your recommendations.

Comment written by Mohammad Abdullah on April 19, 2020 11:57:24:

Hi Antonie

How can we get and Calculate the residuals from your code?
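
The RSS function quoted earlier in this thread already forms the residuals internally as `Infected - fit` before squaring and summing them. A minimal sketch of extracting them, using made-up vectors in place of the real observations and of the fitted column `out[, 3]`; the names `observed`, `fitted_I` and `residuals_sir` are hypothetical.

```r
# Made-up observed counts and fitted values (stand-ins for Infected and out[, 3])
observed <- c(10, 14, 21, 30)
fitted_I <- c(9, 15, 20, 33)

# Residuals: observed minus fitted, one per day
residuals_sir <- observed - fitted_I
residuals_sir

# Residual sum of squares, the quantity minimised by optim() in the thread
rss <- sum(residuals_sir^2)
rss
```

With the real model, `fitted_I` would be the I column of the `ode()` output evaluated at the fitted beta and gamma.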

AntoineSoetewey commented 3 years ago

Comment written by ibrahim mohamed on April 20, 2020 21:19:01:

Hi,

How can I calculate beta and gamma?

Do the values always sum to 1, i.e. beta + gamma = 1?

Comment written by ibrahim mohamed on April 20, 2020 21:26:34:

In my case, the country is the Kingdom of Saudi Arabia.

N = total population, and I want to calculate beta and gamma.
Beta is the infection rate, so beta = total confirmed cases / N,
and gamma is the recovery rate, so gamma = total recoveries / total confirmed cases.
I am using Python in my study.

Best regards.

Comment written by Antoine Soetewey on April 20, 2020 21:53:01:

Hi Ibrahim,

No, beta + gamma is not always equal to 1. In the literature, it has been estimated that beta and gamma are close to 0.54 and 0.2, respectively.

I am not familiar with Python so I cannot help you to compute beta and gamma with this programming language. You can adapt my R code and follow the same process, or here is a blog post that may be helpful: https://www.lewuathe.com/covid-19-dynamics-with-sir-model.html.

Regards, Antoine
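
To illustrate with the values quoted above: in the SIR model, beta and gamma are two separate rates, so nothing forces them to sum to 1. A small sketch, where the variable names are only illustrative:

```r
beta <- 0.54   # infection rate quoted above
gamma <- 0.2   # recovery rate quoted above

beta + gamma   # 0.74, not 1

# In the SIR model, the basic reproduction number is the ratio of the two rates
R0 <- beta / gamma
R0             # 2.7
```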

AntoineSoetewey commented 3 years ago

Comment written by Julian on April 26, 2020 14:09:12:

Hi Antoine

I'm working on an assignment where I need to model the corona data for the month of March using Excel.

My question is about calculating the risk of infection. I interpreted this as the R value on the day when the number of cases was lower than one (the end of the pandemic). Or is it equal to 1-(1/R0), as stated in your article?

best regards

Comment written by Antoine Soetewey on April 27, 2020 06:40:01:

Dear Julian,

1-(1/R0) is not the risk of infection. See the last 2 paragraphs of this section where I included a reference about this quantity known as the herd immunity threshold. Regarding the risk of infection, I believe that it's simply the probability of being infected. Therefore, it's the number of infected people / total population.

Hope this helps. 

Regards, Antoine
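
A short sketch of the two quantities distinguished above. The value of `R0` is illustrative (0.54 / 0.2, from the beta and gamma quoted earlier in this thread), and the counts `infected` and `population` are hypothetical numbers chosen only to show the arithmetic.

```r
# Herd immunity threshold: 1 - 1/R0
R0 <- 2.7                              # illustrative value, not an estimate
herd_immunity_threshold <- 1 - 1 / R0
round(herd_immunity_threshold, 3)      # about 0.63

# Risk of infection as described above: infected people / total population
infected <- 50000                      # hypothetical count
population <- 11500000                 # hypothetical population size
risk_of_infection <- infected / population
```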

AntoineSoetewey commented 3 years ago

Comment written by Antoine Soetewey on April 20, 2020 21:51:41:

UPDATE: I have changed the fatality rate to 4.5%, which is more in line with reality.

AntoineSoetewey commented 3 years ago

Comment written by Richard Howin on April 30, 2020 14:27:40:

I'm just curious. I've implemented a SIR model like this one for my assignment, and I have 2 problems:

  1. When do we use cumulative cases and when do we use per-day cases for our data? (I've seen a few journal articles, and some of them use cases per day. It confuses me, actually.)
  2. For the population size, I don't know whether it's a limitation of the R function or not, but in https://www.mathworks.com/matlabcentral/fileexchange/74658-fitviruscovid19 (using MATLAB), the code suggests making some initial guesses about the population size, and the author says to keep in mind that we don't use the population of the country, but a fitted population size.

Could you maybe give me some explanations? (Anyone can give me their opinion. It's okay. Hehe.)

AntoineSoetewey commented 3 years ago

Comment written by Elijah K. Samuel on April 30, 2020 15:08:19:

I found your R code quite helpful. However, I am trying to model the Kenyan data but keep getting this message: Error: object 'confirmed' not found

Comment written by Richard Howin on April 30, 2020 16:15:02:

I think you should get your Kenyan data yourself. Maybe the dataset does not include Kenyan COVID-19 data.

Comment written by Antoine Soetewey on April 30, 2020 16:18:32:

Dear Richard, 

I just checked and there is data for Kenya. See with df <- subset(coronavirus, Country.Region == "Kenya")

Regards, Antoine

Comment written by Antoine Soetewey on April 30, 2020 16:22:05:

Dear Elijah,

Did you properly install and load the {coronavirus} package? You can do so with:

devtools::install_github("RamiKrispin/coronavirus")
library(coronavirus)

After this, check the data with:

data(coronavirus)
View(coronavirus)

Hope this helps.

Regards, Antoine

Comment written by Richard Howin on April 30, 2020 16:31:38:

Ah, okay. Thx for the reply.

Comment written by Antoine Soetewey on April 30, 2020 16:48:53:

Hello again,

  1. Infected people as defined by the SIR model are the people who are currently infected. It is therefore the cumulative infected minus the removed, i.e. recovered or dead. 
  2. This analysis is partially based on this one and this one. Both authors use the population size for N. This is the reason I used the size of the Belgian population for N too. However, I am not an expert, and the author of the source you mention probably has good reasons to use something other than the population size for N, for instance if he doesn't consider the whole population to be susceptible to the disease.

Hope this helps.

Regards, Antoine
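
The first point can be sketched with a few hypothetical numbers. The vectors below are made up for illustration and are not taken from any dataset; the last line shows how per-day cases relate to the cumulative series.

```r
# Hypothetical cumulative counts for five consecutive days
cumulative_confirmed <- c(100, 150, 220, 300, 390)
cumulative_recovered <- c(5, 12, 30, 55, 90)
cumulative_deaths    <- c(1, 2, 4, 7, 10)

# Removed = recovered or dead; currently infected = cumulative minus removed
removed <- cumulative_recovered + cumulative_deaths
currently_infected <- cumulative_confirmed - removed
currently_infected  # 94 136 186 238 290

# Per-day cases are the day-to-day differences of the cumulative series
daily_new_cases <- diff(c(0, cumulative_confirmed))
```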

Comment written by Richard Howin on April 30, 2020 16:57:40:

  1. Ah, I see. That makes it clear now.
  2. Hmmm... I agree with you. Maybe it's just a different approach for estimating the COVID-19 cases.

Thanks for replying! I hope you and your family are well! God bless.

AntoineSoetewey commented 3 years ago

Comment written by Oka Briantiko on May 02, 2020 18:21:30:

Thank you, sir. I have a question.

When I applied your code to Indonesian data and ran the optim function, the message was ERROR: ABNORMAL_TERMINATION_IN_LNSRCH. Would you help me solve this? Thank you, sir.

Comment written by Elijah K. Samuel on May 02, 2020 18:55:39:

It has worked perfectly, and thanks

Comment written by Antoine Soetewey on May 02, 2020 19:09:22:

Dear Oka,

You can try with different initial values (so other than 0.5), and other upper constraints. If it still doesn't work, try another method than L-BFGS-B.

Hope this helps. 

Best, 
Antoine
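
A rough sketch of the first suggestion, trying several starting values and keeping the best fit. Everything here is an assumption made for illustration: `simulate_infected()` is a toy daily-step SIR simulation written in base R (not the code from the article), the "observed" data are synthetic, and the bounds `c(0, 0)` and `c(1, 1)` are placeholders that can be changed as well.

```r
# Toy discrete-time SIR simulation (daily steps), used only to build an example
simulate_infected <- function(beta, gamma, n_days, N, I0) {
  S <- N - I0
  I <- I0
  infected <- numeric(n_days)
  for (day in seq_len(n_days)) {
    infected[day] <- I
    new_infections <- beta * S * I / N
    new_removals <- gamma * I
    S <- S - new_infections
    I <- I + new_infections - new_removals
  }
  infected
}

# Synthetic "observed" data generated with known parameters
N <- 1e6
observed <- simulate_infected(beta = 0.5, gamma = 0.2, n_days = 30, N = N, I0 = 10)

# Residual sum of squares between observed and simulated infected counts
rss <- function(par) {
  fitted <- simulate_infected(par[1], par[2], length(observed), N, observed[1])
  sum((observed - fitted)^2)
}

# Try several starting points; skip runs that stop with an error
starts <- list(c(0.5, 0.5), c(0.3, 0.1), c(0.8, 0.3), c(0.2, 0.05))
fits <- lapply(starts, function(start) {
  tryCatch(
    optim(start, rss, method = "L-BFGS-B", lower = c(0, 0), upper = c(1, 1)),
    error = function(e) NULL
  )
})
fits <- Filter(Negate(is.null), fits)

# Keep the fit with the smallest RSS
best <- fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]
best$par
best$value
```

If the starting points disagree strongly on the estimates, that is a hint that the fit is not stable for the data at hand.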

Comment written by Elijah K. Samuel on May 02, 2020 20:19:54:

But beta is being estimated as 1.0000 and gamma as 0.8719058, which seems strange?

Comment written by Antoine Soetewey on May 02, 2020 20:27:50:

The fitting process may not be stable. See this post for another potential solution: http://blog.ephorie.de/contagiousness-of-covid-19-part-i-improvements-of-mathematical-fitting-guest-post

AntoineSoetewey commented 3 years ago

Comment written by Lucas Teixeira Arajo on May 07, 2020 16:47:02:

Thank you very much for providing this code. I'm applying it to Espirito Santo State in Brazil. I want to fit the model to the theoretical argument that an R0 <= 1 represents a better pandemic scenario. I ran some tests on South Korea, plotting a smooth decline of the infectious curve from May 03 until it hits zero, and the R0 converges to something near 1.1. I used the period from the first day to the last day of infectious observations and maintained the same structure of the code. Can you help me with this problem?

Thank you.

Comment written by Antoine Soetewey on May 08, 2020 16:11:08:

Dear Lucas,

Just to make sure I understand correctly: you only use data after May 3rd, and although the cases decrease you find an R0 of 1.1?

If that's the case, are you sure the fitting process converged? If not, try again by changing the initial values for beta and gamma and see if it converges. You can also change the constraints or the method in the optim() function.

Hope this helps.

Best, 
Antoine
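
A small sketch of how the convergence check and a change of method might look. The objective `rss()` below is a hypothetical stand-in with a known minimum, not the RSS function of the SIR fit, and the starting values and bounds are placeholders.

```r
# Hypothetical objective standing in for the RSS function of the SIR fit;
# its minimum is at beta = 0.5, gamma = 0.2 by construction
rss <- function(par) {
  (par[1] - 0.5)^2 + (par[2] - 0.2)^2
}

fit <- optim(c(0.9, 0.9), rss, method = "L-BFGS-B",
             lower = c(0, 0), upper = c(1, 1))

# optim() returns 0 in $convergence on success; anything else deserves a look
if (fit$convergence != 0) {
  message("L-BFGS-B did not converge: ", fit$message)
  # Fall back to another method (Nelder-Mead does not support box constraints)
  fit <- optim(c(0.9, 0.9), rss, method = "Nelder-Mead")
}

estimates <- setNames(fit$par, c("beta", "gamma"))
R0 <- unname(estimates["beta"] / estimates["gamma"])
```

Only an R0 computed from a fit that reports convergence is worth interpreting; otherwise the value may simply reflect where the optimizer stopped.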

AntoineSoetewey commented 3 years ago

Comment written by Aniqa on May 09, 2020 09:01:38:

Dear Antoine,

I hope this message finds you well. I am from Bangladesh and was trying to do a coronavirus infection projection for my country using the data we have. I landed on your blog. Thanks, it was really helpful.

I have questions regarding your code. While using the optim function, I found different values for beta and gamma. I think these estimates are not right. They seem to be local rather than global optima. When I used initial values 0.5, 0.5, it did not even converge. Then I tried different seeds and found different estimates. Could you please help? What did I do wrong?

Thanks in advance for your help.

Best, 
Aniqa

AntoineSoetewey commented 3 years ago

Comment written by Jean claude GBADA on May 10, 2020 15:49:58:

Hello Sir,  

I have a problem. The package "deSolve" is not available for my version of R, 3.6.3. Is there any solution, or should I download a newer version of R?

Thanks

Comment written by Antoine Soetewey on May 10, 2020 15:56:13:

Dear Jean Claude, 

Did you try after updating R? 

Best, 
Antoine

AntoineSoetewey commented 3 years ago

Comment written by Ughtie Valdez on May 10, 2020 16:00:00:

Hello Sir,

First of all, I want to thank you for making the code available! HERO

Next, the question: how do I validate my model, or the parameters beta and gamma? What kind of statistical test is available for SIR that is easy to conduct?

I have my presentation tomorrow