Snowcatcat closed this issue 1 year ago
Hi,
I think there is no way to get the predicted values by test market, at least not in a way that gives a different prediction for each market. From what I see in the Augmented Synthetic Control paper (which is the basis for the library) and in the code, all locations in the test are bundled together and modelled as a single unit. As such, the model doesn't predict the values per market, but instead for the average (y_hat) or the total (data$t_obs = y_hat * n_countries, where n_countries is the number of treated locations).
What I would do to get the predicted values by market is run the function once per market, remembering to remove the other treated markets from the dataset each time. Below is an example based on the Walkthrough:
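To make the averaging point concrete, here is a minimal sketch with made-up numbers. It is not GeoLift output; the market values and variable names are assumptions for illustration only. It shows that scaling a pooled average back up gives the group total, but says nothing about how that total splits between markets.

```r
# Toy numbers, not GeoLift output: two hypothetical treated markets
# observed over three periods.
per_market <- data.frame(
  chicago  = c(100, 110, 120),
  portland = c(40, 50, 60)
)
n_markets <- ncol(per_market)

# A pooled model works with the per-period average across treated markets.
pooled_avg <- rowMeans(per_market)

# Multiplying the average by the number of markets recovers the group total,
# but the individual market series cannot be recovered from it.
pooled_total <- pooled_avg * n_markets

print(pooled_avg)
print(pooled_total)
```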
```r
library(GeoLift)
library(dplyr)  # for %>% and filter()

data(GeoLift_Test)
treated_locations <- c("chicago", "portland")
output <- NULL

# Fit one model per treated market, dropping the other treated markets
# from the data so they cannot be used as controls.
for (treated_location in treated_locations) {
  filtered_GeoLift_Test <- GeoLift_Test %>%
    filter((location == treated_location) | !(location %in% treated_locations))

  GeoTestData_Test <- GeoDataRead(data = filtered_GeoLift_Test,
                                  date_id = "date",
                                  location_id = "location",
                                  Y_id = "Y",
                                  X = c(), # empty list as we have no covariates
                                  format = "yyyy-mm-dd",
                                  summary = TRUE)

  GeoTest <- GeoLift(Y_id = "Y",
                     data = GeoTestData_Test,
                     locations = c(treated_location),
                     treatment_start_time = 91,
                     treatment_end_time = 105)

  # Collect one column of predictions per treated market.
  if (is.null(output)) {
    output <- GeoTest$y_hat
  } else {
    output <- cbind(output, GeoTest$y_hat)
  }
}

output <- as.data.frame(output)
names(output) <- treated_locations
# Sum of the per-market predictions.
output$total <- rowSums(output[treated_locations])

# For comparison: the grouped model fitted on all treated markets at once.
GeoTestData_Test <- GeoDataRead(data = GeoLift_Test,
                                date_id = "date",
                                location_id = "location",
                                Y_id = "Y",
                                X = c(), # empty list as we have no covariates
                                format = "yyyy-mm-dd",
                                summary = TRUE)

GeoTest <- GeoLift(Y_id = "Y",
                   data = GeoTestData_Test,
                   locations = treated_locations,
                   treatment_start_time = 91,
                   treatment_end_time = 105)

# y_hat is the average over the treated locations, so scale it back to a total.
output$grouped_result <- GeoTest$y_hat * length(treated_locations)
```
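The summed per-market predictions and the grouped prediction may not match exactly, since each run fits its own model. A rough way to see how far apart they are is sketched below. `compare_predictions` is a hypothetical helper written for this comment, not a GeoLift function, and the numbers in the example call are made up.

```r
# Hypothetical helper (not part of GeoLift): compare the summed per-market
# predictions with the grouped prediction, period by period.
compare_predictions <- function(per_market_total, grouped_total) {
  stopifnot(length(per_market_total) == length(grouped_total))
  diff <- per_market_total - grouped_total
  data.frame(
    time = seq_along(diff),
    per_market_total = per_market_total,
    grouped_total = grouped_total,
    abs_diff = abs(diff),
    pct_diff = 100 * diff / grouped_total
  )
}

# Toy values standing in for output$total and output$grouped_result.
toy <- compare_predictions(c(140, 160, 180), c(138, 163, 180))
print(toy)
```

With the data frame built above, the call would be `compare_predictions(output$total, output$grouped_result)`.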
In the walkthrough example, it seems that there are 2 ways to get the predicted values:
The first line of the code only shows the values for the first market. The second line of the code only shows the values for the entire test group as a whole.
Is there a way to get predicted values by test market?