Closed by ben-domingue 3 months ago
Working on this. The VerbAgg2 dataset is already processed and included in the repository under the title "verbagg", while VerbAgg3 isn't. We probably need to add VerbAgg3 and rename "verbagg" to accommodate both datasets.
The paper also contains 4 additional simulated datasets (not included or processed, as discussed) :)
# https://www.jstatsoft.org/article/view/v048c01
library(tidyverse) # Attaches dplyr and tidyr, so separate library() calls are not needed
load("./Data/VerbAgg2.rda")
VerbAgg2_id <- 1:nrow(VerbAgg2)
VerbAgg2 <- cbind(VerbAgg2, id=I(VerbAgg2_id)) # Merge id column into the matrix
# Drop the person covariates; names capitalized to match the merged script further down this thread
VerbAgg2 <- VerbAgg2[, !colnames(VerbAgg2) %in% c("Anger", "Gender")]
VerbAgg2 <- as.data.frame(VerbAgg2)
VerbAgg2_long <- pivot_longer(VerbAgg2, cols=-id, names_to='item', values_to='resp') # Reshape VerbAgg2 data to long format
load("./Data/VerbAgg3.rda")
VerbAgg3_id <- 1:nrow(VerbAgg3)
VerbAgg3 <- cbind(VerbAgg3, id=I(VerbAgg3_id)) # Merge id column into the matrix
# Drop the person covariates, as for VerbAgg2
VerbAgg3 <- VerbAgg3[, !colnames(VerbAgg3) %in% c("Anger", "Gender")]
VerbAgg3 <- as.data.frame(VerbAgg3)
VerbAgg3_long <- pivot_longer(VerbAgg3, cols=-id, names_to='item', values_to='resp') # Reshape VerbAgg3 data to long format
save(VerbAgg2_long, file="itrees_VerbAgg2.Rdata")
save(VerbAgg3_long, file="itrees_VerbAgg3.Rdata")
write.csv(VerbAgg2_long, "VerbAgg2.csv", row.names = FALSE)
write.csv(VerbAgg3_long, "VerbAgg3.csv", row.names = FALSE)
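For anyone checking the reshape step, here is a minimal sketch on a toy matrix. The toy values and the item names `item1`/`item2` are made up for illustration; only the overall shape (item columns plus two covariate columns) is assumed to mirror the VerbAgg matrices.

```r
library(tidyr)

# Toy stand-in for a VerbAgg matrix: 3 persons, 2 items, 2 covariates (made-up values)
toy <- cbind(item1 = c(0, 1, 1), item2 = c(1, 0, 1),
             Anger = c(20, 17, 25), Gender = c(0, 1, 0))

# Drop the covariate columns and add a row-number person id
toy_df <- as.data.frame(toy[, !colnames(toy) %in% c("Anger", "Gender")])
toy_df$id <- seq_len(nrow(toy_df))

# One row per person-item pair, with columns id, item, resp
toy_long <- pivot_longer(toy_df, cols = -id, names_to = "item", values_to = "resp")
print(toy_long)
```

The long table should have one row per person-item pair (here 3 x 2 = 6 rows), which is a quick sanity check to run on the real output as well.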
The paper also contains 2 additional datasets:
load("./fsdatT.rda")
fsdatT <- fsdatT %>% select(-node, -sub)
fsdatT <- fsdatT %>% rename(resp=value, id=person)
fsdatT$id <- sub("^p", "", fsdatT$id) # Convert ids into integers
fsdatT$id <- as.integer(fsdatT$id)
save(fsdatT, file="itrees_fsdatT.Rdata")
write.csv(fsdatT, "fsdatT.csv", row.names=FALSE)
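A small sketch of the id conversion used above, run on hypothetical labels (the real `person` values are assumed to look like "p1", "p2", ...). The `stopifnot()` guard is an addition, not part of the script: it catches any label that does not reduce to a number once the leading "p" is removed.

```r
# Hypothetical person labels in the assumed "p<number>" format
ids <- c("p1", "p2", "p10")

# Strip the leading "p", then convert the remainder to integer
ids_int <- as.integer(sub("^p", "", ids))

# Fail loudly if any label did not convert cleanly
stopifnot(!anyNA(ids_int))
print(ids_int)
```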
so these are different observations of the same individual? if so, let's perhaps leave this one as-is for the moment. i need to make a guiding decision about this kind of use case in the coming weeks and think it might be best to return to it at that point.
Yes, these are observations of the same participants when they come back for therapies 0, 1, and 2. I will leave it as is for now.
I will also try to merge the scripts into a single file that includes the code above and post an update to this issue later, perhaps as a PR.
Thanks for the clarification :)
Here is the complete code file along with the processed datasets from the paper. I also have them in Rdata format, but GitHub won't allow me to upload those. Attached: fsdatT.csv, stressT.csv, VerbAgg2.csv, VerbAgg3.csv
# https://www.jstatsoft.org/article/view/v048c01
library(tidyverse) # Attaches dplyr and tidyr, so separate library() calls are not needed
load("./stressT.rda")
# The processed stressT is written to CSV once, together with the other outputs at the end of the script
stressT <- stressT |>
  select(-exo1, -exo2, -exo3, -exo4, -exo5) |> # Remove columns for decision-tree model
  rename(id=person,
         resp=value,
         item=crossitem)
load("./fsdatT.rda")
fsdatT <- fsdatT %>% select(-node, -sub)
fsdatT <- fsdatT %>% rename(resp=value, id=person)
fsdatT$id <- sub("^p", "", fsdatT$id) # Convert ids into integers
fsdatT$id <- as.integer(fsdatT$id)
load("./VerbAgg2.rda")
VerbAgg2_id <- 1:nrow(VerbAgg2)
VerbAgg2 <- cbind(VerbAgg2, id=I(VerbAgg2_id)) # Merge id column into the matrix
VerbAgg2 <- VerbAgg2[, !colnames(VerbAgg2) %in% c("Anger", "Gender")]
VerbAgg2 <- as.data.frame(VerbAgg2)
VerbAgg2_long <- pivot_longer(VerbAgg2, cols=-id, names_to='item', values_to='resp') # Reshape VerbAgg2 data to long format
load("./VerbAgg3.rda")
VerbAgg3_id <- 1:nrow(VerbAgg3)
VerbAgg3 <- cbind(VerbAgg3, id=I(VerbAgg3_id))
VerbAgg3 <- VerbAgg3[, !colnames(VerbAgg3) %in% c("Anger", "Gender")]
VerbAgg3 <- as.data.frame(VerbAgg3)
VerbAgg3_long <- pivot_longer(VerbAgg3, cols=-id, names_to='item', values_to='resp')
save(fsdatT, file="fsdatT.Rdata")
save(stressT, file="stressT.Rdata")
save(VerbAgg2_long, file="VerbAgg2.Rdata")
save(VerbAgg3_long, file="VerbAgg3.Rdata")
write.csv(fsdatT, "fsdatT.csv", row.names=FALSE)
write.csv(stressT, "stressT.csv", row.names=FALSE)
write.csv(VerbAgg2_long, "VerbAgg2.csv", row.names = FALSE)
write.csv(VerbAgg3_long, "VerbAgg3.csv", row.names = FALSE)
OK let me go through these separately: VerbAgg2.csv VerbAgg3.csv
fsdatT.csv
stressT.csv
Actually, we are fine here. Great work @KingArthur0205 !! See https://github.com/ben-domingue/irw/blob/main/data/IRTrees.R
Need to triple check that we don't have this data: https://www.jstatsoft.org/article/view/v048c01