ben-domingue / irw

Code related to data for the Item Response Warehouse
https://datapages.github.io/irw/

Kretzschmar_2017_PTAM #458

Closed ben-domingue closed 1 month ago

ben-domingue commented 1 month ago

License: CC-By Attribution 4.0 International

Description: Kretzschmar, A., Hacatrjana, L., & Rascevska, M. (2017). Re-evaluating the psychometric properties of MicroFIN: A multidimensional measurement of complex problem solving or a unidimensional reasoning test? Psychological Test and Assessment Modeling, 59(2), 157–182.

https://osf.io/wp3z4/

(Edit by Arthur) Paper: https://www.researchgate.net/publication/318014800_Re-evaluating_the_psychometric_properties_of_MicroFIN_A_multidimensional_measurement_of_complex_problem_solving_or_a_unidimensional_reasoning_test

KingArthur0205 commented 1 month ago

This paper has 1 dataset with merged responses from 3 different questionnaires and 362 participants. Not all of the participants were involved in every test.

Each item was scored dichotomously (0 = wrong, 1 = correct), with partial credit given for a close response. Note: There are a large number of NA responses in the original dataset because some participants didn't attend all sub-tests within a questionnaire.

  1. MicroFIN: Includes an initial state task, a knowledge acquisition task, and a control task, each of which has 6 items. The aggregate scores are given in finX, where X is 2 to 7.

Note: Participant 222 didn't attend wave 1 but provided responses for wave 2. This participant has been manually removed just to be safe.

  2. Raven's Matrix Task: 20 items. 123 participants provided all NA responses.
  3. Reasoning: Analogies Task (20 items) and Numerical Task (16 items). 128 participants provided all NA responses.
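
The all-NA counts above could be checked with something like the sketch below. It runs on a made-up toy data frame, not the real file; `toy`, `rav1`, and `rav2` are hypothetical stand-ins for the wide-format data and its item columns.

```r
# Toy wide-format data (hypothetical values, not from the real dataset)
toy <- data.frame(
  id   = 1:4,
  rav1 = c(1, NA, 0, NA),
  rav2 = c(0, NA, NA, NA)
)

# Item columns are everything except the participant id
item_cols <- setdiff(names(toy), "id")

# TRUE for participants with no observed response in any item column
all_na <- rowSums(!is.na(toy[, item_cols, drop = FALSE])) == 0

sum(all_na)     # number of participants with only NA responses
toy$id[all_na]  # ids of those participants
```

On the real data, `item_cols` would presumably be limited to one sub-test at a time (for example the columns starting with "rav"), since the counts are reported per questionnaire.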
KingArthur0205 commented 1 month ago

Zipped Version(CSV and Rdata): PTAM_Kretzschmar_2017_Rdata.zip PTAM_Kretzschmar_2017_CSV.zip

Individual Datasets: PTAM_Kretzschmar_2017_MicroFIN.csv PTAM_Kretzschmar_2017_Raven.csv PTAM_Kretzschmar_2017_Reasoning.csv

Code:

# Paper: https://www.researchgate.net/publication/318014800_Re-evaluating_the_psychometric_properties_of_MicroFIN_A_multidimensional_measurement_of_complex_problem_solving_or_a_unidimensional_reasoning_test
# Data: https://osf.io/wp3z4/
library(haven)
library(dplyr)
library(tidyr)

# Remove participants whose responses are all NA (the id column is ignored)
remove_na <- function(df) {
  resp_cols <- setdiff(names(df), "id")
  # Keep rows with at least one non-missing response
  df[rowSums(!is.na(df[, resp_cols, drop = FALSE])) > 0, ]
}

df <- read.csv("data_microfin_final.csv")
df <- df |>
  select(-age, -gender, -fin, -fin_re, -reas.fig, -reas.verb, 
         -reas.num, -starts_with("reas.tot"))

# ------ Process MicroFIN Dataset ------
microfin_df <- df |>
  select(id, starts_with("zi"), starts_with("iz"), 
         starts_with("know"), starts_with("c"))
microfin_df <- microfin_df |>
  filter(id != 222)
microfin_df <- remove_na(microfin_df)

microfin_test_df <- microfin_df |>
  select(-ends_with("re"))
microfin_test_df <- remove_na(microfin_test_df)

microfin_test_df <- pivot_longer(microfin_test_df, cols=-id, names_to="item", values_to="resp")
microfin_test_df$wave <- 0

microfin_retest_df <- microfin_df |>
  select(id, ends_with("re"))
microfin_retest_df <- remove_na(microfin_retest_df)
colnames(microfin_retest_df) <- gsub('_re', '', colnames(microfin_retest_df))
microfin_retest_df <- pivot_longer(microfin_retest_df, cols=-id, names_to="item", values_to="resp")
microfin_retest_df$wave <- 1

microfin_df <- rbind(microfin_retest_df, microfin_test_df)
save(microfin_df, file="PTAM_Kretzschmar_2017_MicroFIN.Rdata")
write.csv(microfin_df, "PTAM_Kretzschmar_2017_MicroFIN.csv", row.names=FALSE)
# ------ Process Raven's Matrix Task Dataset ------
raven_df <- df |>
  select(id, starts_with("rav"))
raven_df <- remove_na(raven_df)
raven_df <- pivot_longer(raven_df, cols=-id, names_to="item", values_to = "resp")

save(raven_df, file="PTAM_Kretzschmar_2017_Raven.Rdata")
write.csv(raven_df, "PTAM_Kretzschmar_2017_Raven.csv", row.names=FALSE)
# ------ Process Reasoning Dataset ------
res_df <- df |>
  select(id, starts_with("qr"), starts_with("verb"))
res_df <- remove_na(res_df)
res_df <- pivot_longer(res_df, cols=-id, names_to="item", values_to = "resp")

save(res_df, file="PTAM_Kretzschmar_2017_Reasoning.Rdata")
write.csv(res_df, "PTAM_Kretzschmar_2017_Reasoning.csv", row.names=FALSE)
KingArthur0205 commented 1 month ago

PR for this issue: https://github.com/ben-domingue/irw/pull/467/files

ben-domingue commented 1 month ago

@KingArthur0205 for the microfin data, i'm concerned about these values:

   0 0.25  0.5 0.75    1 
2045  832 1988  643 1396 

do they comment on these? i'm looking at the paper and not seeing anything. are these repeated trials? i think we should have a sidebar discussion about how to deal with fractional values in cases such as this as it has come up a few times. i'm worried that leaving them as fractional could sometimes be a problem for software. in the last case the imputation was clearly a problem. here there are way more so it is less clear.
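
A check like the one above can be reproduced roughly as follows. This is only a sketch on toy long-format data; `toy_long` and its values are made up, with column names following the `id`/`item`/`resp` layout used in the code earlier in this thread.

```r
# Toy long-format data (hypothetical values)
toy_long <- data.frame(
  id   = c(1, 1, 2, 2, 3),
  item = c("a", "b", "a", "b", "a"),
  resp = c(0, 0.25, 1, 0.5, NA)
)

# Frequency of each response value, including missing responses
counts <- table(toy_long$resp, useNA = "ifany")
print(counts)

# Flag observed responses that are not whole numbers
non_integer <- !is.na(toy_long$resp) & toy_long$resp %% 1 != 0
sum(non_integer)
```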

KingArthur0205 commented 1 month ago

  1. They conducted an additional trial with 39 participants.
  2. They use fractional numbers to measure how close the responses are to the ground truth values. For example, as shown in the image below, a higher fraction is awarded if the selected answer is closer to the actual value.
[Image: screenshot 2024-09-22 15 50 25]

Should we just cut the fractional scores down to 0, or rescale them to integers (0 to 4)?

ben-domingue commented 1 month ago

yeah here i think we should rescale to 0-4. great find. i was looking in the paper but didn't catch this nuance.

KingArthur0205 commented 1 month ago

Will do :)

KingArthur0205 commented 1 month ago

New dataset for MicroFin. Now the scale is from 0 to 4: PTAM_Kretzschmar_2017_MicroFIN.csv

I'll update the code if you think this looks good @ben-domingue ;)
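
For reference, the rescaling could look something like the sketch below. This is not the code from the PR, just a minimal illustration on toy data; `toy_long` is a hypothetical stand-in for the long-format MicroFIN data, and it assumes the only fractional values are 0.25, 0.5, and 0.75.

```r
# Toy long-format data (hypothetical values)
toy_long <- data.frame(
  id   = rep(1, 6),
  item = paste0("item", 1:6),
  resp = c(0, 0.25, 0.5, 0.75, 1, NA)
)

# Map 0, 0.25, 0.5, 0.75, 1 onto 0, 1, 2, 3, 4.
# round() guards against floating-point noise before the integer conversion;
# NA responses stay NA.
toy_long$resp <- as.integer(round(toy_long$resp * 4))

# Sanity check: only 0..4 or NA should remain
stopifnot(all(toy_long$resp %in% c(0:4, NA)))
```

This would apply only to the MicroFIN items; the Raven and Reasoning responses are already 0/1.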

ben-domingue commented 1 month ago

sorry i'm realizing there is one more issue. what is this:

  id item resp
1  1 rav1    1
2  1 rav2    1
3  1 rav3    1
4  1 rav4    1
5  1 rav5    1
6  1 rav6    1

   0    1    2 
2431 2348    1 

the one response of 2 seems either wrong or unhelpful

KingArthur0205 commented 1 month ago

Sorry that was an oversight on my end. That 2 was in the original dataset and it's clearly a mistake. I will get rid of that participant. PTAM_Kretzschmar_2017_Raven.csv
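
Dropping that participant could be done along these lines. Again, this is a sketch on toy data rather than the actual updated script; `toy_raven` and its values are hypothetical, and it assumes valid Raven responses are 0, 1, or NA.

```r
# Toy long-format Raven data (hypothetical values)
toy_raven <- data.frame(
  id   = c(1, 1, 2, 2, 3, 3),
  item = rep(c("rav1", "rav2"), 3),
  resp = c(1, 0, 2, 1, 0, NA)
)

# Ids with at least one observed response outside {0, 1}
out_of_range <- !is.na(toy_raven$resp) & !(toy_raven$resp %in% c(0, 1))
bad_ids <- unique(toy_raven$id[out_of_range])

# Remove every row belonging to those participants
cleaned <- toy_raven[!(toy_raven$id %in% bad_ids), ]

table(cleaned$resp, useNA = "ifany")
```

An alternative would be to set only the single out-of-range response to NA and keep the participant's other responses.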

ben-domingue commented 1 month ago

@KingArthur0205 just wanting to make sure code gets updated here. see comment i had made. https://github.com/ben-domingue/irw/pull/467/commits/9dbadbfff1eab5a3576bd1543fe1e3488e485eaf