Open bmkramer opened 8 years ago
Idea 1:
clustering of answers.
https://cran.r-project.org/web/views/Cluster.html
Probably need to recode into 1 and 0 for every category.
Name MENDREAD READCUBE
person 1 1 0 etc
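A minimal sketch of that recoding, assuming each respondent's answer is a list of the tools they ticked. The respondents and their answers are made up for illustration; only the column names MENDREAD and READCUBE come from the example above, and `recode_binary` is a hypothetical helper name.

```python
# Hypothetical survey answers: each respondent lists the tools they ticked.
answers = {
    "person 1": ["MENDREAD"],
    "person 2": ["MENDREAD", "READCUBE"],
    "person 3": [],
}


def recode_binary(answers):
    """Return (sorted tool names, {respondent: [0/1 per tool]})."""
    # Every tool mentioned by anyone becomes one 0/1 column.
    tools = sorted({tool for chosen in answers.values() for tool in chosen})
    table = {
        name: [1 if tool in chosen else 0 for tool in tools]
        for name, chosen in answers.items()
    }
    return tools, table


tools, table = recode_binary(answers)

# Print the recoded table in the same layout as the example above.
print("Name", *tools)
for name, row in table.items():
    print(name, *row)
```

A respondent who ticked nothing still gets a row of zeros, which matters later because some distance measures treat shared zeros differently from shared ones.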
We can then compute several distance measures on the recoded data and cluster on them.
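As a rough sketch of what "several distance measures" could look like for 0/1 data, here are the Jaccard distance (ignores tools neither person uses) and the simple matching distance (counts every disagreement), followed by a naive single-linkage clustering. The four respondents are invented, the function names are hypothetical, and single linkage is just one possible choice, not something decided in this thread. Note that the number of clusters `k` has to be given up front.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - (tools both use) / (tools at least one uses)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def matching_distance(a, b):
    """Share of tools on which the two respondents disagree."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def single_linkage(rows, distance, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[name] for name in rows]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: cluster distance is the closest pair of members.
            d = min(
                distance(rows[p], rows[q])
                for p in clusters[i]
                for q in clusters[j]
            )
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Hypothetical 0/1 rows, one column per tool.
rows = {
    "person 1": [1, 1, 0, 0],
    "person 2": [1, 1, 1, 0],
    "person 3": [0, 0, 1, 1],
    "person 4": [0, 0, 0, 1],
}

clusters = single_linkage(rows, jaccard_distance, k=2)
print(clusters)
```

Swapping `jaccard_distance` for `matching_distance` in the call is enough to see how sensitive the grouping is to the choice of measure.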
Idea 2: some form of classification or random forests
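If you deal with binary data (0 or 1), you will need to choose tools with some care, I think, for instance distance measures designed for binary variables. (Random forests themselves should accept binary predictors, as far as I know.)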
For clustering, you often need to set the number of clusters you want a priori.
Maybe the easiest is to decide which questions you want answered (the difference between humanities and sciences, for example), then run a PCA and look for a difference on PC1. Careful: you need to set your questions before you touch the data, or you will end up p-hacking!
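To make the PCA suggestion concrete, here is a small stdlib-only sketch: it finds PC1 by power iteration on the covariance matrix and compares the mean PC1 score of two groups. The six respondents and their group labels are invented, `first_principal_component` is a hypothetical helper, and in practice a library routine would be the more sensible choice. The grouping (humanities vs sciences) is fixed before looking at the scores, in line with the warning above.

```python
def first_principal_component(rows, iterations=200):
    """Return (PC1 direction, PC1 score per row) via power iteration."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix of the centered columns.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(m)]
        for a in range(m)
    ]
    # Deliberately asymmetric start vector, so it is unlikely to be
    # orthogonal to PC1 (an all-ones vector can be, for balanced data).
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # no variance at all, nothing to find
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centered]
    return v, scores


# Hypothetical 0/1 tool usage, one column per tool.
groups = ["humanities"] * 3 + ["sciences"] * 3
rows = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]

direction, scores = first_principal_component(rows)


def group_mean(label):
    values = [s for s, g in zip(scores, groups) if g == label]
    return sum(values) / len(values)


# The sign of PC1 is arbitrary, so only the size of the gap is meaningful.
gap = abs(group_mean("humanities") - group_mean("sciences"))
print(round(gap, 3))
```

Whether a gap of a given size is meaningful would still need a proper test that was chosen in advance; the sketch only shows where the number comes from.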
Compare tool usage (per research activity) for different user groups (discipline, research role, career length, country)
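One simple way to start on such a comparison is the share of respondents in each group who use each tool. The sketch below assumes records with a `discipline` field, a `role` field and a 0/1 `tools` mapping; these field names, the helper `usage_share_by_group` and all the data are hypothetical, not the survey's actual layout.

```python
from collections import defaultdict


def usage_share_by_group(records, group_key):
    """Return {group: {tool: share of that group's respondents using it}}."""
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        for tool, used in rec["tools"].items():
            counts[group][tool] += used  # used is 0 or 1
    return {
        group: {tool: counts[group][tool] / totals[group] for tool in counts[group]}
        for group in totals
    }


# Hypothetical respondents.
records = [
    {"discipline": "humanities", "role": "PhD student",
     "tools": {"MENDREAD": 1, "READCUBE": 0}},
    {"discipline": "humanities", "role": "professor",
     "tools": {"MENDREAD": 0, "READCUBE": 0}},
    {"discipline": "sciences", "role": "PhD student",
     "tools": {"MENDREAD": 1, "READCUBE": 1}},
    {"discipline": "sciences", "role": "postdoc",
     "tools": {"MENDREAD": 1, "READCUBE": 0}},
]

by_discipline = usage_share_by_group(records, "discipline")
for group, shares in by_discipline.items():
    print(group, shares)
```

Passing `"role"` instead of `"discipline"` gives the same table for research roles, so one helper covers each of the user groups listed above.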
relation with https://github.com/bmkramer/101innovations-survey-data/issues/3