aguestuser / social-network-analysis-sandbox

problem sets and toy projects for learning Social Network Analysis
GNU General Public License v3.0

example/mock data sets #1

Open kylenstone opened 6 years ago

kylenstone commented 6 years ago

@aguestuser It would be nice to add some simple data sets to this repo, so that the algorithms and methods explored here are easier to demonstrate and their results easier to confirm.

Let's use this thread to brainstorm what datasets could be included here. Since I am not a strong data analyst, I'm going to hunt for very simple datasets.

Data that is already in LittleSis and may require heavy analysis:

aguestuser commented 6 years ago

+1.

have a few ideas about "hidden data" to unearth, but first wanted to explicate the (somewhat cryptic) example that's already in the codebase...

it's the first step to solving the example problem at the end of chapter 4 in John Scott's Social Network Analysis, which is basically all about interlocks (a graph shape with which LS is already somewhat obsessed).

the example problem is all about taking an incidence matrix (ie: a table of people who work in various orgs) and turning it into two adjacency matrices (ie: people connected to people via orgs or orgs connected to orgs via people). interestingly enough, that's exactly what littlesis's interlocks tables show:

people connected to people: https://littlesis.org/person/1345-Lloyd_C_Blankfein/interlocks

orgs connected to orgs: https://littlesis.org/org/20-Goldman_Sachs/interlocks
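To make the incidence-to-adjacency step concrete, here is a minimal sketch in plain Python. The names and the toy data (`ann`, `org_x`, and so on) are hypothetical and not taken from this repo or from LittleSis. It assumes the incidence matrix has one row per person and one column per org; multiplying it by its own transpose gives the people-to-people matrix, and multiplying the transpose by the matrix gives the org-to-org matrix.

```python
# Toy incidence matrix (hypothetical data): rows are people, columns are orgs.
# incidence[i][j] == 1 means person i is affiliated with org j.
people = ["ann", "bob", "cat"]
orgs = ["org_x", "org_y"]
incidence = [
    [1, 1],  # ann is in org_x and org_y
    [1, 0],  # bob is in org_x only
    [0, 1],  # cat is in org_y only
]


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Multiply two list-of-lists matrices."""
    b_columns = transpose(b)
    return [
        [sum(x * y for x, y in zip(row, column)) for column in b_columns]
        for row in a
    ]


# people x people: entry [i][j] counts the orgs that persons i and j share.
people_adj = matmul(incidence, transpose(incidence))

# orgs x orgs: entry [i][j] counts the people that orgs i and j share.
org_adj = matmul(transpose(incidence), incidence)

for name, row in zip(people, people_adj):
    print(name, row)
for name, row in zip(orgs, org_adj):
    print(name, row)
```

In both results the diagonal holds each node's own affiliation count (orgs per person, or people per org), so it is often zeroed out before treating the matrix as a graph.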

The text of the example problem reads thusly:

Download your preferred computer program and produce a data file for the data below. These data show the attendance of ten social workers at four national training sessions concerned with child welfare, professional ethics, record keeping and legal responsibilities. The attendance is as follows:

| Social Worker | Events Attended |
| --- | --- |
| Margery Allingham | Child welfare, Professional ethics, Record keeping, Legal responsibilities |
| Emily Bronte | Child welfare, Record keeping |
| Truman Capote | Professional ethics, Record keeping |
| Len Deighton | Child welfare |
| Mary Ann Evans | Legal responsibilities |
| Scott Fitzgerald | Professional ethics, Record keeping, Legal responsibilities |
| Elizabeth Gaskell | Professional ethics, Legal responsibilities |
| Geoffrey Household | Child welfare, Professional ethics |
| Hammond Innes | Child welfare, Legal responsibilities |
| Erica James | Child welfare, Record keeping |

Using the program print the incidence matrix and then generate and print the adjacency matrix of social workers.

If you feel brave, see if you can use the drawing facilities of the program to produce a sociogram.
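For anyone who wants to check their results, here is a rough sketch of the first two steps of the exercise (the incidence matrix and the adjacency matrix of social workers) in plain Python instead of a dedicated SNA program. It is not necessarily how the code in this repo does it. The variable names are made up for illustration, and event names are normalized to one spelling ("Record keeping").

```python
# Attendance data transcribed from the exercise above.
# Event names are normalized to a single capitalization (an assumption made here).
events = [
    "Child welfare",
    "Professional ethics",
    "Record keeping",
    "Legal responsibilities",
]
attendance = {
    "Margery Allingham": ["Child welfare", "Professional ethics",
                          "Record keeping", "Legal responsibilities"],
    "Emily Bronte": ["Child welfare", "Record keeping"],
    "Truman Capote": ["Professional ethics", "Record keeping"],
    "Len Deighton": ["Child welfare"],
    "Mary Ann Evans": ["Legal responsibilities"],
    "Scott Fitzgerald": ["Professional ethics", "Record keeping",
                         "Legal responsibilities"],
    "Elizabeth Gaskell": ["Professional ethics", "Legal responsibilities"],
    "Geoffrey Household": ["Child welfare", "Professional ethics"],
    "Hammond Innes": ["Child welfare", "Legal responsibilities"],
    "Erica James": ["Child welfare", "Record keeping"],
}
workers = list(attendance)

# Incidence matrix: one row per social worker, one column per event.
incidence = [
    [1 if event in attendance[worker] else 0 for event in events]
    for worker in workers
]

# Adjacency matrix of social workers: entry [i][j] is the number of
# events that workers i and j both attended (row i dotted with row j).
adjacency = [
    [sum(a * b for a, b in zip(row_i, row_j)) for row_j in incidence]
    for row_i in incidence
]

print("Incidence matrix (columns: " + ", ".join(events) + ")")
for worker, row in zip(workers, incidence):
    print(f"{worker:20s} {row}")

print("Adjacency matrix of social workers")
for worker, row in zip(workers, adjacency):
    print(f"{worker:20s} {row}")
```

The diagonal of the adjacency matrix counts how many events each worker attended; whether to keep it or set it to zero depends on what the program drawing the sociogram expects.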

aguestuser commented 6 years ago

^^-- @kconn, @aepyornis: care to farm out some experimental data science research questions to a couple of eager volunteers? what sorts of things would it be really cool if math could turn up for campaigners but that there's not really time to devote paid dev time to building during the week? dream up some problems and the LS Experimental R&D Team will dive into them!

(also: feel free to "unwatch" this repo. a lot of the commits will be really training-wheels-type Austin Learns Data Science type stuff.)

aguestuser commented 6 years ago

edit: you can farm out your dreamy experimental research problems to the LittleSis Deep Insights Team (TM)