trinker / qdap

Quantitative Discourse Analysis Package: Bridging the gap between qualitative data and quantitative analysis
http://cran.us.r-project.org/web/packages/qdap/index.html

Look at adding adjacency matrix #8

Closed trinker closed 12 years ago

trinker commented 12 years ago

Look at adding the adjacency matrix and an example with igraph showing social network (common words)

library(pacman)
p_load(stats4, rgl, tcltk, RSQLite, digest, graph, Matrix, igraph)
##########################################
set.seed(10)
X <- matrix(rpois(100, 1), 10, 10)
colnames(X) <- paste0("Guy_", 1:10)
rownames(X) <- c('The', 'quick', 'brown', 'fox', 'jumps', 
    'over', 'a', 'bot', 'named', 'Dason')

X                               #frequency matrix
Y <- X >= 1
Y <- apply(Y, 2, as, "numeric") #boolean matrix
rownames(Y) <- rownames(X)
Z <- Z2 <- t(Y) %*% Y           #adjacency matrix
Z2[!lower.tri(Z2)] <- NA        #blank the diagonal and upper triangle for display
Z2 <- Z2[-1, -ncol(Z2)]         #drop the now-empty first row and last column
print(Z2, na.print="", quote=FALSE)

colSums(Y)
###############
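# build a graph from the above matrix
# (graph.adjacency is the older name for graph_from_adjacency_matrix)
g <- graph_from_adjacency_matrix(Z, weighted=TRUE, mode ='undirected')
# remove self-loops (the diagonal of Z) and any multiple edges
g <- simplify(g)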
# set labels and degrees of vertices
V(g)$label <- V(g)$name
V(g)$degree <- degree(g)

#Plot a Graph
# set seed to make the layout reproducible
set.seed(3952)
layout1 <- layout.fruchterman.reingold(g)
plot(g, layout=layout1)
plot(g, layout=layout.kamada.kawai)
tkplot(g, layout=layout.kamada.kawai)

p_help(igraph)
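# scale label size by vertex degree
V(g)$label.cex <- 2.2 * V(g)$degree / max(V(g)$degree) + .2
V(g)$label.color <- rgb(0, 0, .2, .8)
V(g)$frame.color <- NA
# rescale log edge weights to (0, 1] for use as edge transparency and width
egam <- (log(E(g)$weight) + .4) / max(log(E(g)$weight) + .4)
E(g)$color <- rgb(.5, .5, 0, egam)
E(g)$width <- egam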
# plot the graph in layout1
plot(g, layout=layout1)

See these links: R data mining link, TS question, SO question

trinker commented 12 years ago

Make it take a text variable and a list of factors.

trinker commented 12 years ago

Takes a matrix instead (wfm or termco.d/termco.c). Complete, but I am still experimenting with igraph to see its potential.
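A minimal sketch of the matrix-based approach, in base R only. The function name `adjacency_sketch` and the toy data are hypothetical and are not qdap's actual implementation; the sketch assumes the input is a numeric word-by-group frequency matrix like the `X` in the first comment.

```r
# Hypothetical helper (not part of qdap): turn a word-by-group frequency
# matrix into a group-by-group adjacency matrix counting shared words.
adjacency_sketch <- function(freq) {
    stopifnot(is.matrix(freq), is.numeric(freq))
    present <- (freq >= 1) * 1   # 1 if the group used the word, else 0
    crossprod(present)           # equivalent to t(present) %*% present
}

# Toy frequency matrix: rows are words, columns are speakers (made-up data).
freq <- matrix(c(2, 0, 1,
                 0, 1, 1,
                 1, 1, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("quick", "brown", "fox"),
                               c("Guy_1", "Guy_2", "Guy_3")))

adj <- adjacency_sketch(freq)
adj
```

Each off-diagonal cell counts the words two speakers have in common, and the diagonal holds the number of distinct words each speaker used.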

trinker commented 12 years ago

See if I can stipulate the distances, as in:

The distance matrix has in position (i,j) the distance between vertices vi and vj. The
distance is the length of a shortest path connecting the vertices. Unless lengths of
edges are explicitly provided, the length of a path is the number of edges in it. The
distance matrix resembles a high power of the adjacency matrix, but instead of telling
only whether or not two vertices are connected (i.e., the connection matrix, which
contains boolean values), it gives the exact distance between them.

(from Wikipedia)

If so, this has the potential to use correlation values as the distances.
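A rough sketch of how that might look with igraph, whose `distances()` function returns shortest-path lengths and uses edge weights as lengths when they are supplied. The correlation values below are made up, and converting a correlation r to a length with 1 - r is only one possible choice, assumed here for illustration. Note that a perfect correlation would give a length of 0, which `graph_from_adjacency_matrix()` treats as "no edge".

```r
library(igraph)

# Toy correlation matrix between three speakers (made-up values).
r <- matrix(c(1.0, 0.8, 0.1,
              0.8, 1.0, 0.5,
              0.1, 0.5, 1.0),
            nrow = 3,
            dimnames = list(paste0("Guy_", 1:3), paste0("Guy_", 1:3)))

# Assumption: use 1 - r as the edge length, so highly correlated
# speakers end up close together. The diagonal becomes 0 (no self-loops).
d <- 1 - r

g <- graph_from_adjacency_matrix(d, mode = "undirected", weighted = TRUE)

# Shortest-path distances using the edge weights as lengths.
dist_mat <- distances(g, weights = E(g)$weight)

# Distances counted as number of edges, ignoring the weights.
hops <- distances(g, weights = NA)

dist_mat["Guy_1", "Guy_3"]   # about 0.7 (via Guy_2), shorter than the direct 0.9
```

The weighted result differs from the raw 1 - r matrix wherever a detour through another speaker is shorter than the direct edge, which is the behaviour the quoted definition describes.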