neuroelectro / neuroelectro_neurotree

integration between neuroelectro and neurotree

general analysis ideas #10

Open stripathy opened 8 years ago

stripathy commented 8 years ago

Adam Calhoun posted some simple analyses of the Cosyne Abstracts across time. Maybe they're inspiring for the kinds of things we'd want to do? https://neuroecology.wordpress.com/2016/02/23/cosyne2016-by-the-numbers/

rgerkin commented 8 years ago

Cool. I just asked him on twitter if he's got the code to share, since those results look right enough to me that I'd trust applying it here. Then we can cite his blog post!

stripathy commented 8 years ago

Yeah! @svdavid if you'll be at cosyne, you should try talking with Adam Calhoun about what we're doing. Adam's a really cool and nice guy. We had a 3 hour dinner at cosyne a couple years back talking about the neuroscience of why cats are so popular on the internets.

svdavid commented 8 years ago

That's pretty cool. Yeah I will be at Cosyne and will look him up.


rgerkin commented 8 years ago

@stripathy @svdavid I didn't hear from him, but I implemented something similar in 2cc98bc (https://github.com/neuroelectro/neuroelectro_neurotree/commit/2cc98bcfe8aaaf4520685ea9c90ab202a9056e45), which created this notebook: https://github.com/neuroelectro/neuroelectro_neurotree/blob/master/eigenvectors.ipynb. The results are reasonably intuitive.

I also tried non-negative matrix factorization and sparse PCA, but the results were less intuitive. I think part of the reason is that there aren't really obvious clusters in the network, so those algorithms don't get you much.
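For anyone following along, here is a minimal sketch of the general idea of taking eigenvectors of an adjacency matrix. It is not the code from the notebook. The toy rows, the `(p0, p1, p2)` tuple layout, the rule that every listed pair counts as an edge, and the function names are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy rows of (p0, p1, p2), where p0 is the closest common
# ancestor of p1 and p2. The real distance file may be laid out differently.
rows = [
    (100, 1, 2),
    (100, 1, 3),
    (100, 2, 3),
    (200, 4, 5),
    (300, 3, 4),
]


def adjacency_from_pairs(rows):
    """Build a symmetric 0/1 adjacency matrix over the p1/p2 nodes.

    Assumption: every (p1, p2) pair that appears is treated as one edge.
    """
    nodes = sorted({p for _, p1, p2 in rows for p in (p1, p2)})
    index = {node: i for i, node in enumerate(nodes)}
    adj = np.zeros((len(nodes), len(nodes)))
    for _, p1, p2 in rows:
        adj[index[p1], index[p2]] = 1.0
        adj[index[p2], index[p1]] = 1.0
    return nodes, adj


def top_eigenvectors(adj, k=2):
    """Return the k largest eigenvalues and their eigenvectors."""
    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    values, vectors = np.linalg.eigh(adj)
    order = np.argsort(values)[::-1][:k]
    return values[order], vectors[:, order]


nodes, adj = adjacency_from_pairs(rows)
values, vectors = top_eigenvectors(adj, k=2)
```

Each column of `vectors` gives one loading per node in `nodes`, so the people with the largest absolute loadings on a component are the ones that dominate it.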

One limitation is that many of the ancestors aren't actually in the adjacency matrix (because they aren't in the distance matrix). Only 58 of the 437 ancestors (as marked 'p0' in the distance file @svdavid provided) are also listed as nodes ('p1' or 'p2') in that file. So I'm not sure what the criteria were for inclusion/exclusion. Since neurotree is more of a tree than a bush, and there are more entries with each generation, the most connected people (i.e. having the most edges between themselves and other nodes) are likely to be the people approximately one generation in the past. I don't know if this is a bias we should be trying to correct, but I guess it depends what the point of all of this is.
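A rough sketch of how the overlap and the connectivity could be checked is below. The rows are made-up toy data, not the real distance file, and the assumptions are that each row is a `(p0, p1, p2)` tuple and that each row counts as one edge between p1 and p2.

```python
from collections import Counter

# Hypothetical toy rows of (p0, p1, p2); not taken from the real file.
rows = [(1, 2, 3), (1, 2, 4), (5, 2, 6), (7, 1, 6)]


def ancestor_overlap(rows):
    """Return the ancestors that also appear as nodes, and all ancestors."""
    ancestors = {p0 for p0, _, _ in rows}
    nodes = {p for _, p1, p2 in rows for p in (p1, p2)}
    return ancestors & nodes, ancestors


def node_degrees(rows):
    """Count edges per node, treating each row as one p1-p2 edge."""
    degrees = Counter()
    for _, p1, p2 in rows:
        degrees[p1] += 1
        degrees[p2] += 1
    return degrees


shared, ancestors = ancestor_overlap(rows)
degrees = node_degrees(rows)
print(f"{len(shared)} of {len(ancestors)} ancestors are also nodes")
print("most connected:", degrees.most_common(1))
```

Running the same two functions on the real file would show whether the most connected nodes cluster in one generation.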

svdavid commented 8 years ago

Is the fact that p0 is not in the NE author list a problem? p0 is the id of the person in the tree who joins p1 and p2 (i.e., the closest common ancestor). In my mind, the value of p0 was simply a reference for clustering. Is there some other info you'd like about those nodes to make your analysis work?
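To make the p0 definition concrete, here is a small sketch of finding a closest common ancestor. It assumes a simplified tree in which each person has exactly one mentor and there are no cycles, which may not hold in neurotree. The `parent` mapping and the names in it are hypothetical.

```python
def ancestor_chain(parent, node):
    """Return node followed by its ancestors, nearest first."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def closest_common_ancestor(parent, a, b):
    """Return the nearest ancestor shared by a and b, or None if there is none."""
    seen = set(ancestor_chain(parent, a))
    for candidate in ancestor_chain(parent, b):
        if candidate in seen:
            return candidate
    return None


# Hypothetical mentor relationships: child -> mentor.
parent = {"c": "a", "d": "a", "e": "c"}
p0 = closest_common_ancestor(parent, "d", "e")
```

Here `p0` plays the role described above: the person who joins the two nodes.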

stephen


rgerkin commented 8 years ago

@svdavid Right, I was thinking that the p0-type people would become the eigenvectors if they were included, so maybe they should be? What was the cutoff, anyway? Being in neuroelectro? Being in PubMed?

stripathy commented 8 years ago

@rgerkin I think the cutoff was "being a last author in neuroelectro", but we discussed whether it made more sense for this matrix to include the p0 people as well (I think it should).

svdavid commented 8 years ago

@rgerkin @stripathy See comment about fingerprint_mtx in #2. I think this may resolve the issue?