I'm going to check off stuff as I get to it.
For 1st point see 145ed50.
I'm going to think about the second point. I don't feel the absolute errors will be especially meaningful since those will change with $N$, $M$, and $B$. I think we really need to point out that, since we're scaling, a flat line is really good for $\hat{P}$. If you have more specific ideas let me know.
I think point 3 will be solved by point 4.
For point 5: we say that we simulated the results and that they perform comparably at N=500 and M=100. We decided earlier not to include the numerics since they basically overlapped with the theory.
these changes are great. here are a few more minor ones:
note that i'm doing it this way so that you can see the thought process. it would be easier for me to actually make the change, but i believe this is more helpful. open to other suggestions.
Often, the sample or cohort size is relatively small, whereas the number of potential edges is much larger.
- [x] "nowawads" is too informal.
- [x] naive --> na\"ive
- [x] "some bias with greatly" --> "some bias BUT greatly"
- [x] add citation for BV trade-off, eg, trunk
- [x] after stein, mention the explicit result, it is amazing and will get them thinking
- [x] "doesn't close the door" is too informal
- [x] "weigted adjacency matrix with weights given by the proportion of times the 26 corresponding edge appears in the population." i wouldn't say that. we define the mean graph as the Expectation of A with respect to the distribution of A. right? it holds for any graph distribution, it is just the first moment of the distribution.
- [x] "Intuitively, an estimator incorporating the mean-graph 39 structure is preferable to the entry-wise MLE." sentence is weird to me. what is "mean graph structure"? i tend to think of "mean" as an estimator, and "expectation" as the population statistic/property of the distribution. but basically, i'd say an estimator incorporating properties of the distribution is preferable assuming it is computationally tractable.
- [x] "Using the estimates of the latent positions based on a truncated eigen-decomposition 52 of the adjacency matrix, in the RDPG setting we consider an estimator for the mean of 53 the collection of graphs which captures the low-rank structure of the RDPG model. 54" run-on sentence?
- [x] "real data analysis that it frequently outperforms the element-wise MLE" try to avoid the word "it" whenever possible, and use the name of our estimator (did we name it, we need to name it)?
- [x] "small sample size" we use that term often. however, without knowing the number of vertices, it is always a relative term. i think we should clarify (small sample size for a given graph size), or something like that.
- [x] "Each vertex represent " missing 's'
- [x] "Each vertex represent a well defined anatomical region present in each subject, and an 68 edge between two regions is defined to exist if correlation in activity between the regions 69 surpasses a certain threshold. Similarly, for structural brain imaging an edge may 70 represent the presence of anatomical connections between the two regions." is weird. let's be super clear and accurate. we don't consider any correlation data, so that comes out of nowhere. if we want to say that, we should put it in context. same with the structural data, mention fMRI or diffusion MRI. i can fix it up after you take a crack at it.
- [x] "We consider three nested models" SBM & RDPG are not nested. they overlap. positive definite SBM is a special case of RDPG. let's be correct, we are setting an example for them.
- [x] "mean graph is the For this case, " somehting screwed up
- [x] "For this case, we aim to estimate the mean matrix P = E[A(m)] 80 base on the observed adjacency matrices A(1),...,A(M)." why is there an "^(m)" in the Expectation? that does not seem right. i don't really understand what this sentence is trying to do though?
- [x] "njoys the 86 many asymptotic properties of the MLE as M → ∞. " for fixed "n", right? should we say that?
- [x] S2.2 says that we don't exploit graph structure. true, but we haven't introduced the possibility of graph structure. i would re-order: IEM, SBM, RDPG, and then maybe LPG. in methods, always go from conceptually most simple to more complicated. having taught this to neuroscientists many times, i can assure you the order is IEM, SBM, RDPG, and then LPG (many don't know dot products, and certainly haven't been introduced to kernels, etc.). then, i would introduce our estimators. note that when we move SBM, the text will change to elaborate. eg, discuss SBM in the context of a mixture of ER graphs, provide some intuition as to why this is the simplest possible generalization of ER graphs, and the SBM as a RDPG goes after the RDPG section.
- [x] "Additionally, there are no useful 91 asymptotic properties for A ̄ as the number of vertices N becomes large." i don't think this is true. i bet as long as N/M --> 0, we still have a bunch of useful asyptotics.
- [x] in general, i provide intuition to this community before equations, eg, line 112.
- [x] alg 1 (see the sketch after this list):
  - what is "kmax"?
  - line 4 is confusing. we don't use Abar+D, we select the dimension of it using something else. specify how?
  - similar issue for line 5: specify how
  - output is typically right beneath input, neither has a line number
- [x] in lemma 3.1, i don't understand why we have 2 claims. it seems like the latter covers the former?
- [x] theorem 3.1 refers to lemma A.3, which has not yet been referenced.
- [ ] "Also, the ARE does not 204 depend on the number of graphs M, so the larger the graphs are, the better Pˆ is 205 relative to A ̄, regardless of M. " this was a surpising result, right? not in retrospect perhaps, but we didn't realize that would be true?! we should highlight that! in general, we should be more clear about our contributions, we defined a new estimator, we proved RE results, in particular, that do not depend on M!
there are more minor notes coming.
Relative efficiency (for squared-error loss) is the ratio of MSEs. MSEs themselves are of course expectations. https://en.wikipedia.org/wiki/Efficiency_(statistics)
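Concretely, one natural way to write it with squared-error loss applied entrywise (standard definitions; which direction the ratio goes is a convention the paper should state explicitly):

$$\mathrm{MSE}(T) \;=\; \mathbb{E}\,\bigl\lVert T - P \bigr\rVert_F^2, \qquad \mathrm{RE}\bigl(\hat{P}, \bar{A}\bigr) \;=\; \frac{\mathrm{MSE}(\bar{A})}{\mathrm{MSE}(\hat{P})},$$

so under that convention $\mathrm{RE} > 1$ means $\hat{P}$ beats $\bar{A}$.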
oh right, mean square error. my bad.
I think the naive vs na\"ive doesn't matter. Both are acceptable, with naive being more common. http://www.merriam-webster.com/dictionary/naive
I didn't change the order of IEM, RDPG, SBM but I did provide more intuition for the RDPG.
I guessed at what this was referring to:
> in general, i provide intuition to this community before equations, eg, line 112.
And tried (poorly ;-)) to implement this in general
Plos wants Fig. X
@TangRunze @jovo
"Rank-based methods and robust likelihood methods could be very useful in that case. " provide a citation.
I think we want to cite Runze's in prep paper?
i would have cited Huber and Lq likelihood papers that exist.
Ahh, ok cool that's easy. I was thinking too specifically.
on the real data, we do a bunch of great analysis, but we skipped that on the theory/simulated data. here are some related specific comments:
second, even though our estimators don't have better MSE, they are better in other ways. in particular, the low-rank approximations yield better interpretability and enable us to more easily identify vertices that are interesting, etc. we probably want a sentence about each of those in the results, and then a few more in the discussion.
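purely as a hypothetical illustration of the "identify interesting vertices" point (not claiming this is what the paper does): once we have the rank-d estimate, every vertex carries a d-dimensional latent position, and flagging atypical vertices is only a few lines:

```python
import numpy as np

def flag_vertices(P_hat, d, n_flag=5):
    """Hypothetical illustration: use the rank-d structure of P_hat to rank
    vertices by how far their estimated latent position sits from the
    average position. Not the paper's procedure, just one way the low-rank
    estimate makes vertex-level summaries easy to read off."""
    evals, evecs = np.linalg.eigh(P_hat)
    order = np.argsort(np.abs(evals))[::-1][:d]
    # Estimated latent positions: rows of X_hat (N x d).
    X_hat = evecs[:, order] * np.sqrt(np.abs(evals[order]))
    scores = np.linalg.norm(X_hat - X_hat.mean(axis=0), axis=1)
    # Indices of the most atypical vertices, largest score first.
    return np.argsort(scores)[::-1][:n_flag]
```

something in that spirit, plus a sentence on interpretability, would make the added value of the low-rank structure concrete in the results.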