Closed editorialbot closed 2 years ago
Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.
For a list of things I can do to help you, just type:
@editorialbot commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@editorialbot generate pdf
Software report:
github.com/AlDanial/cloc v 1.88 T=0.03 s (1302.3 files/s, 183683.1 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
R 29 496 1142 2504
Markdown 5 132 0 464
TeX 1 55 0 393
Rmd 1 76 115 169
YAML 4 14 8 74
-------------------------------------------------------------------------------
SUM: 40 773 1265 3604
-------------------------------------------------------------------------------
gitinspector failed to run statistical information for the repository
Wordcount for paper.md
is 1855
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.1016/0304-4076(86)90002-3 is OK
- 10.1007/978-1-4899-3242-6 is OK
- 10.18637/jss.v023.i07 is OK
- 10.1007/978-0-387-21706-2 is OK
- 10.18637/jss.v032.i10 is OK
- 10.18637/jss.v027.i08 is OK
- 10.18637/jss.v012.i03 is OK
- 10.5281/zenodo.3879620 is OK
MISSING DOIs
- 10.21236/ada208388 may be a valid DOI for title: Sampling-based approaches to calculating marginal densities
- 10.1201/b14835-11 may be a valid DOI for title: Strategies for improving MCMC
- 10.1201/9781584887218 may be a valid DOI for title: Applied Bayesian hierarchical methods
- 10.1007/978-3-642-21295-6_2 may be a valid DOI for title: Practical probabilistic programming
- 10.1016/j.jmva.2009.04.008 may be a valid DOI for title: Generating random correlation matrices based on vines and extended onion method
- 10.21105/joss.01601 may be a valid DOI for title: greta: simple and scalable statistical modelling in R
- 10.1613/jair.5371 may be a valid DOI for title: Learning discrete Bayesian networks from continuous data
- 10.3390/e19100555 may be a valid DOI for title: The prior can often only be understood in the context of the likelihood
- 10.1093/ije/dyw341 may be a valid DOI for title: Robust causal inference using directed acyclic graphs: the R package dagitty
INVALID DOIs
- None
:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:
In Figure 1, why is there an edge from the error of the estimates?
Thx for the question @ChristopherLucas. Let me know if my answer below makes sense.
The edge reflects a parent-child relationship -- meaning the probability density for treatment effect $y_i$ is conditional on a given $\sigma_i$. The fact that we know $\sigma_i$ is slightly weird and it is because the famous eight-schools problem is slightly weird (at least in my eyes). The observations, $y_i$, are actually sampled means as opposed to actual student test scores. Hence, the eight observations of treatment effect, $y_i$, are assumed normally distributed around some "true" school-level effect ($\theta_i$) which will vary based on which students from school $i$ are sampled to receive the treatment. The sampling variance is assumed to be known "because the sample sizes in all eight experiments were relatively large" (Gelman et al. 2014). Thus $y_i \sim Normal(\theta_i,\sigma_i)$ where for our generative model $y_i$ and $sigma_i$ are known/observed and $\theta_i$ (i.e. a true school level effect) is the latent parameter of interest.
For the above quote and more description, see Gelman et al. (2014) p. 119-120 (available here: http://www.stat.columbia.edu/~gelman/book/BDA3.pdf) and relevant paragraph shown below:
@editorialbot add @zhiiiyang as reviewer
@zhiiiyang added to the reviewers list!
@flyaflya - this is a great piece of work. Congratulations. Comments to follow:
R code in paper should be edited so it can be copied and pasted as is see https://github.com/flyaflya/causact/issues/45
@terrytangyuan and @flyaflya - I think the paper is ready for acceptance conditional on the issues I have opened being addressed to my satisfaction (see above)
Congratulations again on a great package.
@zhiiiyang . Thx for agreeing to review and I am working on your checklist. Below, I want to address your question about the second author's contribution.
The second author is my PhD advisee who has been working on this project alongside me since 2019. She authored the first draft of our submitted paper and has contributed to the development and thought process behind the causact
package by developing a parallel backend using Julia instead of Tensorflow/Python. Her main code contributions have not been merged as I am not ready to support the Julia backend quite yet (just supporting the greta
backend for now). You can see evidence of her extensive coding efforts here: https://github.com/flyaflya/causact/compare/master...Rosenguyen0411:master.
@joethorley - Thanks a million for all the helpful suggestions. I think I have addressed your concerns and I went ahead and closed all the issues with comments regarding the commits that fix the raised issues. Please have a look to make sure they are all okay. I can reopen any issues that were not resolved properly or need further improvement (and my apologies if I should not have closed them ... not sure of the protocol for whether I close them when I think they are complete or when you say they are resolved satisfactorily).
It is possible I'm misunderstanding the example or the DAG, and maybe @joethorley or @zhiiiyang can weigh in, but I don't understand why there's an edge from the estimated SE. E.G., the treatment effect is not dependent on my estimates of parameters of the sampling distribution of the estimator.
Again, it's possible I'm just misunderstanding the example, but presumably others might as well.
@flyaflya, thank you for addressing my comments. I have encountered a fatal crash when using greta
and the maintainers of greta
are trying to help me debug https://github.com/greta-dev/greta/issues/526#issuecomment-1144470814. I will return the complete review reports once we figure it out.
@flyaflya - thanks for resolving and closing my issues.
I'm am happy with it being published.
@ChristopherLucas My understanding is that the edge is there because the values of the treatment effect depends on the expected treatment effect and the standard deviation of the variation in the treatment effect.
@ChristopherLucas A nearly equivalent representation of the eight-schools problem can be shown like this:
(variable()
represents a flat prior). This picture above exactly matches the description in Gelman et al. (2014) p. 119-120 (available here: http://www.stat.columbia.edu/~gelman/book/BDA3.pdf). If this representation seems more helpful, I can use it in place of the one that is there already?
I am with you that typically the "the treatment effect (y)" should not be "dependent on my estimates of parameters (i.e. sigma) of the sampling distribution of the estimator." The only reason the eight-schools problem looks like this is due to this key assumption articulated in Gelman: "the sampling variances [i.e. the std error of effect] are known, for all practical purposes." Thus, the measured treatment effect is dependent on the assumed-to-be-known parameters of the sampling distribution of the estimator.
I could use a different dataset/model to showcase causact
, but this dataset was used because it commonly appears in the computational Bayesian inference literature and the graphical illustration of the model helps to initiate conversations about assumptions like the conversation we are having here. Examples of eight schools used as demo model:
rstan
Vignette: https://cran.r-project.org/web/packages/rstan/vignettes/rstan.htmlPS: The justification for the simplifiying asumption that the estimated sigma from the data is assumed to be equivalent to the "true" sigma of the sampling distribution is because the treatment effect (y) is actually an observed mean treatment effect over many (>30) individual student performances at each school. Presumably, if you run the experiment with a different set of students, the sampling distribution of their mean treatment effect $y \sim Normal(theta,sigma)$.
@terrytangyuan, I just found out that I can't use this package since it depends on greta
failing to run on an M1 Macγ:(. This has been confirmed by the maintainer. Could you remove me from the reviewer list? Sorry about the inconveniences.
https://github.com/greta-dev/greta/issues/526#issuecomment-1151903419
@zhiiiyang - that is good to know - is there any sense of when this might be fixed?
@editorialbot remove @zhiiiyang from reviewers
@zhiiiyang removed from the reviewers list!
@joethorley the greta maintainer asked me to follow this issue https://github.com/greta-dev/greta/issues/458
@terrytangyuan Even though @zhiiiyang is no longer an official reviewer, her comments/suggestions were much appreciated and there is a new CRAN release of causact
as of today to support adding a vignette. You can see more details about my response to @zhiiiyang 's review here: https://github.com/flyaflya/causact/issues/43#issuecomment-1155419729
Hi all - just wanted to gently check on the progress of this review. @ChristopherLucas does https://github.com/openjournals/joss-reviews/issues/4415#issuecomment-1147543764 help ease your concerns about the presented model in the paper? @terrytangyuan is there any additional outstanding comments/issues that I can help address?
Thanks to all for the help!
@editorialbot assign @arfon as editor
:wave: folks. As @terrytangyuan is stepping down from the JOSS editorial team (thanks for all of your help over the years Yuan!) I'm going to be taking over this review.
Based on my reading of the review thread thus far, it looks like there has been some feedback from @flyaflya in response to @ChristopherLucas' feedback. @ChristopherLucas β could you take another look and see if you're able to complete your review at this stage?
Many thanks!
Assigned! @arfon is now the editor
@ChristopherLucas A nearly equivalent representation of the eight-schools problem can be shown like this: (
variable()
represents a flat prior). This picture above exactly matches the description in Gelman et al. (2014) p. 119-120 (available here: http://www.stat.columbia.edu/~gelman/book/BDA3.pdf). If this representation seems more helpful, I can use it in place of the one that is there already?I am with you that typically the "the treatment effect (y)" should not be "dependent on my estimates of parameters (i.e. sigma) of the sampling distribution of the estimator." The only reason the eight-schools problem looks like this is due to this key assumption articulated in Gelman: "the sampling variances [i.e. the std error of effect] are known, for all practical purposes." Thus, the measured treatment effect is dependent on the assumed-to-be-known parameters of the sampling distribution of the estimator.
I could use a different dataset/model to showcase
causact
, but this dataset was used because it commonly appears in the computational Bayesian inference literature and the graphical illustration of the model helps to initiate conversations about assumptions like the conversation we are having here. Examples of eight schools used as demo model:* Sturtz, S., Ligges, U., & Gelman, A. (2005). R2WinBUGS: a package for running WinBUGS from R. Journal of Statistical software, 12, 1-16 (https://www.jstatsoft.org/index.php/jss/article/view/v012i03/33). * `rstan` Vignette: https://cran.r-project.org/web/packages/rstan/vignettes/rstan.html
PS: The justification for the simplifiying asumption that the estimated sigma from the data is assumed to be equivalent to the "true" sigma of the sampling distribution is because the treatment effect (y) is actually an observed mean treatment effect over many (>30) individual student performances at each school. Presumably, if you run the experiment with a different set of students, the sampling distribution of their mean treatment effect yβΌNormal(theta,sigma).
This all makes sense to me. I still worry that researchers regularly working with DAGs will find it confusing to mix the causal structure with assumptions about the estimator for estimands denoted by that structure. I think the authors know precisely what they mean by the example, and that they mean something sensible, but I do think it will be unclear to many researchers who regularly work with DAGs but not with Bayesian models. I'd probably use a simpler example and perhaps invoke the 8-schools problem elsewhere in the docs, where there is more space to clarify what's going on. E.G., if I were to set up an example of the eight-schools problem, I'd write the estimand in potential outcomes notation, then denote a DAG (or several DAGs) under which a particular estimator returns an estimate of that estimand with some characterizable properties wrt bias/variance. That's probably a bit outside the bounds of a software submission, so I don't think it necessary here, but I do worry that as written many interested readers will be unnecessarily confused.
Roger that @ChristopherLucas. I will switch to a more straight forward example. I too experience some discomfort with 8-schools and I think you are right that more readers will benefit from a different example. I'll revise the manuscript accordingly. Thanks for pushing on this!
Congrats, again, on a great package. I spent some time looking through the docs yesterday - awesome materials, some of which I'll probably use in one of my grad courses.
Thanks @ChristopherLucas for the encouragement. I switched the example model to use another well known dataset without the weird modelling of standard errors. This time I am using a medium complexity version of the chimpanzee example from McElreath's Statistical Rethinking. Experiment discussed here: https://youtu.be/n2aJYtuGu54?t=513
The updated paper is an artifact from the last action and can be downloaded here:
https://github.com/flyaflya/causact/suites/7535235027/artifacts/310574849
This looks great to me. Congrats on the package and paper!
Fixed a few minor typos and the most recent manuscript can be found as the artifact here: https://github.com/flyaflya/causact/actions/runs/2743355474
@arfon, thanks for taking over as editor. I believe the review process is nearing its conclusion (checklists complete and all issues have been addressed), so please let me know what I else I can do to help shepherd things along. Thanks again!
@arfon, thanks for taking over as editor. I believe the review process is nearing its conclusion (checklists complete and all issues have been addressed), so please let me know what I else I can do to help shepherd things along. Thanks again!
I think we're in good shape here. At this point could you make a new release of this software that includes the changes that have resulted from this review. Then, please make an archive of the software in Zenodo/figshare/other service and update this thread with the DOI of the archive? For the Zenodo/figshare archive, please make sure that:
I can then move forward with accepting the submission.
This is great news. Thanks to all!!
@arfon the archived DOI is: 10.5281/zenodo.6949489
A link to the Zenodo archive is here: https://zenodo.org/record/6949489#.Yuf-THbMKUk
@editorialbot set 10.5281/zenodo.6949489 as archive
Done! Archive is now 10.5281/zenodo.6949489
@editorialbot recommend-accept
Attempting dry run of processing paper acceptance...
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):
OK DOIs
- 10.5281/zenodo.3879620 is OK
- 10.18637/jss.v012.i03 is OK
MISSING DOIs
- 10.1201/9781584887218 may be a valid DOI for title: Applied Bayesian hierarchical methods
- 10.21236/ada208388 may be a valid DOI for title: Sampling-based approaches to calculating marginal densities
- 10.1201/b14835-11 may be a valid DOI for title: Strategies for improving MCMC
- 10.21105/joss.01601 may be a valid DOI for title: greta: simple and scalable statistical modelling in R
- 10.1007/978-3-642-21295-6_2 may be a valid DOI for title: Practical probabilistic programming
- 10.1093/ije/dyw341 may be a valid DOI for title: Robust causal inference using directed acyclic graphs: the R package βdagittyβ
INVALID DOIs
- None
:wave: @openjournals/joss-eics, this paper is ready to be accepted and published.
Check final proof :point_right::page_facing_up: Download article
If the paper PDF and the deposit XML files look good in https://github.com/openjournals/joss-papers/pull/3425, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept
@editorialbot accept
Doing it live! Attempting automated processing of paper acceptance...
π¦π¦π¦ π Tweet for this paper π π¦π¦π¦
π¨π¨π¨ THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! π¨π¨π¨
Here's what you must now do:
Any issues? Notify your editorial technical team...
@joethorley, @ChristopherLucas, @zhiiiyang β many thanks for your reviews here and to @terrytangyuan for editing this submission! JOSS relies upon the volunteer effort of people like you and we simply wouldn't be able to do this without you β¨
@flyaflya β your paper is now accepted and published in JOSS :zap::rocket::boom:
Submitting author: !--author-handle-->@flyaflya<!--end-author-handle-- (Adam Fleischhacker) Repository: https://github.com/flyaflya/causact/ Branch with paper.md (empty if default branch): Version: v0.4.1 Editor: !--editor-->@arfon<!--end-editor-- Reviewers: @joethorley, @ChristopherLucas Archive: 10.5281/zenodo.6949489
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@joethorley & @ChristopherLucas, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @terrytangyuan know.
β¨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest β¨
Checklists
π Checklist for @joethorley
π Checklist for @zhiiiyang
π Checklist for @ChristopherLucas