andkov opened 8 years ago
Hello all,
Thank you for all of your incredible work in putting this unprecedented project together! Now that we have the final tabulated results for the pulmonary data completed, we are working to complete a single, strong paper integrating the systematic review and meta-analysis, co-authored with each of the study investigators that contributed results.
For each study listed below (except ILSE and NUAGE which do not have pulmonary data), please respond by November 1 with the following information.
Thank you very much and I am looking forward to hearing from you all soon!
Emily
Thanks @andkov ! Can you please switch out EAS-Word link? It currently points to the phys-phys report.
@andkov - I wonder whether some (or all!) of the jumping around of "Process A" estimates in the different "Process B" pairings is due to different N. How difficult is it to add the N for each model at, e.g., the top of the respective columns?
(I suppose that in some situations with equal N, the N is actually made up of different subsets of individuals, so the best way to do this in the future is to have analysts ensure that they are using the same subset of individuals in each analysis.)
Hi Emily,
Thanks for putting it together. It is a tremendous amount of work!
I have conducted the analyses using the Health and Retirement Study (HRS). Please find my information and affiliation below:

Name: Chenkai Wu, MPH, MS
Affiliation: School of Biological and Population Health Sciences, Oregon State University, Corvallis, OR, USA
Email: wuche@oregonstate.edu
One-paragraph description of the HRS:
The HRS included 3 waves of pulmonary function (peak expiratory flow) and measures of cognitive function (immediate recall, delayed recall, and serial 7's test) collected every 4 years since 2006 among non-institutionalized US adults aged 50-104 years. The HRS is a co-operative agreement between the National Institute on Aging and the University of Michigan and aims to describe changes in life patterns through the retirement transition among US adults by collecting information about their health conditions, family network, social relations, financial situation and employment status (Juster & Suzman, 1995). Ethical approval was obtained from the University of Michigan Institutional Review Board. Further details about the recruitment strategies, design and sampling approaches of the HRS have been documented elsewhere (Heeringa & Connor, 1995; Crimmins et al., 2008).
References: Juster FT, Suzman R. An overview of the Health and Retirement Study. J Human Resources 1995:S7–S56.
Heeringa SG, Connor JH. Technical description of the health and retirement survey sample design. Ann Arbor: University of Michigan, 1995.
Crimmins E, Guyer H, Langa K, et al. HRS Documentation Report. Ann Arbor, MI: Institute for Social Research, University of Michigan; 2008. Documentation of physical measures, anthropometrics and blood pressure in the health and retirement study. Report No.: DR-011.
Please include the following two funding acknowledgements for the HRS (required for publications using data from the HRS):
I have no conflicts of interest to declare.
Thanks again for your great work. I look forward to reviewing the manuscript when it is ready.
Please let me know if you have any questions or need any help.
Best,
Chenkai
> Can you please switch out EAS-Word link? It currently points to the phys-phys report.

I don't remember doing it, but it seems that it now points to the correct report. @ampiccinin , please let me know if you still encounter this issue. I think I might have corrected it and forgotten about it.
@ampiccinin
> I wonder whether some (or all!) of the jumping around of "Process A" estimates in the different "Process B" pairings is due to different N. How difficult is it to add the N for each model at, e.g., the top of the respective columns?

The N is already there, on the 6th row from the bottom. Or did you mean something else? Changing the location is certainly possible, but a bit messy.
> (I suppose that in some situations with equal N, the N is actually made up of different subsets of individuals, so the best way to do this in the future is to have analysts ensure that they are using the same subset of individuals in each analysis.)

Yes, this should certainly be remembered. However, I'm not sure about ensuring the same subset. Wouldn't this further reduce the sample size? It would mean that only those individuals who have data on ALL measures would be included in the analysis. So if a person has data for, say, `grip`, `block`, `digits_f`, `digits_b`, and `raven`, but **not** for `fluency`, the case would be discarded and not used even for the measures for which the data exist. The more measures in the set, the more likely that some of the data would be missing, meaning that studies that bring in many measures would be "penalized" by a reduced sample size. I'm troubled by this and can't think of a solution.
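To illustrate the concern with simulated data (the measure names mirror those mentioned above, but the 20% missingness rate and all values are made up), a complete-case rule shrinks the analyzable N with each measure added:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 1000
measures = ["grip", "block", "digits_f", "digits_b", "raven", "fluency"]

# Simulated scores with ~20% missingness, independently in each measure.
df = pd.DataFrame(
    {m: np.where(rng.random(n) < 0.2, np.nan, rng.random(n)) for m in measures}
)

# Complete-case N after requiring each additional measure:
for k in range(1, len(measures) + 1):
    print(f"{k} measure(s): N = {len(df[measures[:k]].dropna())}")
```

With independent 20% missingness per measure, the complete-case N falls roughly as 0.8 to the power of the number of measures, which is exactly why studies contributing many measures would be "penalized" most.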
@andkov
oops - sorry I had somehow missed seeing the N. Please disregard my comment.
I agree that we do not want to reduce the sample size unnecessarily, and we would then risk presenting analyses based on a more select sample. However, it can become very difficult to interpret output that we tend to assume refers to the same people when in fact it does not. I was referring specifically to the apparent inconsistency between estimates of the same submodel (e.g., pulmonary function trajectories) paired with alternate cognitive submodels. Whereas I would have assumed that the pulmonary submodel should produce the same estimates regardless of the paired cognitive outcome, we sometimes find different estimates, so I wonder to what extent this is driven by differences in sample/size across the various bivariate pairings. There is some benefit to avoiding having to consider multiple estimations of the same submodel (the whole "process A" part of your tables). There is basically no universal solution, but it is worth noting whether an N difference could explain the unexpected (by me) variation among estimates. (Note that the MAP output, which has identical N across models, seems very consistent across the different pairings (sd=0).)
@ampiccinin
> There is basically no universal solution, but it is worth noting whether an N difference could explain the unexpected (by me) variation among estimates.

Good point. I agree: if we observe jumping coefficients, restricting the data to the same subset will show whether the difference is due to sample size. The Generation 2 scripts that we developed with @wibeasley will make this much easier in the future.
> I was referring specifically to the apparent inconsistency between estimates of the same submodel (e.g., pulmonary function trajectories)

Yes, I saw some inconsistencies too, but nothing struck me as too drastic. Besides, I just glanced at all the tables (for females) and noticed that only OCTO and SATSA have minor discrepancies in N. The rest of the studies have the same N across `aehplus` models, while minor fluctuations in the estimates of the pulmonary submodel are still observed. So it seems to me that these fluctuations cannot be attributed to different subsets (unless the same N is coincidental and the persons are in fact different, which doesn't seem likely given such consistency across models and studies).
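Since an equal N can coincidentally mask different subsets, one cheap diagnostic is to compare the actual person IDs used by each bivariate model rather than just the counts. A minimal sketch with hypothetical model names and ID sets (in practice the IDs would be read from each model's analysis file):

```python
# Hypothetical ID sets per bivariate model.
model_ids = {
    "pef_grip":  {101, 102, 103, 104, 105},
    "pef_block": {101, 102, 103, 104, 106},  # same N, different people
    "pef_raven": {101, 102, 103, 104, 105},
}

reference = model_ids["pef_grip"]
for name, ids in model_ids.items():
    print(f"{name}: N = {len(ids)}, "
          f"same N = {len(ids) == len(reference)}, "
          f"same people = {ids == reference}")
```

The `same people` flag catches exactly the case discussed above: `pef_block` matches the reference N but uses a different individual.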
Hi Emily,
See attached file.
Best,
Marcus
@andreazammit @ampiccinin
The new EAS files for PEF have been processed and incorporated into the seed report and the correlation table. Let me know if you spot any funny things in the output.
@andkov : odd thing - SATSA male grip-Information in the seed report - somehow the female output is repeated here, rather than the male output
@ampiccinin, thanks for pointing this out, I will take a look.
@andkov - quick question: I see that Lewina uploaded NAS yesterday. When I download seed-NAS, however, the date inside is 11-4. I think it is because the new files are in NAS, not NAS\physical-cognitive. Is this correct, and can I move them there without breaking anything? The filenames are different (my fault) - I hope they are correct according to the newer naming convention? Thanks! A
@andkov - ditto for new EAS and MAP. Please let me know when the seed reports point to the new files so I can review them. Thanks! A
@ampiccinin, I haven't had a chance to process them yet. Don't worry about moving files; I will take care of them tomorrow and will notify you about the updated seed reports. Should be no trouble. Cheers!
@andkov : NAS smoking variable (smkevr) is not showing up in the seed report. It appears in the GitHub output files. Please update the seed report to include it. Thanks!
@andkov : I notice that information, synonyms and psif are in GitHub, but not in the seed report. Can you please correct this? Are there any others we have missed? Thanks!
@annierobi : can you please identify why the slope variance and especially the SE of the slope variance for OCTO PEF are so much lower in the digit_f model than in any of the others? The sample sizes for these models do not differ by more than 5 people, yet the estimates are quite different. thanks!
Var (Slope) estimate (SE), p, across the five models: 37.14 (18.63), .05; 38.96 (23.14), .09; 27.92 (11.78), .02; 35.05 (21.25), .10; 39.41 (16.27), .01
I just went over the output files to compare and make sure i hadn't missed anything. I see no mistakes in the input file. It is identical to all the other files and uses the same data set. This was one of the problematic models where the correlations and the covariances didn't agree. That is, the slope-slope corr is highly significant but not the covariance. Let me think about it some more.
Thanks
Annie
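One general point that may be relevant here (a standard property of Wald tests, not a claim about these specific models): the slope-slope correlation is the covariance rescaled by the slope SDs, and the two quantities carry different (typically delta-method) standard errors, so their z-tests can disagree, particularly when a variance is estimated near zero relative to its SE, as in the digit_f column.

```latex
% Slope--slope correlation in terms of the covariance:
r_{ss} = \frac{\operatorname{cov}(s_1, s_2)}{\sqrt{\operatorname{var}(s_1)\,\operatorname{var}(s_2)}}
% The Wald statistics z = \hat{\theta} / SE(\hat{\theta}) for cov(s_1, s_2)
% and for r_{ss} use different standard errors, so one can cross the
% significance threshold while the other does not.
```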
Thanks, @ampiccinin. Noted.
@ampiccinin
> I notice that information, synonyms and psif are in GitHub, but not in the seed report. Can you please correct this?

(I presume you mean the OCTO study.) Yes, thanks for catching this. @annierobi and I were sorting the outputs in order to create an authoritative source of valid models, that is, the models that should be included in the analysis. Over time, some of the models had to be re-run, which created old and new versions of the same models. The scripts choke when they encounter models with identical names, so we kept the valid models in the folder `./studies/octo/physical-cognitive/Final/` and, not surprisingly, missed a few as we were assembling the "final" list of models. My fear of deleting any models is to blame. Let me think about how I can sort it out.
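To keep the scripts from choking on duplicate model names, a small pre-flight check could flag any output name that appears in more than one folder before processing begins. A sketch in Python (the project's scripts are in R; the paths below are illustrative, not the repo's actual file list):

```python
from collections import defaultdict

def find_duplicate_names(paths):
    """Group paths by file name; return names that occur in more than one folder."""
    seen = defaultdict(list)
    for p in paths:
        folder, _, name = p.rpartition("/")
        seen[name].append(folder)
    return {name: folders for name, folders in seen.items() if len(folders) > 1}

# Illustrative paths: an old re-run alongside the authoritative Final/ copy.
paths = [
    "studies/octo/physical-cognitive/b1_pef_grip.out",
    "studies/octo/physical-cognitive/Final/b1_pef_grip.out",
    "studies/octo/physical-cognitive/Final/b1_pef_block.out",
]
print(find_duplicate_names(paths))
```

Running such a check before report assembly would surface leftover re-run copies without having to delete anything.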
@andkov : is there a way you can alert me (and @eduggan and @andreazammit , though I am happy to review first in case I find new challenges) when the issues with the various models (and whether they are on GitHub) have been resolved?
(I'm thinking EAS [expecting new ones today], OCTO, LASA [have you and @annierobi wrapped this up?], MAP [have you and @casslbrown talked and are the final models represented in the seed report?])
We would all like to wrap this up ASAP, so I'm trying to prioritise reviewing the output. It would be great to know as soon as the new seed reports are available as they are updated.
Many thanks!
just a comment: Web view in Word is much better than html for studies with many variables.
I agree. I was pleasantly surprised by this feature myself. The only downside is that it's limited to 22 inches, which I hope will never be relevant.
@andkov - Please drop fev100 results from the seed tables. Now that I see them all together, I see that they are either completely redundant to the original fev output or provide contradictory results within the fev100 output (i.e., n.s. covSS, but sig corrSS). Had the covSS and corrSS corresponded, I would have believed that the problem with the original fev output was the metric, and we would have used the fev100 output, but since the fev100 output itself is problematic, it does not solve anything.
@ampiccinin , noted. will do.
@andkov - Please drop the computed correlations and CIs columns from the final seed reports since they were included when we were trying to check for consistencies. Similarly, the covariance columns can now be dropped (IMHO) since, except for three OCTO variables (dsf, mir, block) the cov and corr conclusions are identical. In two of these, the p-value for the cov is 0.06, and for the other it is 0.20. I’m not sure what is going on, but I think we can live with this. The table will therefore be 6 columns narrower.
Can we just add the finalized seed reports, labeled as publication reports (or something), so we can keep the current reports as documentation that we did these checks?
Many thanks!!
@andkov - also, please drop HRS TICS in the final version. Chenkai said there was an administration issue with it so did not run the updated analysis for it. (or we can do this by hand...)
@ampiccinin , @eduggan
I will
1) drop the computed correlations, covariances, and CIs columns, but keep a full copy of those reports in case we need to go back (I will explain where each version is stored). I agree with this decision: instead of exploring now, we need to emphasize what information is important by channeling the reader's attention.
2) drop fev100 from the tables, but again keep a copy for the records and indicate where it is stored.
3) drop the HRS `tics` measure from the seed and correlation tables.
4) be happy to correct the inconsistencies in the domain names. Please let me know how I need to adjust.
Please let me know if you see any other adjustments I can make to improve the manuscript!
@andkov - my apologies - when I was writing about dropping columns from seed reports, what I meant was drop columns from the correlation reports. I think you knew what I meant, but just to be sure I am correcting myself here. Perhaps I should go back and edit my earlier comment as well to eliminate any possible confusion in the future?
@andkov - also -
1) final row of the pulmonary correlation table is labeled "NA" as the domain and variable for OCTO (for men and women) - (yet all the OCTO variables seem already to be in the table?)
2) perhaps your intention was equal opportunity, but it probably makes sense to have one of women or men first in both the seed and correlation reports (currently women are first in seed, men in correlation). Not a big deal at all. Just a comment about symmetry and expectations.
@ampiccinin Thanks for clarifying. I had understood that the changes should be applied to both the seed and correlation reports. My current understanding is that the seed reports will not be added to the manuscript, but will be available as online appendices. So perhaps they should contain all indices for completeness? I can create two versions; this is not difficult technically.
@andkov - yes - Seed reports go online. For now, I think they are OK as is.
@ampiccinin , re:
> final row of the pulmonary correlation table is labeled "NA" as the domain and variable for OCTO (for men and women) - (yet all the OCTO variables seem already to be in the table?)

This is due to the lack of proper labels for the measure `figure`, which really should have been called `figurelogic`, as in the previous (manual) model.

NOTE to SELF: There are four different measures involving `figure` in some way, so the label `figure` is ambiguous; avoid it in the future. Address the issue in `./manipulation/estimation/0-prepare-data.R`.
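One way to handle this in the data-prep step (the real fix belongs in `./manipulation/estimation/0-prepare-data.R`; this Python sketch and its `RENAME` map are only illustrative) is a single explicit label map applied once, before any reports are built:

```python
# Illustrative label map: ambiguous raw names are renamed once, up front.
RENAME = {"figure": "figurelogic"}  # matches the previous (manual) model's label

def relabel(measure_names):
    """Replace ambiguous raw measure names; pass all others through unchanged."""
    return [RENAME.get(m, m) for m in measure_names]

print(relabel(["pef", "figure", "block"]))  # → ['pef', 'figurelogic', 'block']
```

Centralizing the renames in one map means every downstream report sees the same unambiguous labels, so "NA" rows from unmatched labels cannot reappear.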
@ampiccinin , the seed and correlation reports have been updated. Here's what's new:

- The authoritative source of links to the reports is now `./projects/pulmonary-cognitive`; this issue was just a temporary placeholder for such a page. Similar pages will be added to all tracks (I just wanted to make sure pulmonary can proceed).
- `tics` in HRS has been removed from the seed and correlation reports.
- `figure` is corrected in the reports.

Sadly, I overlooked your point about the order of males and females. This will get corrected the next time I update, which should be soon (considering that automatic runs for LASA are in the works). Besides LASA, however, this is ready for inspection.

Anything else I'm forgetting to implement?
@andkov - For some reason GitHub is telling me that the correlation table is not found?
@ampiccinin ,
yes, the serving of the reports has been moved to a more permanent location: `./projects/pulmonary-cognitive`. This issue is just a placeholder. I've added a disclaimer to the top.
@ampiccinin , @smhofer , @wibeasley , @GracielaMuniz , @eduggan @seanclouston
The authoritative source of links to the most current reports has moved to `./projects/pulmonary-cognitive`.