andkov opened 8 years ago
Hello all,
Thank you for all of your incredible work in putting this unprecedented project together! Now that we have the final tabulated results for the pulmonary data completed, we are working to complete a single, strong paper integrating the systematic review and meta-analysis, co-authored with each of the study investigators that contributed results.
For each study listed below (except ILSE and NUAGE which do not have pulmonary data), please respond by November 1 with the following information.
Thank you very much and I am looking forward to hearing from you all soon!
Emily
Thanks @andkov ! Can you please switch out EAS-Word link? It currently points to the phys-phys report.
@andkov - I wonder whether some (or all!) of the jumping around of "Process A" estimates in the different "Process B" pairings is due to different N. How difficult is it to add the N for each model at, e.g., the top of the respective columns?
(I suppose that in some situations with equal N, the N is actually made up of different subsets of individuals, so the best way to do this in the future is to have analysts ensure that they are using the same subset of individuals in each analysis.)
Hi Emily,
Thanks for putting it together. It is a tremendous amount of work!
I have conducted the analyses using the Health and Retirement Study (HRS). Please find my information and affiliation below:

Name: Chenkai Wu, MPH, MS
Affiliation: School of Biological and Population Health Sciences, Oregon State University, Corvallis, OR, USA
Email: wuche@oregonstate.edu
One-paragraph description of the HRS:
The HRS included 3 waves of pulmonary function (peak expiratory flow) and measures of cognitive function (immediate recall, delayed recall, and serial 7's test) collected every 4 years since 2006 among non-institutionalized US adults aged 50-104 years. The HRS is a co-operative agreement between the National Institute on Aging and the University of Michigan and aims to describe changes in life patterns through the retirement transition among US adults by collecting information about their health conditions, family network, social relations, financial situation and employment status (Juster & Suzman, 1995). Ethical approval was obtained from the University of Michigan Institutional Review Board. Further details about the recruitment strategies, design and sampling approaches of the HRS have been documented elsewhere (Heeringa & Connor, 1995; Crimmins et al., 2008).
References: Juster FT, Suzman R. An overview of the Health and Retirement Study. J Human Resources 1995:S7–S56.
Heeringa SG, Connor JH. Technical description of the health and retirement survey sample design. Ann Arbor: University of Michigan, 1995.
Crimmins E, Guyer H, Langa K, et al. HRS Documentation Report. Ann Arbor, MI: Institute for Social Research, University of Michigan; 2008. Documentation of physical measures, anthropometrics and blood pressure in the health and retirement study. Report No.: DR-011.
Please include the following two funding acknowledgements for the HRS (required for publications using data from the HRS):
I have no conflicts of interest to declare.
Thanks again for your great work. I look forward to reviewing the manuscript when it is ready.
Please let me know if you have any questions or need any help.
Best,
Chenkai
> Can you please switch out EAS-Word link? It currently points to the phys-phys report.

I don't remember doing it, but it seems that it now points to the correct report. @ampiccinin , please let me know if you still encounter this issue. I think I might have corrected it and forgotten about it.
@ampiccinin
> I wonder whether some (or all!) of the jumping around of "Process A" estimates in the different "Process B" pairings is due to different N. How difficult is it to add the N for each model at, e.g., the top of the respective columns?

The N is already there, on the 6th row from the bottom. Or did you mean something else? Changing the location is certainly possible, but a bit messy.
> (I suppose that in some situations with equal N, the N is actually made up of different subsets of individuals, so the best way to do this in the future is to have analysts ensure that they are using the same subset of individuals in each analysis.)

Yes, this should certainly be remembered. However, I'm not sure about ensuring the same subset. Wouldn't this further reduce the sample size? It would mean that only those individuals who have data on ALL measures would be included in the analysis. So if a person has data for, say, `grip`, `block`, `digits_f`, `digits_b`, and `raven`, but **not** for `fluency`, the case would be discarded and not used even for the measures for which the data exist. The more measures in the set, the more likely that some of the data would be missing, meaning that studies that bring in many measures would be "penalized" by a reduced sample size. I'm troubled by this and can't think of a solution.
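To illustrate the concern with simulated data (the measure names mirror those mentioned above, but the 20% missingness rate and all values are made up), a complete-case rule shrinks the analyzable N with each measure added:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 1000
measures = ["grip", "block", "digits_f", "digits_b", "raven", "fluency"]

# Simulated scores with ~20% missingness, independently in each measure.
df = pd.DataFrame(
    {m: np.where(rng.random(n) < 0.2, np.nan, rng.random(n)) for m in measures}
)

# Complete-case N after requiring each additional measure:
for k in range(1, len(measures) + 1):
    print(f"{k} measure(s): N = {len(df[measures[:k]].dropna())}")
```

With independent 20% missingness per measure, the complete-case N falls roughly as 0.8 to the power of the number of measures, which is exactly why studies contributing many measures would be "penalized" most.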
@andkov
oops - sorry I had somehow missed seeing the N. Please disregard my comment.
I agree that we do not want to reduce the sample size unnecessarily, and we would then risk presenting analyses based on a more select sample. However, it can become very difficult to interpret output that we tend to assume refers to the same people when in fact it does not. I was referring specifically to the apparent inconsistency between estimates of the same submodel (e.g., pulmonary function trajectories) paired with alternate cognitive submodels. Whereas I would have assumed that the pulmonary submodel should produce the same estimates regardless of the paired cognitive outcome, we sometimes find different estimates, so I wonder to what extent this is driven by differences in sample/size across the various bivariate pairings. There is some benefit to avoiding having to consider multiple estimations of the same submodel (the whole "process A" part of your tables). There is basically no universal solution, but it is worth noting whether an N difference could explain the unexpected (by me) variation among estimates. (Note that the MAP output, which has identical N across models, seems very consistent across the different pairings (sd=0).)
@ampiccinin
> There is basically no universal solution, but it is worth noting whether an N difference could explain the unexpected (by me) variation among estimates.

Good point. I agree: if we observe jumping coefficients, restricting the data to the same subset will show whether the difference is due to sample size. The Generation 2 scripts that we developed with @wibeasley will make this much easier in the future.
> I was referring specifically to the apparent inconsistency between estimates of the same submodel (e.g., pulmonary function trajectories)

Yes, I saw some inconsistencies too, but nothing struck me as too drastic. Besides, I just glanced at all the tables (for females) and noticed that only OCTO and SATSA have minor discrepancies in N. The rest of the studies have the same N across `aehplus` models, while minor fluctuations in the estimates of the pulmonary submodel are still observed. So it seems to me that these fluctuations cannot be attributed to different subsets (unless the same N is coincidental and the persons are in fact different, which doesn't seem likely given such consistency across models and studies).
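Since an equal N can coincidentally mask different subsets, one cheap diagnostic is to compare the actual person IDs used by each bivariate model rather than just the counts. A minimal sketch with hypothetical model names and ID sets (in practice the IDs would be read from each model's analysis file):

```python
# Hypothetical ID sets per bivariate model.
model_ids = {
    "pef_grip":  {101, 102, 103, 104, 105},
    "pef_block": {101, 102, 103, 104, 106},  # same N, different people
    "pef_raven": {101, 102, 103, 104, 105},
}

reference = model_ids["pef_grip"]
for name, ids in model_ids.items():
    print(f"{name}: N = {len(ids)}, "
          f"same N = {len(ids) == len(reference)}, "
          f"same people = {ids == reference}")
```

The `same people` flag catches exactly the case discussed above: `pef_block` matches the reference N but uses a different individual.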
Hi Emily,
See attached file.
Best,
Marcus
@andreazammit @ampiccinin
The new EAS files for PEF have been processed and incorporated into the seed report and the correlation table. Let me know if you spot any funny things in the output.
@andkov : odd thing - SATSA male grip-Information in the seed report - somehow the female output is repeated here, rather than the male output
@ampiccinin, thanks for pointing this out, I will take a look.
@andkov - quick question: I see that Lewina uploaded NAS yesterday. When I download seed-NAS, however, the date inside is 11-4. I think it is because the new files are in NAS, not NAS\physical-cognitive. Is this correct, and can I move them there without breaking anything? The filenames are different (my fault) - I hope they are correct according to the newer naming convention? Thanks! A
@andkov - ditto for new EAS and MAP. Please let me know when the seed reports point to the new files so I can review them. Thanks! A
@ampiccinin, I haven't had a chance to process them yet. Don't worry about moving files; I will take care of them tomorrow and will notify you about the updated seed reports. Should be no trouble. Cheers!
@andkov : NAS smoking variable (smkevr) is not showing up in the seed report. It appears in the GitHub output files. Please update the seed report to include it. Thanks!
@andkov : I notice that information, synonyms and psif are in GitHub, but not in the seed report. Can you please correct this? Are there any others we have missed? Thanks!
@annierobi : can you please identify why the slope variance and especially the SE of the slope variance for OCTO PEF are so much lower in the digit_f model than in any of the others? The sample sizes for these models do not differ by more than 5 people, yet the estimates are quite different. thanks!
Var (Slope) estimate (SE), p, across the five models: 37.14 (18.63), .05; 38.96 (23.14), .09; 27.92 (11.78), .02; 35.05 (21.25), .10; 39.41 (16.27), .01
I just went over the output files to compare and make sure i hadn't missed anything. I see no mistakes in the input file. It is identical to all the other files and uses the same data set. This was one of the problematic models where the correlations and the covariances didn't agree. That is, the slope-slope corr is highly significant but not the covariance. Let me think about it some more.
Thanks
Annie
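One general point that may be relevant here (a standard property of Wald tests, not a claim about these specific models): the slope-slope correlation is the covariance rescaled by the slope SDs, and the two quantities carry different (typically delta-method) standard errors, so their z-tests can disagree, particularly when a variance is estimated near zero relative to its SE, as in the digit_f column.

```latex
% Slope--slope correlation in terms of the covariance:
r_{ss} = \frac{\operatorname{cov}(s_1, s_2)}{\sqrt{\operatorname{var}(s_1)\,\operatorname{var}(s_2)}}
% The Wald statistics z = \hat{\theta} / SE(\hat{\theta}) for cov(s_1, s_2)
% and for r_{ss} use different standard errors, so one can cross the
% significance threshold while the other does not.
```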
Thanks, @ampiccinin. Noted.
@ampiccinin
> I notice that information, synonyms and psif are in GitHub, but not in the seed report. Can you please correct this?

(I presume you mean the OCTO study.) Yes, thanks for catching this. @annierobi and I were sorting the outputs in order to create an authoritative source of valid models, that is, the models that should be included in the analysis. Over time, some of the models had to be re-run, which created old and new versions of the same models. The scripts choke when they encounter models with identical names, so we kept the valid models in the folder `./studies/octo/physical-cognitive/Final/` and, not surprisingly, missed a few as we were assembling the "final" list of models. My fear of deleting any models is to blame. Let me think about how I can sort it out.
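To keep the scripts from choking on duplicate model names, a small pre-flight check could flag any output name that appears in more than one folder before processing begins. A sketch in Python (the project's scripts are in R; the paths below are illustrative, not the repo's actual file list):

```python
from collections import defaultdict

def find_duplicate_names(paths):
    """Group paths by file name; return names that occur in more than one folder."""
    seen = defaultdict(list)
    for p in paths:
        folder, _, name = p.rpartition("/")
        seen[name].append(folder)
    return {name: folders for name, folders in seen.items() if len(folders) > 1}

# Illustrative paths: an old re-run alongside the authoritative Final/ copy.
paths = [
    "studies/octo/physical-cognitive/b1_pef_grip.out",
    "studies/octo/physical-cognitive/Final/b1_pef_grip.out",
    "studies/octo/physical-cognitive/Final/b1_pef_block.out",
]
print(find_duplicate_names(paths))
```

Running such a check before report assembly would surface leftover re-run copies without having to delete anything.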
@andkov : is there a way you can alert me (and @eduggan and @andreazammit , though I am happy to review first in case I find new challenges) when the issues with the various models (and whether they are on GitHub) have been resolved?
(I'm thinking EAS [expecting new ones today], OCTO, LASA [have you and @annierobi wrapped this up?], MAP [have you and @casslbrown talked and are the final models represented in the seed report?])
We would all like to wrap this up ASAP, so I'm trying to prioritise reviewing the output. It would be great to know as soon as the new seed reports are available as they are updated.
Many thanks!
just a comment: Web view in Word is much better than html for studies with many variables.
I agree. I was pleasantly surprised by this feature myself. The only downside is that it's limited to 22 inches, which I hope will never be relevant.
@andkov - Please drop fev100 results from the seed tables. Now that I see them all together, I see that they are either completely redundant to the original fev output or provide contradictory results within the fev100 output (i.e., n.s. covSS, but sig corrSS). Had the covSS and corrSS corresponded, I would have believed that the problem with the original fev output was the metric, and we would have used the fev100 output, but since the fev100 output itself is problematic, it does not solve anything.
@ampiccinin , noted. will do.
@andkov - Please drop the computed correlations and CIs columns from the final seed reports since they were included when we were trying to check for consistencies. Similarly, the covariance columns can now be dropped (IMHO) since, except for three OCTO variables (dsf, mir, block) the cov and corr conclusions are identical. In two of these, the p-value for the cov is 0.06, and for the other it is 0.20. I’m not sure what is going on, but I think we can live with this. The table will therefore be 6 columns narrower.
Can we just add the finalized seed reports, labeled as publication reports (or something), so we can keep the current reports as documentation that we did these checks?
Many thanks!!
@andkov - also, please drop HRS TICS in the final version. Chenkai said there was an administration issue with it so did not run the updated analysis for it. (or we can do this by hand...)
@ampiccinin , @eduggan
I will
1) drop the computed correlations, covariances, and CIs columns, but keep a full copy of those reports in case we need to go back (I will explain where each version is stored). I agree with this decision: instead of exploring now, we need to emphasize what information is important by channeling the reader's attention.
2) drop fev100 from the tables, but again keep a copy for the records and indicate where it is stored.
3) drop the HRS `tics` measure from the seed and correlation tables.
4) be happy to correct the inconsistencies in the domain names. Please let me know how I need to adjust.
Please let me know if you see any other adjustments I can make to improve the manuscript!
@andkov - my apologies - when I was writing about dropping columns from seed reports, what I meant was drop columns from the correlation reports. I think you knew what I meant, but just to be sure I am correcting myself here. Perhaps I should go back and edit my earlier comment as well to eliminate any possible confusion in the future?
@andkov - also -
1) final row of the pulmonary correlation table is labeled "NA" as the domain and variable for OCTO (for men and women) - (yet all the OCTO variables seem already to be in the table?)
2) perhaps your intention was equal opportunity, but it probably makes sense to have one of women or men first in both the seed and correlation reports (currently women are first in seed, men in correlation). Not a big deal at all. Just a comment about symmetry and expectations.
@ampiccinin Thanks for clarifying. I had understood that the changes should be applied to both the seed and correlation reports. My current understanding is that the seed reports will not be added to the manuscript, but will be available as online appendices. So perhaps they should contain all indices for completeness? I can create two versions; this is not difficult technically.
@andkov - yes - Seed reports go online. For now, I think they are OK as is.
@ampiccinin , re:
> final row of the pulmonary correlation table is labeled "NA" as the domain and variable for OCTO (for men and women) - (yet all the OCTO variables seem already to be in the table?)

This is due to the lack of proper labels for the measure `figure`, which really should have been called `figurelogic`, as in the previous (manual) model.

NOTE to SELF: There are four different measures involving `figure` in some way, so the label `figure` is ambiguous; avoid it in the future. Address the issue in `./manipulation/estimation/0-prepare-data.R`.
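One way to handle this in the data-prep step (the real fix belongs in `./manipulation/estimation/0-prepare-data.R`; this Python sketch and its `RENAME` map are only illustrative) is a single explicit label map applied once, before any reports are built:

```python
# Illustrative label map: ambiguous raw names are renamed once, up front.
RENAME = {"figure": "figurelogic"}  # matches the previous (manual) model's label

def relabel(measure_names):
    """Replace ambiguous raw measure names; pass all others through unchanged."""
    return [RENAME.get(m, m) for m in measure_names]

print(relabel(["pef", "figure", "block"]))  # → ['pef', 'figurelogic', 'block']
```

Centralizing the renames in one map means every downstream report sees the same unambiguous labels, so "NA" rows from unmatched labels cannot reappear.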
@ampiccinin , the seed and correlation reports have been updated. Here's what's new:

- The authoritative source of links to the reports is now `./projects/pulmonary-cognitive`; this issue was just a temporary placeholder for such a page. Similar pages will be added to all tracks (I just wanted to make sure pulmonary can proceed).
- `tics` in HRS has been removed from the seed and correlation reports.
- `figure` is corrected in the reports.

Sadly, I overlooked your point about the order of males and females. This will get corrected the next time I update, which should be soon (considering that automatic runs for LASA are in the works). Besides LASA, however, this is ready for inspection.

Anything else I'm forgetting to implement?
@andkov - For some reason GitHub is telling me that the correlation table is not found?
@ampiccinin ,
yes, the serving of the reports has been moved to a more permanent location: `./projects/pulmonary-cognitive`. This issue is just a placeholder. I've added a disclaimer to the top.
@ampiccinin , @smhofer , @wibeasley , @GracielaMuniz , @eduggan @seanclouston
The authoritative source of links to the most current reports has moved to `./projects/pulmonary-cognitive`.