Closed maia-sh closed 3 years ago
How do we want to handle papers, e.g. https://github.com/TomHardwicke/Journal-Statistical-Guidance-2019/blob/clean-data/data/processed/d_coding.csv#L60
"EQUATOR, ARRIVE, REMARK, STARD, MOOSE, PRISMA, STROBE, STREGA, BRISQ, Tumor marker studies (Simon et al., 2009), Rodent model studies (Hollingshead, 2008), Microarray-based studies for clinical outcomes, Table 3 in Dupuy & Simon, 2007)"
Is there a reason to handle them differently to reporting guidelines? Probably could be shortened to Author + date. Though other info (e.g., table within paper) may need to be retained too e.g., "Dupuy & Simon (2007; Table 3)".
In general perhaps it would be useful to have a lookup table somewhere that has one column for the external guideline identifier and another column for a direct link (e.g., doi) for that external guideline? And potentially a third 'notes' column for any other useful info, like the study design type the guidelines relate to.
Agreed that it makes sense to keep them in. Also, we're going to need to code the external guidelines for each practice (unless you already did that?) so we could just combine the lookup table details in there
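The lookup table proposed above could be sketched roughly as follows. This is only an illustration: the column names (`guideline_id`, `link`, `notes`) and the "frontiers policies and publication ethics" label are assumptions, the links are ones quoted elsewhere in this thread, and the notes are left empty rather than guessed.

```python
import csv
import io

# Hypothetical column names; the thread only asks for an identifier,
# a direct link, and an optional notes column.
FIELDS = ["guideline_id", "link", "notes"]

# Links are taken from this thread. The third identifier is a made-up label.
ROWS = [
    {
        "guideline_id": "ispor-smdm",
        "link": "https://pubmed.ncbi.nlm.nih.gov/22990088/",
        "notes": "",
    },
    {
        "guideline_id": "nih principles and guidelines for reporting preclinical research",
        "link": "https://www.nih.gov/research-training/rigor-reproducibility/principles-guidelines-reporting-preclinical-research",
        "notes": "",
    },
    {
        "guideline_id": "frontiers policies and publication ethics",
        "link": "https://www.frontiersin.org/about/policies-and-publication-ethics",
        "notes": "",
    },
]


def build_lookup(rows):
    """Index rows by a normalised identifier and reject duplicate identifiers."""
    lookup = {}
    for row in rows:
        key = row["guideline_id"].strip().lower()
        if key in lookup:
            raise ValueError(f"duplicate guideline identifier: {key}")
        lookup[key] = row
    return lookup


def to_csv(rows):
    """Serialise the table to CSV text so it can be saved alongside the data."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


LOOKUP = build_lookup(ROWS)
```

Keeping one row per identifier means the same table could later be joined onto the per-practice coding of the external guidelines, if we go that route.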
Some remaining open questions. As I was coalescing external guidances, I looked back at instructions to authors when I could easily figure it out, but here are a few I wasn't sure about and left as is.
comments inline below:
- Psychological Methods: "apa manual" - same as apa jars or does this mean the style guide? Currently left separate.
Denes was first coder on this and he noted that there is statistical guidance in the APA manual, I'm not sure if he means JARs or something separate. So I'd code as APA manual for now and we will check it out.
- Value in Health: "ispor-smdm" - same as "ispor" or an extension (i.e., code separately)? Currently left separate.
The specific reference is to this: https://pubmed.ncbi.nlm.nih.gov/22990088/
- EMBO Journal: STAMPL vs. SAMPL? Currently left separate.
It's a typo on the journal's website that carried over into our coding - I'm pretty sure they mean SAMPL, so I've updated our data in raw and primary
- Developmental Cell "nih" one of "nih principles and guidelines for reporting preclinical research" or "nlm research reporting guidelines and initiatives (https://www.nlm.nih.gov/services/research_report_guide.html)"? Currently left separate.
The specific link is to https://www.nih.gov/research-training/rigor-reproducibility/principles-guidelines-reporting-preclinical-research
- Journal of the Academy of Nutrition and Dietetics: "some papers related to statistics are mentioned". Currently recoded to NA, but probably need to go back to instructions for authors to add the paper references.
I've added the papers and updated our data in raw and primary
- Frontiers in Microbiology "https://www.frontiersin.org/about/author-guidelines" and "https://www.frontiersin.org/about/policies-and-publication-ethics". Publisher-level guidance but only mentioned for one Frontiers journal in our sample, i.e., not listed for FRONTIERS IN ECOLOGY AND THE ENVIRONMENT. Currently left as is.
Good catch.
Are there actually any statistical guidelines at https://www.frontiersin.org/about/author-guidelines ? I don't see any ... I see statistical guidance at https://www.frontiersin.org/about/policies-and-publication-ethics
So for both journals I have changed external guidelines to https://www.frontiersin.org/about/policies-and-publication-ethics and updated our data in raw and primary.
Note that for Frontiers in Ecology and the Environment, this means the journal has gone from being coded as no statistical guidance to having statistical guidance. I've put you as first coder and me as second coder, so please check you are ok with this classification.
Made the change to the nih preclinical guidelines in cleaning
for some reason, i'm still getting "some papers related to statistics are mentioned" but i understand you removed from raw and replaced with the paper references. can you check your end too?
yes i agree. i didn't check the permacc page for frontiers in e and e so trusting you that it mentions https://www.frontiersin.org/about/policies-and-publication-ethics as well. but assuming it does, agree it should be recoded as external guidance true
Oops, my bad. The Journal of the Academy of Nutrition and Dietetics entry should be fixed now; I've updated the data.
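The recoding decisions above (STAMPL is a typo for SAMPL; Developmental Cell's "nih" means the NIH preclinical principles and guidelines) could be captured in a small cleaning step. This is a minimal sketch, not the cleaning script actually used: the `RECODES` map and the function names are hypothetical, and labels that were left separate in the thread ("apa manual", "ispor-smdm") are deliberately passed through unchanged.

```python
# Hypothetical recode map: raw coded label -> agreed label.
# Only decisions stated in this thread are included.
RECODES = {
    "stampl": "sampl",  # typo on the journal's website
    "nih": "nih principles and guidelines for reporting preclinical research",
}


def normalise(raw):
    """Lower-case a label and collapse surrounding/internal whitespace."""
    return " ".join(raw.split()).lower()


def recode(raw_labels):
    """Apply the recode map and drop duplicates while preserving order."""
    cleaned = []
    for raw in raw_labels:
        label = normalise(raw)
        label = RECODES.get(label, label)  # unknown labels pass through as-is
        if label not in cleaned:
            cleaned.append(label)
    return cleaned
```

For example, `recode(["STAMPL", " nih ", "apa manual", "SAMPL"])` collapses the two SAMPL spellings into one entry and leaves "apa manual" untouched.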
Hi @maia-sh
I've reviewed the coding of all of the Nature-affiliated journals, and they all refer to the same four sets of guidelines - Nature Life Sciences Reporting, Nature Photovoltaic Reporting, Nature Lasing Reporting, and Nature Editorial Checklist. Denes and I coded NLSR and I've just done the others here. I think they contain minimal statistical guidance, but would you be able to second code? It should be quick. The link to the documents is in column C (note you may need to open with Adobe Acrobat on your computer to view).
I've also created a row for 'Nature consolidated' which captures guidance from all four of the above. For each Nature journal we can then just apply the guidance in this row. Perhaps you could double check that row for accuracy too?
Finally there's an entry for Frontiers if you wouldn't mind second coding that - see section 2.5 here
Cheers! Tom
Hi Tom, just a quick note to say I saw this message. It's a bit hectic before my co-author leaves on holiday on Friday, so I may only get to this then. Does that still work?
yeah no worries! I just pushed a major update on the preliminary analyses to the repo. Still some work to do, but I'd be interested to hear your thoughts on some of the graphs when you get a chance to look
Hi @TomHardwicke, I've double coded the 4 guidances. No substantial differences. I had one note and found some additional guidance references (though not necessarily statistical, but in line with the broader scope we've been coding). Those cells are highlighted in blue. I'll also take a look at the graphs.
Great thanks! Regarding these additional guidelines you recorded:
COPE, ICMJE, ARRIVE, International Association of Veterinary Editors guidelines, Declaration of Helsinki, Transparency and Openness (TOP) guidelines
I'd suggest that only ARRIVE actually contains any statistical guidance directed at authors; the rest are editor/journal guidelines - what do you think?
fully agree (though i haven't formally coded the other guidelines). I just believe we did capture ICMJE and COPE elsewhere, so I coded for internal consistency
Ah I see. Thanks for flagging that. I think ICMJE does have author-facing guidelines (http://www.icmje.org/icmje-recommendations.pdf) but I don't think that's what the Frontiers docs are referring to. As for COPE, I'm not aware of any author-facing guidelines, so I think it needs to be removed from the previously agreed-upon coding. I think COPE has only been recorded for journals that you and I coded - are you OK with the decision to remove?
Agreed, ICMJE is referred to for authorship, not statistics, so ok to remove. Also COPE if removing elsewhere
Tom,
I took a look through the preliminary analysis and here are some thoughts. Let me know if anything is unclear/ you want to discuss further.
Overall, I think it captures the points we previously discussed and what came to my attention in data prep as well. I especially like the dot plot with text labels for comparing discipline rates! It's easily interpretable as well as pretty. I have much to learn on this front. I know I would have made some very boring bar plots.
Looking at the rmarkdown, I saw you had trouble with using summary stats from the gtsummary table. This was my bad - I actually had this issue before and posted an issue and the devs kindly fixed it...but you need the dev version since it's not on CRAN yet. No need to make any changes but just so you have it, I made a demo in a separate branch: https://github.com/TomHardwicke/Journal-Statistical-Guidance-2019/blob/gtsummary-demo/analysis/preliminaryAnalysis.Rmd#L523
The key line for pulling out the overall stat for a variable is: inline_text(tbl_counts_props, variable = has_guidance, column = stat_0, pattern = "{p}")
The issue is that stat_0 is not accessible in the CRAN version.
In the mosaic plot (epically long and for an eventual appendix), it would be nice to have discipline for each journal. Simplest solution coming to mind is to add a dot by the same side as the journal name or on axis.text.y.right. Alternatively could be grouped by discipline, but I like the current ranking by frequency better.
In the rmarkdown, I see that Scientific Data is an edge case with both journal and publisher guidance but didn't notice this in the write up. How is this handled? Does journal supersede publisher? Does it get a yes if in either journal or publisher? In any case, would be good to specify in text.
While this is beyond scope of our planned analysis, it's interesting that 9 journals have stats guidance but not on our topics. As a reader, I was wondering what topics they did address, so if this had a place for even a casual mention in the discussion (i.e., not a formal analysis), that would be interesting.
Regarding the topics to bring to the group, I agree with the two you have listed. I also am not fully convinced about collapsing journal and publisher-level guidance and would be curious to discuss that too.
I also wanted to share some more thoughts/background on the "prespecification of analyses"/clinical trial registration topic which we'll discuss more broadly. I think one thing we need to answer is whether "prespecification of outcomes" suffices for "prespecification of analyses". If so, I would definitely include ct registrations. However, I think specifying outcomes is insufficient and trial registration does not require a protocol or SAP. I'm adding some citations below on the topic.
WHO Trial Registration Data Set. https://www.who.int/clinical-trials-registry-platform/network/who-data-set
Some elements have statistical relevance, such as sample size and primary and key secondary outcomes. Outcomes should include: name, measurement metric or method, and timepoint. However, no "prespecification of analyses."
Munafò, M. R., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Sert, N. P. du, Simonsohn, U., Wagenmakers, E.-J., Ware, J. J., & Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1(1), 1–9. https://doi.org/10.1038/s41562-016-0021
See pg. 3 "Promoting study pre-registration"
Gamble, C., Krishan, A., Stocken, D., Lewis, S., Juszczak, E., Doré, C., Williamson, P. R., Altman, D. G., Montgomery, A., Lim, P., Berlin, J., Senn, S., Day, S., Barbachano, Y., & Loder, E. (2017). Guidelines for the Content of Statistical Analysis Plans in Clinical Trials. JAMA, 318(23), 2337–2343. https://doi.org/10.1001/jama.2017.18556
Goldacre, B., Drysdale, H., Dale, A., Milosevic, I., Slade, E., Hartley, P., Marston, C., Powell-Smith, A., Heneghan, C., & Mahtani, K. R. (2019). COMPare: A prospective cohort study correcting and monitoring 58 misreported trials in real time. Trials, 20(1), 118. https://doi.org/10.1186/s13063-019-3173-2
Wager, E., & Williams, P. (2013). “Hardly worth the effort”? Medical journals’ policies and their editors’ and publishers’ views on trial registration and publication bias: quantitative and qualitative study. BMJ, 347. https://doi.org/10.1136/bmj.f5248
Was just chatting with a collaborator (Nick Devito) on a clinical trials project I'm working on and brought this up. He shared a few more papers that are related to this tension (below).
Also, clarifying my argument, the issue I see is that a clinical trial registration is not as comprehensive as a registration with a prespecification of analysis (no SAP), but it definitely goes in the direction and limits statistical foul play (prespecifying sample size, outcome, study design). I'd be very comfortable coding both as "preregistrations" and then dividing into subgroups so it's clear that there is a difference but that we still acknowledge that journals asking for trial registration are going in the right direction. Or some sort of workaround that allows us to capture but differentiate both types.
Goldacre, B., Drysdale, H., Marston, C., Mahtani, K. R., Dale, A., Milosevic, I., Slade, E., Hartley, P., & Heneghan, C. (2019). COMPare: Qualitative analysis of researchers’ responses to critical correspondence on a cohort of 58 misreported trials. Trials, 20(1), 124. https://doi.org/10.1186/s13063-019-3172-3
And a few reviews, again mostly focused on outcome switching since outcome is the crucial prespecification in trial registries.
Jones, C. W., Keil, L. G., Holland, W. C., Caughey, M. C., & Platts-Mills, T. F. (2015). Comparison of registered and published outcomes in randomized controlled trials: A systematic review. BMC Medicine, 13(1), 282. https://doi.org/10.1186/s12916-015-0520-3
On outcome switching. Of relevance is the "Additional analyses" section, which notes the ambiguity of outcome specification in registrations.
Li, G., Abbade, L. P. F., Nwosu, I., Jin, Y., Leenus, A., Maaz, M., Wang, M., Bhatt, M., Zielinski, L., Sanger, N., Bantoto, B., Luo, C., Shams, I., Shahid, H., Chang, Y., Sun, G., Mbuagbaw, L., Samaan, Z., Levine, M. A. H., … Thabane, L. (2018). A systematic review of comparisons between protocols or registrations and full reports in primary biomedical research. BMC Medical Research Methodology, 18(1), 9. https://doi.org/10.1186/s12874-017-0465-7
TARG Meta-Research Group, Thibault, R. T., Clark, R., Pedder, H., Akker, O. van den, Westwood, S., Thompson, J., & Munafo, M. (2021). Estimating the prevalence of discrepancies between study registrations and publications: A systematic review and meta-analyses (p. 2021.07.07.21259868) [Preprint]. https://doi.org/10.1101/2021.07.07.21259868
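The two-tier coding suggested above (count both trial registration and analysis prespecification as "preregistration", then split into subgroups) could look something like the sketch below. Everything here is an assumption for illustration: the class, the field names, the subgroup labels, and the placeholder journals are hypothetical and are not taken from the project's data or coding scheme.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PreregCoding:
    """Hypothetical per-journal coding of preregistration guidance."""

    journal: str
    trial_registration: bool  # journal asks for clinical trial registration
    analysis_plan: bool       # journal asks for prespecified analyses (protocol or SAP)

    @property
    def is_preregistration(self):
        # Top-level code: either type counts as preregistration.
        return self.trial_registration or self.analysis_plan

    @property
    def subgroup(self):
        # Subgroup keeps the two types distinguishable in the analysis.
        if self.analysis_plan:
            return "analysis prespecification"
        if self.trial_registration:
            return "trial registration only"
        return "none"


def tally(codings):
    """Count journals per subgroup."""
    return Counter(coding.subgroup for coding in codings)


# Placeholder journals, not real data.
EXAMPLE = [
    PreregCoding("Journal A", trial_registration=True, analysis_plan=False),
    PreregCoding("Journal B", trial_registration=True, analysis_plan=True),
    PreregCoding("Journal C", trial_registration=False, analysis_plan=False),
]
```

With this shape, the headline rate uses `is_preregistration`, while `tally` shows how many of those journals only ask for trial registration.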
This is great thanks! I'll do a bit more work on the preliminary report
Hi Maia,
I've now completed a full first draft of the preliminary analyses. I've also done some data extraction for study two and present those results too. Some of the coding is a bit grubby but we can tidy up for the main paper. I think this is about ready to send to the team. Do you want to take a look and let me know if there's anything we're missing or can improve? Feel free to add anything.
Responses to some of the specific issues you raised above:
I've removed anything related to gtsummary. It's quick and convenient for generating a summary table, but typically I want to use the same summary data for in-text reporting and plots, so it made sense to create my own dataframe of summary stats. I did update to the dev version as you suggested and saw that the in-text reporting function worked, but wasn't sure how to use the data for plots.
mosaic plot, agree it would be nice to show field but leaving it for now for expediency. Can address if we include in the paper.
Scientific Data, added a note
journals that had guidance not covered by our twenty topics - yes maybe we should add something on this. However, journals that did offer guidance covered by our twenty topics may also have had other guidance. So we are getting into territory that might require a lot of additional coding.
re: analysis prespecification, I agree the ideal would be to recode everything and state which guidance was about clinical trial registration and which was more detailed. But again this needs to be weighed with the fact it will require a lot of re-coding effort.
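On the point about building one table of summary stats and reusing it for both in-text reporting and plots, here is a rough sketch of that pattern. It is written in Python purely for illustration (the project itself is in R Markdown), the records are invented placeholders rather than study data, and all function and field names are hypothetical.

```python
from collections import defaultdict

# Placeholder records: (discipline, has_guidance). Not real study data.
RECORDS = [
    ("Discipline A", True),
    ("Discipline A", False),
    ("Discipline B", True),
    ("Discipline B", True),
]


def summarise(records):
    """Build one summary table: journals (n), journals with guidance (k), percent."""
    totals = defaultdict(lambda: {"n": 0, "k": 0})
    for discipline, has_guidance in records:
        totals[discipline]["n"] += 1
        totals[discipline]["k"] += int(has_guidance)
    return [
        {
            "discipline": discipline,
            "n": counts["n"],
            "k": counts["k"],
            "percent": round(100 * counts["k"] / counts["n"], 1),
        }
        for discipline, counts in sorted(totals.items())
    ]


def in_text(summary, discipline):
    """Format one row of the summary table for in-text reporting."""
    row = next(r for r in summary if r["discipline"] == discipline)
    return f"{row['k']} of {row['n']} journals ({row['percent']}%)"


def plot_points(summary):
    """Return (label, percent) pairs that a dot plot could consume."""
    return [(row["discipline"], row["percent"]) for row in summary]


SUMMARY = summarise(RECORDS)
```

Because the text and the plot both read from `SUMMARY`, a recoding only has to be propagated in one place.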
Hi Tom! Took another look, and I agree it's ready for discussion.
Re: journals that had guidance not covered by our twenty topics: agreed, would just note one or two as narrative discussion.
Re: analysis prespecification: curious to hear what others think. If not recoding, we can also note the limitation for a future effort.
As I'm coalescing external guidance, I'll collect issues here