Open Daniel-Mietchen opened 7 years ago
The main purpose here would be to see
A simple query for "approval number" currently yields 11404 hits: https://www.ncbi.nlm.nih.gov/pmc/?term=%22approval+number%22
Just to clarify that conflict of interest statements are within scope here as well.
I just reran that "approval number" query from Oct 12, 2017, and it now yields 37963 results, i.e. a roughly 3.3-fold increase in about 3.5 years.
In the meantime, I have begun to collaborate with @petermr, and we are trying to use his ContentMine pipeline (which is currently being ported to Python) to extract ethics statements from PMC. On the way, we have built a first — still very rough — dictionary (i.e. a set of words highly indicative of the topic of ethics statements), and we are trying to also get a list of ethics committees mentioned in PMC-indexed papers.
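As a rough illustration of how such a dictionary could be used, the sketch below reports which dictionary terms occur in a statement and what fraction of the dictionary that covers. The terms and the `dictionary_score` helper are made up for this example; they are not taken from the actual dictionary or from the ContentMine pipeline.

```python
import re

# Hypothetical dictionary terms; the real dictionary is not reproduced here.
ETHICS_TERMS = {"ethics", "ethical", "approved", "approval", "committee", "consent", "irb"}


def dictionary_score(text, terms=ETHICS_TERMS):
    """Return the dictionary terms found in `text` and the fraction of terms matched."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    hits = sorted(words & terms)
    return hits, len(hits) / len(terms)


statement = (
    "This study was approved by the institutional ethics committee, "
    "and all participants gave informed consent."
)
hits, score = dictionary_score(statement)
print(hits, round(score, 2))
```

A score like this only flags candidate passages. Deciding whether a passage really is an ethics statement would still need manual checking or a better model.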
Meeting on April 29, 2021:
Some more notes on this by @ShweataNHegde sit at https://github.com/petermr/dictionary/wiki/Ethics-Statement-Project .
A search for "approval number" now gives 38437 results, i.e. about 500 more than just two weeks ago.
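To track these counts without rerunning the search by hand, one option might be the NCBI E-utilities ESearch endpoint. The sketch below assumes that ESearch with `db=pmc` and `retmode=json` returns the hit count under `esearchresult.count`. The helper names are made up, and the count from the API may not match the website search exactly.

```python
import json
import urllib.parse
import urllib.request

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_count_url(term, db="pmc"):
    """Build an ESearch URL that asks only for the hit count (retmax=0)."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": 0}
    return ESEARCH + "?" + urllib.parse.urlencode(params)


def parse_count(payload):
    """Extract the hit count from an ESearch JSON response."""
    return int(json.loads(payload)["esearchresult"]["count"])


def fetch_count(term):
    """Query ESearch over the network and return the number of hits."""
    with urllib.request.urlopen(build_count_url(term)) as response:
        return parse_count(response.read().decode("utf-8"))


# Offline check with a hand-written response in the assumed ESearch JSON layout.
sample = '{"esearchresult": {"count": "38437", "idlist": []}}'
print(parse_count(sample))
# Growth factor between the counts from Oct 2017 and the rerun reported above.
print(round(37963 / 11404, 2))
```

Calling `fetch_count('"approval number"')` at regular intervals and logging the result with a date would give a simple time series of the growth.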
There are ambiguities at multiple levels.
For instance, this article states that
This study was approved by the Johns Hopkins School of Medicine IRB, Approval Number: IRB00151734.
The problem here is that Johns Hopkins School of Medicine runs multiple IRBs, and there does not seem to be a straightforward mechanism to resolve the approval number to get more metadata about the process.
There is an Office for Human Research Protections (OHRP) Database for Registered IORGs & IRBs, Approved FWAs, and Documents Received in Last 60 Days (https://ohrp.cit.nih.gov/search/irbsearch.aspx) that has identifiers for IRBs, but these do not resolve either.
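Even if the identifiers do not resolve, they can at least be pulled out of the text for later comparison. The following is a minimal sketch with an assumed pattern (the label "approval number" or "approval no.", an optional colon, then an alphanumeric identifier). Real statements vary much more than this, and the pattern is not based on any official identifier format.

```python
import re

# Assumed pattern: the label "approval number" (or "approval no."), an optional
# colon, then an identifier made of letters, digits, slashes and hyphens.
APPROVAL_RE = re.compile(
    r"approval\s+(?:number|no\.?)\s*:?\s*([A-Za-z0-9][A-Za-z0-9/\-]*)",
    re.IGNORECASE,
)


def find_approval_numbers(text):
    """Return every identifier that follows an 'approval number' label."""
    return APPROVAL_RE.findall(text)


statement = (
    "This study was approved by the Johns Hopkins School of Medicine IRB, "
    "Approval Number: IRB00151734."
)
print(find_approval_numbers(statement))
```

Note that this also captures ordinary words in sentences such as "the approval number was obtained", so the matches would need filtering, for example by requiring at least one digit.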
I have started to test the phrase extraction tool NLTK-RAKE (https://towardsdatascience.com/extracting-keyphrases-from-text-rake-and-gensim-in-python-eefd0fad582f). As with all language tools, it will take a day or two to see how useful it is.
-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK
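For readers unfamiliar with RAKE, here is a simplified, self-contained sketch of the idea: split the text into candidate phrases at stopwords and punctuation, score each word as degree divided by frequency, and sum the word scores per phrase. It does not use the NLTK-RAKE package itself. The stopword list is a tiny placeholder and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list; real RAKE implementations use a much longer one.
STOPWORDS = {"a", "an", "and", "all", "by", "from", "in", "of", "the", "this", "was", "were", "with"}


def candidate_phrases(text):
    """Split text into runs of non-stopwords, breaking at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()]", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9]+", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Score each candidate phrase as the sum of degree/frequency over its words."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences within the phrase, itself included
    scores = {p: sum(degree[w] / freq[w] for w in p) for p in set(phrases)}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


text = (
    "This study was approved by the institutional review board. "
    "Written informed consent was obtained from all participants."
)
for phrase, score in rake_scores(text)[:2]:
    print(" ".join(phrase), score)
```

Longer phrases score higher by construction, which is why multi-word terms like committee names tend to float to the top.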
https://colab.research.google.com/drive/1sFj07mE2XRyeaplvsTs34-VaDHBjnt6U?usp=sharing
Ayush (openVirus volunteers) and I wrote a piece of code that can extract common phrases from a text file with manually scraped Ethics Statements.
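One simple way to find common phrases in such a file is to count word n-grams across the statements. The sketch below shows that approach. It is not the code from the notebook, and the example statements are made up to stand in for the manually collected file.

```python
import re
from collections import Counter


def common_phrases(statements, n=3, top=5):
    """Count word n-grams across statements and return the most frequent ones."""
    counts = Counter()
    for statement in statements:
        words = re.findall(r"[a-z0-9]+", statement.lower())
        counts.update(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return counts.most_common(top)


# Made-up statements standing in for the manually collected text file (one per line).
statements = [
    "The study was approved by the ethics committee of the university.",
    "This study was approved by the institutional review board.",
    "The protocol was approved by the ethics committee.",
]
print(common_phrases(statements, n=3, top=2))
```

For a real file, the list could be built with `open(path, encoding="utf-8").read().splitlines()`, where `path` points to the text file.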
Some updates from this week:
For more recent updates, see the notes over at Shweata's page.
Here is a list of ethics-related entities Shweata has mined from articles on stem cells.
Some more observations by Shweata and Peter sit here.
We now have a dedicated organization, repo and wiki:
The paper "How does nursing research differ internationally? A bibliometric analysis of six countries" has a Table 1 that looks at certain features of previous studies, including
Extracted specific properties (e.g., contains ethics statements)
The project with Shweata and Peter (and Ayush) has since led to a publication:
Hegde SN, Garg A, Murray-Rust P, Mietchen D (2022) Mining the literature for ethics statements: A step towards standardizing research ethics. Research Ideas and Outcomes 8: e94685. https://doi.org/10.3897/rio.8.e94685 .
It outlines a workflow for mining ethics statements and discusses motivations, applications and complications.
possible search terms: