Daniel-Mietchen / ideas

A dumping ground for halfbaked ideas, some of which will hopefully be worked on soon

Mine PMC for ethics statements #499

Open Daniel-Mietchen opened 7 years ago

Daniel-Mietchen commented 7 years ago

possible search terms:

Daniel-Mietchen commented 7 years ago

The main purpose here would be to see

Daniel-Mietchen commented 7 years ago

A simple query for "approval number" currently yields 11404 hits: https://www.ncbi.nlm.nih.gov/pmc/?term=%22approval+number%22
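
A hit count like this can also be retrieved programmatically. The following is a minimal sketch, assuming the NCBI E-utilities `esearch` endpoint with JSON output; the helper names (`build_esearch_url`, `parse_hit_count`, `fetch_hit_count`) are made up for illustration, and the count returned by the API may differ slightly from what the PMC web search shows.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="pmc"):
    """Build an ESearch URL that asks for a JSON response for `term`."""
    params = {"db": db, "term": term, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_hit_count(payload):
    """Extract the total hit count from an ESearch JSON payload (a string)."""
    # The count is delivered as a string under esearchresult.count.
    return int(json.loads(payload)["esearchresult"]["count"])


def fetch_hit_count(term):
    """Query PMC for `term` and return the number of hits (needs network)."""
    with urlopen(build_esearch_url(term), timeout=30) as response:
        return parse_hit_count(response.read().decode("utf-8"))


# Example (performs a live request, so it is left commented out):
# print(fetch_hit_count('"approval number"'))
```

Logging the result together with the query date would make it easy to track how the count changes over time.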

Daniel-Mietchen commented 5 years ago

Just to clarify that conflict of interest statements are within scope here as well.

Daniel-Mietchen commented 3 years ago

I just reran that "approval number" query from Oct 12, 2017, and it now yields 37963 results, i.e. roughly a 3.3-fold increase in about 3.5 years.

In the meantime, I have begun to collaborate with @petermr, and we are trying to use his ContentMine pipeline (which is currently being ported to Python) to extract ethics statements from PMC. On the way, we have built a first — still very rough — dictionary (i.e. a set of words highly indicative of the topic of ethics statements), and we are trying to also get a list of ethics committees mentioned in PMC-indexed papers.
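
To illustrate how such a dictionary could be used, here is a minimal sketch that flags a paragraph as a likely ethics statement when it contains several dictionary terms. The terms in `ETHICS_TERMS` are hypothetical placeholders and not the project's actual dictionary, and the function names and the threshold are assumptions chosen for illustration; this is not the ContentMine pipeline itself.

```python
import re

# Hypothetical seed terms; the project's real dictionary is not reproduced here.
ETHICS_TERMS = [
    "ethics committee",
    "institutional review board",
    "informed consent",
    "approval number",
    "declaration of helsinki",
]


def count_term_hits(text, terms=ETHICS_TERMS):
    """Return {term: number of whole-word occurrences} for terms found in text."""
    lowered = text.lower()
    hits = {}
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        n = len(re.findall(pattern, lowered))
        if n:
            hits[term] = n
    return hits


def looks_like_ethics_statement(text, min_distinct_terms=2):
    """Heuristic: at least `min_distinct_terms` different dictionary terms occur."""
    return len(count_term_hits(text)) >= min_distinct_terms


paragraph = (
    "The study was approved by the local ethics committee, "
    "and written informed consent was obtained."
)
print(count_term_hits(paragraph))
print(looks_like_ethics_statement(paragraph))
```

Requiring more than one distinct term is a crude way to reduce false positives from papers that merely mention one of the terms in passing.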

Daniel-Mietchen commented 3 years ago

Meeting on April 29, 2021:

Daniel-Mietchen commented 3 years ago

Some more notes on this by @ShweataNHegde sit at https://github.com/petermr/dictionary/wiki/Ethics-Statement-Project .

Daniel-Mietchen commented 3 years ago

A search for "approval number" now gives 38437 results, i.e. about 500 more than just two weeks ago.

Daniel-Mietchen commented 3 years ago

There are ambiguities at multiple levels.

For instance, this article states that

This study was approved by the Johns Hopkins School of Medicine IRB, Approval Number: IRB00151734. 

The problem here is that Johns Hopkins School of Medicine runs multiple IRBs, and there does not seem to be a straightforward mechanism for resolving the approval number to obtain more metadata about the process.
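
Pulling the identifier string out of such a sentence is the easy part. The sketch below uses a regular expression that is an assumption about how approval numbers tend to be written (the phrase "approval number" or "approval no." followed by a token containing at least one digit); it does nothing to resolve which IRB issued the number.

```python
import re

# Assumed pattern: "approval number" / "approval no." followed by an identifier
# that contains at least one digit. Real statements vary far more than this.
APPROVAL_PATTERN = re.compile(
    r"approval\s+(?:number|no\.?)\s*[:#]?\s*"
    r"((?=[A-Za-z0-9/_.-]*\d)[A-Za-z0-9][A-Za-z0-9/_.-]*[A-Za-z0-9])",
    re.IGNORECASE,
)


def extract_approval_numbers(text):
    """Return all identifier strings that follow an 'approval number' phrase."""
    return APPROVAL_PATTERN.findall(text)


sentence = (
    "This study was approved by the Johns Hopkins School of Medicine IRB, "
    "Approval Number: IRB00151734."
)
print(extract_approval_numbers(sentence))
```

The digit requirement keeps phrases such as "no approval number was required" from yielding the word "was" as an identifier.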

Daniel-Mietchen commented 3 years ago

There is an Office for Human Research Protections (OHRP) Database for Registered IORGs & IRBs, Approved FWAs, and Documents Received in Last 60 Days (https://ohrp.cit.nih.gov/search/irbsearch.aspx) that has identifiers for IRBs, but these do not resolve either.

petermr commented 3 years ago

I have started to test the phrase extraction tool NLTK-RAKE (https://towardsdatascience.com/extracting-keyphrases-from-text-rake-and-gensim-in-python-eefd0fad582f). As with all language tools, it will take a day or two to see how useful it is.
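
For readers unfamiliar with RAKE, here is a simplified pure-Python sketch of its scoring idea: text is cut into candidate phrases at stopwords and punctuation, each word is scored as degree divided by frequency, and a phrase scores the sum of its word scores. This is not the NLTK-RAKE tool's API; the tiny stopword list and the function names are assumptions for illustration.

```python
import re
from collections import defaultdict

# Small illustrative stopword list (an assumption); real runs need a fuller one.
STOPWORDS = {
    "a", "an", "and", "the", "of", "by", "was", "were", "is", "are", "in",
    "to", "for", "this", "that", "from", "with", "all", "on", "at", "as",
}


def candidate_phrases(text):
    """Split text into runs of non-stopwords, breaking at punctuation."""
    phrases = []
    for fragment in re.split(r'[.,;:!?()\[\]"]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9][a-z0-9'-]*", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake(text):
    """Return (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in set(phrases)}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


text = (
    "This study was approved by the institutional review board. "
    "Informed consent was obtained from all participants."
)
for phrase, score in rake(text)[:3]:
    print(f"{score:.1f}  {phrase}")
```

Because longer phrases accumulate higher degree, multi-word terms such as committee names tend to rank above single words, which is what makes the approach interesting for ethics statements.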


ShweataNHegde commented 3 years ago

https://colab.research.google.com/drive/1sFj07mE2XRyeaplvsTs34-VaDHBjnt6U?usp=sharing

Ayush (openVirus volunteers) and I wrote a piece of code that can extract common phrases from a text file with manually scraped Ethics Statements.
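
The linked notebook is not reproduced here. As a rough idea of what extracting common phrases can look like, the sketch below counts word n-grams across a list of statements and keeps those that occur in at least two of them. The function names, the trigram default and the threshold are assumptions for illustration, and the three example statements are made up.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def common_phrases(statements, n=3, min_count=2):
    """Return (phrase, statement_count) pairs for n-grams shared by statements."""
    counts = Counter()
    for statement in statements:
        tokens = re.findall(r"[a-z0-9]+", statement.lower())
        # Use a set so each phrase is counted at most once per statement.
        counts.update(set(ngrams(tokens, n)))
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


# Made-up example statements; a real run would read them from the text file,
# e.g. one statement per line.
statements = [
    "The study was approved by the ethics committee of the university.",
    "This protocol was approved by the institutional review board.",
    "All procedures were approved by the ethics committee.",
]
for phrase, count in common_phrases(statements):
    print(count, phrase)
```

Counting per statement rather than per occurrence keeps one repetitive statement from dominating the list.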

Daniel-Mietchen commented 3 years ago

Some updates from this week:

Daniel-Mietchen commented 3 years ago

For more recent updates, see the notes over at Shweata's page.

Daniel-Mietchen commented 3 years ago

Here is a list of ethics-related entities Shweata has mined from articles on stem cells.

Daniel-Mietchen commented 3 years ago

Some more observations by Shweata and Peter sit here.

We now have a dedicated organization, repo and wiki:

Daniel-Mietchen commented 3 years ago

The paper "How does nursing research differ internationally? A bibliometric analysis of six countries" has a Table 1 that looks at certain features of previous studies, including

Extracted specific properties (e.g., contains ethics statements)

Daniel-Mietchen commented 1 year ago

The project with Shweata and Peter (and Ayush) has since led to a publication:

Hegde SN, Garg A, Murray-Rust P, Mietchen D (2022) Mining the literature for ethics statements: A step towards standardizing research ethics. Research Ideas and Outcomes 8: e94685. https://doi.org/10.3897/rio.8.e94685 .

It outlines a workflow for mining ethics statements and discusses motivations, applications and complications.