geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.

predatory journals and journal blacklist #2029

Closed ValWood closed 6 years ago

ValWood commented 6 years ago

It was pointed out to me at the WB SAB that some predatory journals are indexed BY PubMed.

I propose that we make a list of any predatory journals, or journals with a questionable or absent review process, and blacklist them all for curation....

I came across this gem this week in my PubMed triage

Identification of Schizosaccharomyces pombe in the guts of healthy individuals and patients with colorectal cancer: preliminary evidence from a gut microbiome secretome study. Chin SF, Megat Mohd Azlan PIH, Mazlan L, Neoh HM. Gut Pathog. 2018 Jul 10;10:29. doi: 10.1186/s13099-018-0258-5. eCollection 2018.

PMID: 30008808

It contained such gems as:

(they are searching mass spec fragments against UniProt ffs, god knows what the hits are actually to, they don't provide any data)

Seriously how does this shit end up in PubMed?

ValWood commented 6 years ago

Oh, this article prompted my outburst:

https://www.the-scientist.com/news-opinion/german-scientists-frequently-publish-in-predatory-journals-64518

ValWood commented 6 years ago

@vanaukenk how did you say you deal with these?

vanaukenk commented 6 years ago

@ValWood We don't have a mechanism in place yet at WB to deal with potentially predatory journals and/or publishers. My recollection from the WB SAB was that people were generally relying on PubMed to be the gatekeeper, although this apparently may not be enough.
I think we could come up with criteria for papers for GO curation, i.e. must be from a peer-reviewed journal, that would help us steer clear of potentially low quality papers. I think there is also a more general issue of interest here wrt triage and selecting high priority papers for curation. Under what circumstances would a curator choose to annotate a paper from a predatory journal vs a more reputable one?

ValWood commented 6 years ago

Yes, I think this is something we should discuss on the next QC call. I sometimes see "low quality" or "scientifically out of date and very old" papers curated quite recently for GO.

Older papers are often easy and quick to curate, so if your personal/group/institute metric is numbers of papers then they are very appealing (sometimes a couple of experiments). I reckon most papers from this year take me ~3-4 hours (and up to 8) to go through, but an older paper can take me around 15 mins.

I don't think a curator would necessarily know that a journal was predatory, or not peer reviewed. Until the discussion at WormBase, PubMed was my benchmark. Maybe we should compile a list of everyone who approaches us to publish, if these papers are being indexed, but maybe people have better ideas to keep a check on this.........

pgaudet commented 6 years ago

We could use the same mechanism as for papers with findings later found to be incorrect (not yet in place): just make a list, and filter out papers from those journals?

ValWood commented 6 years ago

So the curation tools would access these lists and ensure they were "unavailable"?

It would be bad if curators curated from blacklisted papers (either out-of-date science, or a blacklisted journal), and then the curation was discarded.
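A minimal sketch of what such a check in a curation tool could look like, in Python. Everything here is an assumption for illustration: the `Paper` type, the `normalize` and `split_by_blacklist` helpers, and the idea that the blacklist is a plain list of journal names are hypothetical and not part of any existing GO tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical minimal record for a paper in a triage queue."""
    pmid: str
    journal: str


def normalize(journal: str) -> str:
    """Lowercase, drop periods and collapse whitespace, so that
    'Gut Pathog.' and 'gut pathog' compare equal."""
    return " ".join(journal.lower().replace(".", " ").split())


def split_by_blacklist(papers, blacklisted_journals):
    """Return (curatable, blocked) so that blocked papers can be hidden
    from curators before any curation time is spent on them."""
    blocked_names = {normalize(name) for name in blacklisted_journals}
    curatable, blocked = [], []
    for paper in papers:
        if normalize(paper.journal) in blocked_names:
            blocked.append(paper)
        else:
            curatable.append(paper)
    return curatable, blocked
```

Applying the filter at triage time, rather than when annotations are loaded, is what would avoid the situation where curation is done and then discarded.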

I don't have a feeling for how big the problem is. We tend to find that we know about the older papers that new knowledge supersedes, and we just remove the associated data. It's quite rare because we are judicious about what we use for GO annotation; often it's just a misinterpretation of the underlying experiments. I'm not sure that this would be the case for all species? I have definitely seen a few where there has been a big shift, particularly doing QC.

RLovering commented 6 years ago

Hi Val, this is a very big issue you are addressing. However, I think you should appreciate that for some very well researched genes the original very old paper made the first statements about the role/activity of a protein, which may have been duplicated by another group (or not), but then no further work has been done to confirm this role. For example, with receptors, once the receptor/ligand relationship was established, current experiments often don't actually show that the ligand and receptor bind; they just add both to the cell and look at what happens next. Ruth ;)

ValWood commented 6 years ago

I'm not saying that we should not curate older papers. We curate all of our older papers. Just that we should always view them in the context of more recent papers, especially for well-studied proteins, which often shed light on the older results and clean up ambiguities. Older genetics papers often hint at a role which is subsequently never mentioned again (by the same authors) in more recent studies. We just need some mechanism(s) in place to identify and flag legacy ideas, stop annotations that were once added and removed from reappearing, and avoid wasting time curating non-peer reviewed work. It would save a lot of time in the long run...

I'm not really addressing it....I wouldn't know where to begin. But it's something we could probably begin to deal with better as a group.

ValWood commented 6 years ago

There should be public lists we can use:

https://www.the-scientist.com/news-opinion/indian-government-aims-to-take-down-predatory-journals-64731?utm_campaign=TS_DAILY%20NEWSLETTER_2018&utm_source=hs_email&utm_medium=email&utm_content=65569194&_hsenc=p2ANqtz-8vKr7yMcdVq-SddM-VUDuhEkiuw_GUGkhM8JomWp1adoKTdafscdN7dP2Y-PP2zwhFVC3e0zD9SSshnGzZ6T9hyOHOCQ&_hsmi=65569194

pgaudet commented 6 years ago

Added this to the next discussion of the QC group.

@ValWood Do you have any idea how many annotations this may impact ?

Thanks, Pascale

ValWood commented 6 years ago

Maybe this is not a big issue. It was brought to my attention at the WB SAB. Apparently some predatory/ non-peer reviewed journals get indexed by PubMed. I did not know this. It would be good to have some way to alert contributing resources not to waste time curating from these journals. But maybe this can close and we can bear it in mind .... I'm sure that curators generally will use some common sense in deciding whether a publication is valuable to curate, and curate high priority papers first?

Perhaps just note this as something the QC group might want to look into further at some point. Maybe a much bigger problem is the outdated and unconfirmed information curated from older papers. Like the trm2 example I provided yesterday.

pfey03 commented 6 years ago

I read and heard about this in german media a couple of weeks ago and got this link https://predatoryjournals.com/journals/

It was reported that those are in general open access, which is a lot nowadays of course. They also charge scientists huge page charges. I'm sure like me, you get invitations every week to publish somewhere or be an editor somewhere.

It seems a more recent thing with the advance of open access journals. I don't think older papers are much of a problem, except that there might be better ways to test things nowadays.

ValWood commented 6 years ago

I was beginning to think this might not be a problem. However, I was alerted to this yesterday. An HGNC curator (Susan) read this https://www.ncbi.nlm.nih.gov/pubmed/30108417 because it came through on their human gene alerts. Now this paper contains NO information that isn't in the UniProt entry. Tables of amino acids. Domains. Blast searches. Alignments. PPI networks. It's like a 'workshop exercise'.

Yesterday I also came across these. They popped (pooped) up as related papers, when I was looking at a paper about https://www.ncbi.nlm.nih.gov/pubmed/21038498 and https://www.ncbi.nlm.nih.gov/pubmed/20088090 So I'm not convinced that PubMed is very fussy at all about what it includes.....

pfey03 commented 6 years ago

Yeah, PubMed indexes all. They may need to refine their algorithm to check against certain lists. The first one, Bioinformation, is listed on the predatory journal list: https://predatoryjournals.com/journals/

The other 2 look absolutely fake to begin with. Maybe alert PubMed?

ValWood commented 6 years ago

We should tweet this link!

I will mail PubMed about the other two. Clearly a mistake ....;)

ValWood commented 6 years ago

I mailed PubMed. Also asked about their policy with regard to predatory journals.

pfey03 commented 6 years ago

Great!

I will tweet the link to the Dicty community from the dictyBase channel. Though possibly not the main target group, it's diverse and changing (e.g. physicists and other non biologists order in the Dicty Stock Center).

ValWood commented 6 years ago

e.g. physicists and other non biologists order in the Dicty Stock Center

interesting!

pfey03 commented 6 years ago

I should have said physicists and non classical cell/mol biologists;-)

ValWood commented 6 years ago

Well- Re the letters:

"The two 'strange items' are letters which are generally indexed. Their titles and content might be a little unorthodox but they are letter to benefit patient education ."

Which I find really strange. I wonder if they are confusing "letter type articles" and the publisher tagged these as "letter". I don't really know how to proceed with this one!

And I also got a boilerplate response about predatory journals, which was a bit meaningless:

Regarding NLM policy for predatory journals, please see the following:

Journals eligible for MEDLINE and PMC must first be suitable for the NLM collection, based on the criteria in the NLM Collection Development Manual. Journals that are selected for the NLM collection must have sufficient subject matter within the scope of biomedicine and health-related life sciences and should demonstrate quality of editorial work, including features that contribute to the objectivity, credibility, and quality of its content. NLM looks for conformance with guidelines and best practices published by professional organizations, including Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals from ICMJE, Code of Conduct and Best Practices Guidelines from COPE and Principles of Transparency and Best Practice in Scholarly Publishing (joint statement by COPE, DOAJ, WAME, and OASPA). The NLM Catalog includes publications that are held in the Library’s collections, that contain NIH-funded papers, or that support other NLM programs, products and services such as DOCLINE, PMC, MEDLINE, GenBank, and others.

ValWood commented 6 years ago

I get it....... I guess that the publishers love these links to "patient questions". To read the answer to the question (which isn't accessible), you would need to purchase the journal..... Ka-ching!

ValWood commented 6 years ago

To be sure, this is intentional !

Me: I still think this must be a mistake. The "journal" is called Mayo Clin Health Lett. and has "letter type" articles.

These are user questions (like a "letters page") and MUST have been indexed inadvertently? It is difficult to tell, this "journal" is behind a paywall.

Dear Valerie Wood,

I assure you it is not a mistake. Mayo Clin Health Lett was fully indexed from Jan 1998 through June 2017. This can be seen at locatorplus.gov: https://locatorplus.gov/cgi-bin/Pwebrecon.cgi?Search_Arg=mayo+clin+health+lett&Search_Code=2100&PID=H62dPwgIBpAT9_H3ll8nzeTV&SEQ=20180919054921&CNT=25&HIST=1

ValWood commented 6 years ago

How can we work if we can't rely on some sanity and review criteria by PubMed? It seems that they will index anything

addiehl commented 6 years ago

I think it's entirely within the purview of Pubmed to index the Mayo Clin Health Lett, which is a patient focused health information news letter published by the Mayo Clinic on a subscription basis to the general public. The Mayo Clinic is a well-known hospital system based in Minnesota with an extensive research arm. Mayo attracts patients from across the US and internationally due to its high reputation, and undoubtedly produces the Mayo Clin Health Lett as a service to the community. Looking at the articles in the Mayo Clin Health Lett (available via my university), it is clear that they are intended not as definitive statements or reviews of scholarly research, but as medical advice distilled down to a level that is accessible by the general public. Pubmed was not created simply to serve the academic research community, but as a resource for the public in general to find information about biomedical research and about medical care in general. Indexing of Mayo Clin Health Lett clearly provides a way to expose the public to articles that may be of interest to them.

I appreciate your concerns about predatory journals, but I would also hope that most GO curators can spot dubious articles and journals based on their education and experience. Certainly in the mouse and human domains, there is so much information published in reputable journals that curators never have time to get to, that there should be little risk of curation of demonstrably false information coming from predatory journals. It is unfortunate if some GO curation efforts are judged on quantity of annotations rather than quality, but hopefully even in these situations curators will have the scientific and moral judgment to avoid dubious articles.

Most predatory journals are easy to spot if you simply look at their publication history. Few articles per year going back only a few years at best. And despite your concerns, I find that they are usually not indexed by Pubmed. I have checked a number of times the names of journals that solicit me for articles or editorial board membership, and it is pretty obvious when the journal is not above board. The biggest risk of predatory journals seems to me to be that some academics may use them to create publication records that may confuse hiring, tenure, or promotion committees or may be used by companies to promote their products or services with "peer" reviewed research.
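As a rough illustration of the publication-history check described above, here is a minimal Python sketch. The function name `looks_thin`, its input format (a mapping of year to article count) and both thresholds are assumptions chosen for the example, not established criteria.

```python
def looks_thin(articles_per_year, min_years=5, min_articles_per_year=20):
    """Flag a journal whose publication record is short or sparse.

    articles_per_year maps a year (int) to the number of articles published.
    Both thresholds are arbitrary assumptions for illustration only.
    """
    if not articles_per_year:
        # No record at all is treated as suspicious.
        return True
    years_active = max(articles_per_year) - min(articles_per_year) + 1
    mean_per_year = sum(articles_per_year.values()) / years_active
    return years_active < min_years or mean_per_year < min_articles_per_year


# Hypothetical journal with a handful of articles over three years.
print(looks_thin({2016: 4, 2017: 6, 2018: 5}))
```

A flag from a check like this would only be a prompt for a closer look: a new but legitimate journal would be flagged as well.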

deustp01 commented 6 years ago

In the end, this is a question of annotation strategy: do you want your dataset to be comprehensive, including all assertions about your domain that are said to be based on experimental data, or do you want to impose some quality criteria of your own beyond the editorial standards of a journal like Cell to ensure that your annotations are not only high-profile, newsworthy, and of interest to your users but also accurate? Never mind the little parasitic weed journals. What about the very prominent biomedical journals that happily published all those tainted reports from Baselga and colleagues (https://www.nytimes.com/2018/09/.../jose-baselga-cancer-memorial-sloan-kettering.html)?

addiehl commented 6 years ago

Unfortunately, it is probably beyond the ability of most GO curators, or indeed most scientists outside a specific subdomain of biology or medicine, to spot fraudulent or ethically compromised publications that are published in reputable peer reviewed journals prior to their retraction. It would be good, however, to remove annotations that have been made from articles that are subsequently retracted, and I know that this has been done on occasion.

ValWood commented 6 years ago

Hi @addiehl, I was not objecting to the "letters" themselves; these are "articles". But these "reader questions" hardly seem to constitute reviewed medical articles? https://www.ncbi.nlm.nih.gov/pubmed/21038498

ValWood commented 6 years ago

@deustp01 I don't think we should need to impose editorial standards. Bad papers will always be published, and we need ways to flag these too.

What I would expect, is that PubMed would only index articles from journals with a robust review procedure. This used to be the case, I remember extensive vetting of journals before they were indexed. ....but it clearly isn't any longer.

@addiehl many curators can spot dubious articles, but we need to read them to do so and this is a waste of time. Once again the burden lands on the curator.

pgaudet commented 6 years ago

@ValWood I put some of this information here:

http://wiki.geneontology.org/index.php/Tips_to_Produce_High_Quality_Annotations#Avoiding_predatory_journals

Can we close the ticket for now ? (Feel free to suggest edits to the wiki page).

Thanks, Pascale

ValWood commented 6 years ago

yep