acl-org / acl-anthology

Data and software for building the ACL Anthology.
https://aclanthology.org
Apache License 2.0

Retraction notice has no canonical page #760

Open aryamccarthy opened 4 years ago

aryamccarthy commented 4 years ago

I stumbled upon this PDF today: https://www.aclweb.org/anthology/W16-3709.pdf. It's a retraction notice for a paper. The link https://www.aclweb.org/anthology/W16-3709, though, is dead. This might be an accident of the changed canonical URLs.

Should we make a new canonical page without BibTeX, etc.? Should we redirect the canonical page to the PDF retraction notice? The way it is, nothing points to this retraction notice.

mbollmann commented 4 years ago

We should probably make this information explicit in the XML and display it accordingly on the website.

On the one hand, we don't want to put retracted papers in the spotlight (e.g., they won't be very useful if they come up as search results), but on the other hand I think it's important to be transparent. What are your thoughts @mjpost? Or is there any policy for this already?

Personally, I wasn't aware that there were any retracted papers in the Anthology. There are probably others in the list of unlinked PDFs I generated. This one for example: https://www.aclweb.org/anthology/W14-5510.pdf

mjpost commented 4 years ago

This may be coming up with ACL 2020. I suggest we follow the ACM policy on retractions, which lists four levels:

Focusing just on retractions, I suggest the following:

akoehn commented 4 years ago

That sounds like a sensible approach.

Withdrawal should not affect us at all, corrections are already handled, and the process for removal is also clear.

mbollmann commented 4 years ago

* we remove the paper from page listings (author, venue, volume)

That would still be pretty opaque, IMO. It means that there's no way to see which or how many papers in the Anthology have been retracted at some point, unless you happen to have a direct link to those papers. I'd suggest listing it in greyed-out font with a clear "retracted" pill before the title.

mjpost commented 4 years ago

The list above was an attempt to balance two competing issues:

  1. We need to ensure that retracted papers are visible to those who might otherwise cite them.
  2. We don't want to (further) disincentivize paper retraction.

I think there's a point to be made that adding a paper to a list, and displaying it in such a manner that it jumps out at anyone browsing the list, draws attention to the authors of the work when it should be on the paper. This could then work as a disincentive towards retraction.

(There's a counterargument that negative impressions should not attend here, since the authors have actually done something admirable that can and does happen to many researchers, but unfortunately I think this is not how things work).

I think this argues for removing the paper from all listings. I would also suggest that we mark the title in the paper's BibTeX, e.g., "[RETRACTED] A proof that P != NP by reduction to sentiment analysis".
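For illustration only, the title marking could be sketched roughly as below. This is not how the Anthology build scripts actually do it; `mark_retracted`, `bibtex_entry`, and the key `example-key` are hypothetical names, and the entry is trimmed to just the title field.

```python
RETRACTED_PREFIX = "[RETRACTED] "  # assumed marker, taken from the example title above


def mark_retracted(title: str) -> str:
    """Prefix a title with the retraction marker, without adding it twice."""
    if title.startswith(RETRACTED_PREFIX):
        return title
    return RETRACTED_PREFIX + title


def bibtex_entry(key: str, title: str, retracted: bool = False) -> str:
    """Render a minimal, hypothetical BibTeX entry containing only the title."""
    shown = mark_retracted(title) if retracted else title
    return f'@inproceedings{{{key},\n    title = "{shown}",\n}}'


if __name__ == "__main__":
    print(bibtex_entry(
        "example-key",  # hypothetical citation key
        "A proof that P != NP by reduction to sentiment analysis",
        retracted=True,
    ))
```

Keeping the prefix check inside `mark_retracted` means a regenerated BibTeX file would not accumulate repeated markers.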

A compromise might be to:

mbollmann commented 4 years ago

I see the issue and haven't given it enough thought to comment on it, but to clarify, my main concern is that if we remove all links to retracted papers, we make information about where and how often retractions happen more difficult to find. (The average user wouldn't know that we store this in XML files in our repo.) This could give the impression that we intentionally try to obscure when retractions happen, which is not a good look IMO.

mjpost commented 4 years ago

These are good points. Consistent with both (1) and (2) above, I think we should also:

mjpost commented 4 years ago

I now have a retracted paper in hand. For the short term, in light of the discussion above, I plan to do this (comments welcome):

Is this suitable technically, and as a compromise between transparency and punitiveness?

CC: @acl-org/anthology