greenelab / covid19-review

A collaborative review of the emerging COVID-19 literature. Join the chat here:
https://gitter.im/covid19-review/community

ClinicalTrials.gov website updates #1210

Open agitter opened 1 year ago

agitter commented 1 year ago

ClinicalTrials.gov updated their website (announcement). I'm not sure whether the Zotero citation infrastructure that Manubot relies on will be stable. The classic website will be retired.

Here's an example of the new and classic sites:

I believe we need to update https://github.com/biopragmatics/bioregistry/ and then the Manubot package to update these URLs. Is that correct @dhimmel?

rando2 commented 1 year ago

Oh no! I'm definitely interested in keeping an eye on this.

dhimmel commented 1 year ago

Also noting the URL format used for resolution: https://clinicaltrials.gov/ct2/show/NCT04619628. This URL currently redirects to the classic view, although I imagine it will eventually switch to redirecting to the new view if the classic site is retired.

I think updating Bioregistry is a good idea. I'll make a PR for that.

I'm not sure whether the Zotero citation infrastructure that Manubot relies on will be stable

This is the bigger worry IMO, so good to keep in mind when we upgrade the bioregistry version in Manubot, and perhaps to test so we can report any bugs to Zotero beforehand.

cthoyt commented 1 year ago

One solution for upgrading Manubot is to simply generate URLs of the form https://bioregistry.io/<prefix>:<identifier>. Then you never have to update this in Manubot, since it can be handled upstream.
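
A minimal sketch of that idea, assuming the Bioregistry resolver accepts URLs of the form https://bioregistry.io/<prefix>:<identifier> and that `clinicaltrials` is the relevant prefix. The function name `bioregistry_url` is hypothetical and is not part of Manubot:

```python
# Assumed resolver base URL; the redirect to the provider happens upstream.
BIOREGISTRY_BASE = "https://bioregistry.io"


def bioregistry_url(prefix: str, identifier: str) -> str:
    """Build a resolver URL and leave the provider-specific redirect to Bioregistry."""
    prefix = prefix.strip().lower()
    identifier = identifier.strip()
    if not prefix or not identifier:
        raise ValueError("both prefix and identifier are required")
    return f"{BIOREGISTRY_BASE}/{prefix}:{identifier}"


# Hypothetical usage with the trial ID discussed in this thread.
print(bioregistry_url("clinicaltrials", "NCT04619628"))
```

With this approach, a change in the ClinicalTrials.gov URL scheme would only need to be fixed in the registry, not in every downstream tool.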

agitter commented 1 year ago

perhaps to test so we can report any bugs to Zotero beforehand.

Directly testing the classic and new website formats indicates there will be problems. I'm assuming the Zotero translator is used for both.

$ manubot cite https://classic.clinicaltrials.gov/ct2/show/NCT04619628
[
  {
    "id": "12D6rB04F",
    "type": "webpage",
    "abstract": "Safety and Immunogenicity of COVI-VAC, a Live Attenuated Vaccine Against COVID-19 - Full Text View.",
    "language": "en",
    "title": "Safety and Immunogenicity of COVI-VAC, a Live Attenuated Vaccine Against COVID-19 - Full Text View - ClinicalTrials.gov",
    "URL": "https://clinicaltrials.gov/ct2/show/NCT04619628",
    "accessed": {
      "date-parts": [
        [
          "2023",
          6,
          29
        ]
      ]
    },
    "note": "This CSL Item was generated by Manubot v0.5.5 from its persistent identifier (standard_id).\nstandard_id: url:https://classic.clinicaltrials.gov/ct2/show/NCT04619628"
  }
]
$ manubot cite https://www.clinicaltrials.gov/study/NCT04619628
[
  {
    "id": "aHoxDwRa",
    "type": "webpage",
    "title": "CTG Labs - NCBI",
    "URL": "https://www.clinicaltrials.gov/study/NCT04619628",
    "accessed": {
      "date-parts": [
        [
          "2023",
          6,
          29
        ]
      ]
    },
    "note": "This CSL Item was generated by Manubot v0.5.5 from its persistent identifier (standard_id).\nstandard_id: url:https://www.clinicaltrials.gov/study/NCT04619628"
  }
]

This trial ID isn't even the best example because it doesn't set all the metadata that https://github.com/zotero/translators/pull/2153 added support for, like the creators.
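
Both URL styles tested above embed the same NCT identifier, so one can be rewritten into the other. This is a rough sketch, not anything Manubot does today; the helper names are hypothetical, and it assumes the new site keeps the /study/<NCT ID> path seen in the second command:

```python
import re

# NCT identifiers are "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"NCT\d{8}")


def extract_nct_id(url: str) -> str:
    """Pull the NCT identifier out of a classic or new-style study URL."""
    match = NCT_PATTERN.search(url)
    if match is None:
        raise ValueError(f"no NCT identifier found in {url!r}")
    return match.group(0)


def new_style_url(url: str) -> str:
    """Rewrite a ClinicalTrials.gov study URL to the new /study/ form (assumed stable)."""
    return f"https://www.clinicaltrials.gov/study/{extract_nct_id(url)}"


print(new_style_url("https://classic.clinicaltrials.gov/ct2/show/NCT04619628"))
```

Normalizing on the identifier rather than the full URL means a citation key would not change when the site layout does.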

agitter commented 1 year ago

I opened a similar issue for the Zotero translator: https://github.com/zotero/translators/issues/3069. I don't know enough JavaScript to make the updates myself.

agitter commented 1 year ago

Good news on this: Zotero contributors responded to my issue and updated the clinicaltrials.gov.js translator. I believe we would need to update the Zotero translation-server (https://github.com/manubot/manubot/issues/82) before we can test those changes in Manubot.

dhimmel commented 1 year ago

Nice! I just updated the Manubot translation-server to zotero/translators@28f344cd, which includes https://github.com/zotero/translators/commit/edde70110f8bff8a69480f9cbc1e544851f25b74. But I'm still getting the same result as above for manubot cite https://www.clinicaltrials.gov/study/NCT04619628, with the title as CTG Labs - NCBI. I would expect this to now be the actual title, so I'm kind of confused.

agitter commented 1 year ago

It seems like your translation-server update worked. I tested another URL, taken from the test case of a recent commit (https://github.com/zotero/translators/commit/aa7d6a2b685e10db744fa5641f2d662a027cf880):

$ curl -d 'https://www.nrc.nl/nieuws/2022/12/03/wikipedia-wordt-onbetrouwbaar-alweer-a4150299' -H 'Content-Type: text/plain' https://translate.manubot.org/web
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   589  100   507  100    82    479     77  0:00:01  0:00:01 --:--:--   557
[{"key":"ZTT5BNZK","version":0,"itemType":"newspaperArticle","creators":[{"firstName":"Maxim","lastName":"Februari","creatorType":"author"}],"tags":[],"title":"Column | Wikipedia wordt onbetrouwbaar. Alweer","publicationTitle":"NRC","rights":"Copyright Mediahuis NRC BV","url":"https://www.nrc.nl/nieuws/2022/12/03/wikipedia-wordt-onbetrouwbaar-alweer-a4150299","abstractNote":"Column:Maxim Februari","date":"2022-12-03","language":"nl-NL","libraryCatalog":"www.nrc.nl","accessDate":"2023-07-13T15:42:12Z"}]

At first glance, the output matches the expected output in that test case.

The output does not match the classic clinical trials URL test case:

$ curl -d 'https://classic.clinicaltrials.gov/ct2/show/NCT04292899' -H 'Content-Type: text/plain' https://translate.manubot.org/web
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   599  100   544  100    55    408     41  0:00:01  0:00:01 --:--:--   449
[{"key":"5KR82YAW","version":0,"itemType":"webpage","creators":[],"tags":[],"title":"Study to Evaluate the Safety and Antiviral Activity of Remdesivir (GS-5734™) in Participants With Severe Coronavirus Disease (COVID-19) - Full Text View - ClinicalTrials.gov","url":"https://clinicaltrials.gov/ct2/show/NCT04292899","abstractNote":"Study to Evaluate the Safety and Antiviral Activity of Remdesivir (GS-5734™) in Participants With Severe Coronavirus Disease (COVID-19) - Full Text View.","language":"en","accessDate":"2023-07-13T15:42:56Z"}]

For example, the creators are empty.
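
A check like this could be scripted rather than eyeballed. The sketch below is only a heuristic under my own assumptions: the function name `missing_metadata` is hypothetical, and treating a generic "webpage" item type as a sign that the site-specific translator did not run is a guess, not documented translation-server behavior. The sample items are trimmed to a few fields from the two responses above:

```python
import json


def missing_metadata(item: dict) -> list[str]:
    """Return names of fields that look unpopulated in a translation-server item."""
    problems = []
    if not item.get("creators"):
        problems.append("creators")
    if item.get("itemType") == "webpage":
        # Heuristic (assumption): a generic type hints that only the fallback ran.
        problems.append("itemType")
    if not item.get("date"):
        problems.append("date")
    return problems


# Trimmed copies of the two responses shown above.
nrc_response = (
    '[{"itemType":"newspaperArticle","creators":[{"firstName":"Maxim",'
    '"lastName":"Februari","creatorType":"author"}],"date":"2022-12-03"}]'
)
classic_response = (
    '[{"itemType":"webpage","creators":[],'
    '"url":"https://clinicaltrials.gov/ct2/show/NCT04292899","language":"en"}]'
)

for label, raw in [("nrc", nrc_response), ("classic", classic_response)]:
    for item in json.loads(raw):
        print(label, missing_metadata(item))
```

Running the same function over the response for a new-style URL would show whether the updated translator is being picked up.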