Open agitter opened 1 year ago
Oh no! I'm definitely interested in keeping an eye on this.
Also noting the URL format used for resolution: https://clinicaltrials.gov/ct2/show/NCT04619628. This URL currently redirects to the classic view, although I imagine it will eventually switch to redirecting to the new view if classic is retired.
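To make the URL formats in play concrete, here is a minimal sketch that pulls the NCT identifier out of a string and builds the three URL shapes mentioned in this thread. The function name `clinicaltrials_urls` and the dictionary keys are hypothetical, chosen only for illustration; this is not how Bioregistry or Manubot resolve identifiers.

```python
import re

# NCT identifiers are the letters "NCT" followed by eight digits, e.g. NCT04619628.
NCT_PATTERN = re.compile(r"NCT\d{8}")


def clinicaltrials_urls(text):
    """Return the URL variants discussed in this thread for the NCT ID found in text.

    Hypothetical helper for illustration only.
    """
    match = NCT_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no NCT identifier found in {text!r}")
    nct_id = match.group(0)
    return {
        # Format currently used for resolution (redirects to the classic view).
        "resolver": f"https://clinicaltrials.gov/ct2/show/{nct_id}",
        # Classic website, which is slated for retirement.
        "classic": f"https://classic.clinicaltrials.gov/ct2/show/{nct_id}",
        # New website.
        "new": f"https://www.clinicaltrials.gov/study/{nct_id}",
    }
```

Calling `clinicaltrials_urls("https://clinicaltrials.gov/ct2/show/NCT04619628")` gives all three variants for the same trial, which is handy for comparing citation output across them.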
I think updating Bioregistry is a good idea. I'll make a PR for that.
I'm not sure whether the Zotero citation infrastructure that Manubot relies on will be stable
This is the bigger worry IMO, so good to keep in mind when we upgrade the bioregistry version in Manubot, and perhaps to test so we can report any bugs to Zotero beforehand.
One solution for upgrading manubot is to simply generate URLs that are https://bioregistry.io/
perhaps to test so we can report any bugs to Zotero beforehand.
Directly testing the classic and new website formats indicates there will be problems. I'm assuming the Zotero translator is used for both.
$ manubot cite https://classic.clinicaltrials.gov/ct2/show/NCT04619628
[
{
"id": "12D6rB04F",
"type": "webpage",
"abstract": "Safety and Immunogenicity of COVI-VAC, a Live Attenuated Vaccine Against COVID-19 - Full Text View.",
"language": "en",
"title": "Safety and Immunogenicity of COVI-VAC, a Live Attenuated Vaccine Against COVID-19 - Full Text View - ClinicalTrials.gov",
"URL": "https://clinicaltrials.gov/ct2/show/NCT04619628",
"accessed": {
"date-parts": [
[
"2023",
6,
29
]
]
},
"note": "This CSL Item was generated by Manubot v0.5.5 from its persistent identifier (standard_id).\nstandard_id: url:https://classic.clinicaltrials.gov/ct2/show/NCT04619628"
}
]
$ manubot cite https://www.clinicaltrials.gov/study/NCT04619628
[
{
"id": "aHoxDwRa",
"type": "webpage",
"title": "CTG Labs - NCBI",
"URL": "https://www.clinicaltrials.gov/study/NCT04619628",
"accessed": {
"date-parts": [
[
"2023",
6,
29
]
]
},
"note": "This CSL Item was generated by Manubot v0.5.5 from its persistent identifier (standard_id).\nstandard_id: url:https://www.clinicaltrials.gov/study/NCT04619628"
}
]
This trial ID isn't even the best example because it doesn't set all the metadata that https://github.com/zotero/translators/pull/2153 added support for, like the creators.
I opened a similar issue for the Zotero translator. I don't know enough JavaScript to make the updates myself. https://github.com/zotero/translators/issues/3069
Good news on this: Zotero contributors responded to my issue and updated the clinicaltrials.gov.js translator. I believe we would need to update the Zotero translation-server (https://github.com/manubot/manubot/issues/82) before we can test those changes in Manubot.
Nice! I just updated the Manubot translation-server to zotero/translators@28f344cd, which includes https://github.com/zotero/translators/commit/edde70110f8bff8a69480f9cbc1e544851f25b74. But I'm still getting the same result as above for manubot cite https://www.clinicaltrials.gov/study/NCT04619628, with the title as CTG Labs - NCBI. I would expect this to be the actual title now, so I'm kind of confused.
It seems like your translation-server update worked. I tested another URL from the test case of a recent commit https://github.com/zotero/translators/commit/aa7d6a2b685e10db744fa5641f2d662a027cf880
$ curl -d 'https://www.nrc.nl/nieuws/2022/12/03/wikipedia-wordt-onbetrouwbaar-alweer-a4150299' -H 'Content-Type: text/plain' https://translate.manubot.org/web
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 589 100 507 100 82 479 77 0:00:01 0:00:01 --:--:-- 557
[{"key":"ZTT5BNZK","version":0,"itemType":"newspaperArticle","creators":[{"firstName":"Maxim","lastName":"Februari","creatorType":"author"}],"tags":[],"title":"Column | Wikipedia wordt onbetrouwbaar. Alweer","publicationTitle":"NRC","rights":"Copyright Mediahuis NRC BV","url":"https://www.nrc.nl/nieuws/2022/12/03/wikipedia-wordt-onbetrouwbaar-alweer-a4150299","abstractNote":"Column:Maxim Februari","date":"2022-12-03","language":"nl-NL","libraryCatalog":"www.nrc.nl","accessDate":"2023-07-13T15:42:12Z"}]
The output matches the test case at first glance.
However, the output for the classic ClinicalTrials.gov URL does not match its test case:
$ curl -d 'https://classic.clinicaltrials.gov/ct2/show/NCT04292899' -H 'Content-Type: text/plain' https://translate.manubot.org/web
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 599 100 544 100 55 408 41 0:00:01 0:00:01 --:--:-- 449
[{"key":"5KR82YAW","version":0,"itemType":"webpage","creators":[],"tags":[],"title":"Study to Evaluate the Safety and Antiviral Activity of Remdesivir (GS-5734™) in Participants With Severe Coronavirus Disease (COVID-19) - Full Text View - ClinicalTrials.gov","url":"https://clinicaltrials.gov/ct2/show/NCT04292899","abstractNote":"Study to Evaluate the Safety and Antiviral Activity of Remdesivir (GS-5734™) in Participants With Severe Coronavirus Disease (COVID-19) - Full Text View.","language":"en","accessDate":"2023-07-13T15:42:56Z"}]
For example, the creators are empty.
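The same check can be scripted. Below is a rough sketch, using only the Python standard library, that sends the same POST as the curl commands above and flags responses that look like the generic fallback rather than output from the dedicated translator. The helper names and the heuristic itself (itemType of "webpage" with no creators) are my assumptions based on the two responses shown here, not anything defined by Zotero or Manubot.

```python
import json
import urllib.request

# Endpoint taken from the curl commands above.
TRANSLATION_SERVER = "https://translate.manubot.org/web"


def build_request(url, server=TRANSLATION_SERVER):
    """Build the POST request equivalent to `curl -d URL -H 'Content-Type: text/plain'`."""
    return urllib.request.Request(
        server,
        data=url.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def fetch_items(url, server=TRANSLATION_SERVER):
    """Query the translation-server and return the parsed list of Zotero items."""
    with urllib.request.urlopen(build_request(url, server), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def looks_like_generic_fallback(item):
    """Heuristic (an assumption): a bare webpage item with no creators
    suggests the site-specific translator did not run."""
    return item.get("itemType") == "webpage" and not item.get("creators")
```

For example, `[looks_like_generic_fallback(i) for i in fetch_items("https://classic.clinicaltrials.gov/ct2/show/NCT04292899")]` could be rerun after each translation-server update to see whether the result changes.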
ClinicalTrials.gov updated their website (announcement). I'm not sure whether the Zotero citation infrastructure that Manubot relies on will be stable. The classic website will be retired.
Here's an example of the new and classic sites:
I believe we need to update https://github.com/biopragmatics/bioregistry/ and then the Manubot package to update these URLs. Is that correct @dhimmel?