Closed cessda-bitbucket-importer closed 1 year ago
Original comment by Stefan Dlugolinsky (GitHub: Stifo).
The following MySQL code generates uri, uri_sl and canonical_uri for all published versions in which any of these values is null. For each such version, it takes the uri, uri_sl and canonical_uri of its previous version and replaces the version number in them to generate the new values. The code should be executed on the MySQL server:
SET @SQL_SAFE_UPDATES=@@SQL_SAFE_UPDATES;
SET SQL_SAFE_UPDATES = 0;
update version as dst
    left join version as src on
        dst.previous_version = src.id
set
    dst.canonical_uri = case
        when dst.canonical_uri is null then regexp_replace(src.canonical_uri, regexp_replace(src.number, '([0-9]+)\.([0-9]+)(\.[0-9]+)?', '$1\\\\.$2(\\\\.[0-9]+)?'), dst.number)
        else dst.canonical_uri
    end,
    dst.uri = case
        when dst.uri is null then regexp_replace(src.uri, regexp_replace(src.number, '([0-9]+)\.([0-9]+)(\.[0-9]+)?', '$1\\\\.$2(\\\\.[0-9]+)?'), dst.number)
        else dst.uri
    end,
    dst.uri_sl = case
        when dst.uri_sl is null then regexp_replace(src.uri_sl, regexp_replace(src.number, '([0-9]+)\.([0-9]+)(\.[0-9]+)?', '$1\\\\.$2(\\\\.[0-9]+)?'), dst.number)
        else dst.uri_sl
    end
where
    dst.status = 'PUBLISHED'
    and (
        dst.canonical_uri is null
        or
        dst.uri is null
        or
        dst.uri_sl is null
    )
;
SET SQL_SAFE_UPDATES = @SQL_SAFE_UPDATES;
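For readers who want to check the substitution logic outside the database, here is a minimal Python sketch of what the nested regexp_replace calls appear to do: build a pattern from the previous version's number (major.minor with an optional patch) and replace every match in the previous URI with the new number. The function names and the example URI are hypothetical and not taken from the CVS code base, and the sketch validates the whole number with a full match, which is slightly stricter than the SQL.

```python
import re
from typing import Optional


def version_pattern(number: str) -> str:
    """Turn '2.1' or '2.1.0' into a regex for major.minor with an optional patch."""
    m = re.fullmatch(r"([0-9]+)\.([0-9]+)(\.[0-9]+)?", number)
    if m is None:
        raise ValueError(f"unexpected version number: {number!r}")
    # Escape the dot between major and minor; allow any (or no) patch part.
    return rf"{m.group(1)}\.{m.group(2)}(\.[0-9]+)?"


def rewrite_uri(src_uri: Optional[str], src_number: str, dst_number: str) -> Optional[str]:
    """Replace the previous version number in src_uri with dst_number."""
    if src_uri is None:
        # Mirrors SQL behaviour: a NULL source value stays NULL.
        return None
    # A function replacement keeps dst_number literal (no backreference parsing).
    return re.sub(version_pattern(src_number), lambda _m: dst_number, src_uri)


# Hypothetical URI, for illustration only.
print(rewrite_uri("https://example.org/vocabulary/Foo/2.1.0/en", "2.1.0", "2.1.1"))
```

Note that the pattern is not anchored, in the SQL or in this sketch, so it would also match the same digits inside a longer number elsewhere in the URI.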
There are some other issues, probably in the frontend, which alter the version number before displaying it:
Original comment by Martin Šeleng (GitHub: pakoselo).
First, I want to ask @Joshocan to execute the script on dev and staging (thanks for that) so that @dolinarm can test it. As for the issue described by @Stifo, I had already noticed the problem with displaying the wrong version(s) and have started working on it.
Original comment by Martin Šeleng (GitHub: pakoselo).
@dolinarm @Stifo I have repaired the numbering in the title in the sections “CVs search” and “Editor CVs search”, but the citation, canonical URI and URN are generated when the CVs are published by an SL admin, so there is some discrepancy. If a TL admin later creates a new TL version and the SL admin then publishes it, the URI(s) and URN(s) of the already published SL and TL(s) are not updated; only the corresponding version numbers are (the patch number, the last of the three digits, is incremented by 1). I am not sure whether they should be updated, because they were not published as new versions; they were just updated because of the new TL version. So, for now, I am leaving it as it is.
Original comment by Joshua Tetteh Ocansey (GitHub: Joshocan).
thanks @pakoselo Script deployed. @dolinarm check and test for functionality in dev and staging.
Original comment by Maja Dolinar.
@pakoselo @Joshocan I tested this on dev and staging and it is working fine, so the problem is solved there. Please move this into production as well.
Original comment by Joshua Tetteh Ocansey (GitHub: Joshocan).
@dolinarm @john-shepherdson Need to plan for incremental releases for CVS
Original comment by Martin Šeleng (GitHub: pakoselo).
@Joshocan Can you please provide us (myself and @Stifo) with the latest production database dump so that we can test it once again locally? Also, as you suggest, we need to plan the incremental release, also without the migrate button in the maintenance section.
Original comment by Maja Dolinar.
The Canonical URIs are now showing on staging, however, the versioning is not right. It does not correspond to whatever is selected (language or version) and is very confusing.
I had a discussion with Taina about versioning and it should be like this: If later some of the TL admin(s) create a new TL version and next the SL admin publish the new TL version, the URI(s) and URN(s) for already published SL and TL(s) SHOULD BE updated as well since their version number changes (a newly published package includes all the language variants that have up-to-date translations to SL content - a language variant is dropped from a newly published CV version only if the SL has changed and they have not updated for that. If only one TL has changed and SL remains the same, it means all other TLs are still valid and are included in a newly published package).
@MajaDolinar I took a look at it and found a logic problem that was not discussed before: if everything is published, and then a new TL is created, reviewed, set ready to be published and a bundle is published (no change in the SL, just in that particular TL), then the version number of the SL and the already published TLs is simply updated and the previous version is lost. Technically it is still there and the content remains, but its version number has been overwritten. This breaks the sequence of successive patch versions; e.g. an SL version 2.1.0 is overwritten by 2.1.1 and 2.1.0 no longer exists. Similarly, 2.1.1 can be overwritten by 2.1.2, after which 2.1.1 no longer exists, and so on, so the successive versions that remain could be, e.g., 2.0.5 and 2.1.9. I suggest cloning the already published SL and TLs in this case and updating the version number for them as well as the URIs. I already have code for this, but I need to test it a little more.
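The difference between overwriting and cloning can be illustrated with a small Python sketch. It is not the code mentioned in the comment: the `Version` class, its fields and the example URI are hypothetical, and it assumes a three-part `major.minor.patch` number. The point is only that a clone keeps the old record intact and links back to it, so no patch version disappears.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Version:
    # Hypothetical, simplified stand-in for a published SL or TL version.
    language: str
    number: str  # assumed "major.minor.patch"
    uri: str
    previous: Optional["Version"] = None


def bump_patch(number: str) -> str:
    """Increment the last of the three version parts by one."""
    major, minor, patch = (int(part) for part in number.split("."))
    return f"{major}.{minor}.{patch + 1}"


def clone_with_bump(version: Version) -> Version:
    """Return a new record with the next patch number; the old one is untouched."""
    new_number = bump_patch(version.number)
    return replace(
        version,
        number=new_number,
        uri=version.uri.replace(version.number, new_number),
        previous=version,  # keeps the chain 2.1.0 -> 2.1.1 -> ... traceable
    )


old = Version("en", "2.1.0", "https://example.org/vocabulary/Foo/2.1.0")
new = clone_with_bump(old)
print(old.number, "->", new.number)
```

With this approach both 2.1.0 and 2.1.1 remain retrievable, at the cost of storing one extra record per language for each republished bundle.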
Fixed by PR #526.
@Stifo sorry it took so long to answer this; I was trying to figure this one out and had another discussion with Taina. Here are the clarifications: currently, the canonical URIs show the version number of the base SL version, but this should be changed.
New system:
Previously:
The canonical URI is formed from whatever is entered in the Agency information in the element ‘Canonical Uri’, with the version number added to the end. The version number added should be the version number of the whole package. Right now on staging, the front-end displays of published vocabularies do not show one and the same version number everywhere (canonical URN, citations, downloads, address line at the top of the page etc.) across all languages. There should be only one version number everywhere.
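As a rough illustration of the rule described above, the following Python sketch appends a single package version to the agency's ‘Canonical Uri’ value and reuses the result for every language variant. The function names and the example base value are hypothetical, and plain concatenation is an assumption based only on the phrase "added to the end".

```python
from typing import Dict, Iterable


def canonical_uri(agency_base: str, package_version: str) -> str:
    """Append the package version to the agency's 'Canonical Uri' value."""
    # Assumption: plain concatenation, with no separator added here.
    return f"{agency_base}{package_version}"


def canonical_uris_for_package(
    agency_base: str, package_version: str, languages: Iterable[str]
) -> Dict[str, str]:
    """Every language variant in the package gets the same versioned URI."""
    uri = canonical_uri(agency_base, package_version)
    return {language: uri for language in languages}


# Hypothetical base value, for illustration only.
print(canonical_uris_for_package("urn:example:agency:Foo:", "2.1.1", ["en", "de", "sl"]))
```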
Thanks @MajaDolinar, I did it as described. However, we still need to regenerate the citations and update the version numbers in them. Could you please create a separate issue for that? The canonical URI and ‘Available from’ now show the correct versions. I'm not sure when the fix will appear on dev/staging after the recent migration to GitHub.
FYI: uri and uri_sl are still present in the DB and code.
Original report on BitBucket by Maja Dolinar.
If you go to the CV “CDC Publisher Names” at https://vocabularies.cessda.eu/vocabulary/CdcPublisherNames?lang=en and open the tab ‘License and Citation’, there is a mistake in the ‘Available from’ section. See the picture below:
I tested this for other CVs (DDI, CESSDA) and the issue is present everywhere.
In the tab ‘Identity and general’, canonical URIs are missing everywhere.
The issue was reported from EOSC Helpdesk: https://eosc-helpdesk.eosc-portal.eu/#ticket/zoom/2197