Open ekraffmiller opened 1 month ago
There are multiple options for how the response could be formatted:
Option 1. A JSON list with two objects, each containing only the modified fields, e.g. [{"id": "versionAid", "subject": "version A subject", "subtitle": ""}, {"id": "versionBid", "subject": "New subject", "subtitle": "new subtitle"}]
Option 2. A JSON response with before and after values per field, e.g. {"subject": {"versionAid": "version A subject", "versionBid": "New subject"}, "subtitle": {"versionAid": "", "versionBid": "new subtitle"}}
I'm sure there could be more options. @ekraffmiller could you let me know what format would make the most sense for the SPA code?
FWIW: I think the output of the DatasetVersionDifference class is closer to option 2. That format is also closer to how we display the differences in the version table on the dataset page.
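To make the difference between the two options concrete, here is a minimal TypeScript sketch that pivots an Option 1 style list into the Option 2 shape. The type and function names (`VersionFields`, `FieldDiff`, `pivotToFieldDiff`) are hypothetical and only for illustration; nothing here reflects actual Dataverse or SPA code.

```typescript
// Option 1: one object per version, holding only the modified fields.
// (Hypothetical type name, for illustration only.)
type VersionFields = { id: string } & Record<string, string>;

// Option 2: one entry per field, mapping version id -> value.
type FieldDiff = Record<string, Record<string, string>>;

// Pivot the per-version list (Option 1) into a per-field map (Option 2).
function pivotToFieldDiff(versions: VersionFields[]): FieldDiff {
  const diff: FieldDiff = {};
  for (const version of versions) {
    for (const [field, value] of Object.entries(version)) {
      if (field === "id") continue; // the id is the key, not a compared field
      const perVersion = diff[field] ?? (diff[field] = {});
      perVersion[version.id] = value;
    }
  }
  return diff;
}
```

Either shape carries the same information, so the choice mostly depends on whether the consumer iterates by version or by field.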
@ekraffmiller Here is the JSON-formatted output that I believe will work well in a table in the UI. Please let me know if this works or if changes are needed.
{
"status": "OK",
"data": {
"Metadata": {
"Author": {
"0": "Finch, Fiona; (Birds Inc.)",
"1": "Finch, Fiona; (Birds Inc.); Poe, Edgar Allen; (Baltimore Poets); Mulligan, Hercules; (Sons of Liberty)"
},
"Subject": {
"0": "Medicine, Health and Life Sciences",
"1": "Medicine, Health and Life Sciences; Astronomy and Astrophysics; Other"
},
"Producer": {
"0": "",
"1": "Allen, Irwin; (MGM); Spielberg, Stephen; (ILM)"
},
"Design Type": {
"0": "",
"1": "Parallel Group Design; Nested Case Control Design"
}
},
"Files": {
"added": [
{
"description": "",
"label": "dataverseproject.png",
"restricted": false,
"version": 1,
"datasetVersionId": 4,
"dataFile": {
"id": 11,
"persistentId": "",
"filename": "dataverseproject.png",
"contentType": "image/png",
"friendlyType": "PNG Image",
"filesize": 12918,
"description": "",
"storageIdentifier": "local://19296b38e55-71601b050f3d",
"rootDataFileId": -1,
"md5": "e55e66ff785045154875c4b6841eb527",
"checksum": {
"type": "MD5",
"value": "e55e66ff785045154875c4b6841eb527"
},
"tabularData": false,
"creationDate": "2024-10-16",
"fileAccessRequest": true
}
}
],
"removed": [
{
"description": "",
"label": "dataverseproject_logo.jpg",
"restricted": false,
"version": 1,
"datasetVersionId": 3,
"dataFile": {
"id": 10,
"persistentId": "",
"filename": "dataverseproject_logo.jpg",
"contentType": "image/jpeg",
"friendlyType": "JPEG Image",
"filesize": 4462,
"description": "",
"storageIdentifier": "local://19296b371ed-ea4ec196219e",
"rootDataFileId": -1,
"md5": "c1edbefa86a55c5037873370ae7fd7b6",
"checksum": {
"type": "MD5",
"value": "c1edbefa86a55c5037873370ae7fd7b6"
},
"tabularData": false,
"creationDate": "2024-10-16",
"publicationDate": "2024-10-16",
"fileAccessRequest": true
}
}
],
"modified": [
{
"fileMetadata": {
"description": "",
"label": "dataverse-icon-1200.png",
"restricted": false,
"version": 1,
"datasetVersionId": 3,
"dataFile": {
"id": 9,
"persistentId": "",
"filename": "dataverse-icon-1200.png",
"contentType": "image/png",
"friendlyType": "PNG Image",
"filesize": 27650,
"description": "",
"storageIdentifier": "local://19296b370c7-b90cd887fd36",
"rootDataFileId": -1,
"md5": "a23eb44803d9127bc6e055f77b869816",
"checksum": {
"type": "MD5",
"value": "a23eb44803d9127bc6e055f77b869816"
},
"tabularData": false,
"creationDate": "2024-10-16",
"publicationDate": "2024-10-16",
"fileAccessRequest": true
}
},
"isRestricted": {
"0": "false",
"1": "true"
}
}
]
},
"TermsOfAccess": {
"Data Access Place": {
"0": "",
"1": "Somewhere"
}
}
}
}
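As a rough sketch of how a UI table could consume this shape, the TypeScript below flattens one section of the response (such as "Metadata" or "TermsOfAccess") into rows. It assumes that key "0" holds the older version's value and key "1" the newer one, which the example suggests but does not state. The names `BeforeAfter`, `DiffRow`, and `toRows` are hypothetical.

```typescript
// Assumption: "0" is the older version's value and "1" the newer one.
type BeforeAfter = { "0": string; "1": string };

// One table row per changed field (hypothetical shape, for illustration).
interface DiffRow {
  section: string;
  field: string;
  before: string;
  after: string;
}

// Flatten a section such as "Metadata" into rows a table can render.
function toRows(section: string, fields: Record<string, BeforeAfter>): DiffRow[] {
  return Object.entries(fields).map(([field, values]) => ({
    section,
    field,
    before: values["0"],
    after: values["1"],
  }));
}
```

The "Files" section has a different structure (added, removed, modified lists), so it would need its own handling.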
Thanks @stevenwinship, I will review the SPA requirements today.
Hi @stevenwinship, sorry for the late reply. For the Compare Version Details popup, we will need the changes grouped by metadata block. It would also be more flexible in the UI to have the changed values in an array (for "multiple" type fields).
Here is an example:
{
"oldVersion": {
"versionNumber": "1.0",
"createdDate": "2023-01-15T08:00:00Z"
},
"newVersion": {
"versionNumber": "1.1",
"createdDate": "2024-01-20T08:00:00Z"
},
"metadataChanges": [
{
"blockName": "citation",
"changed": [
{
"fieldName": "title",
"oldValue": ["Initial Dataset Title"],
"newValue": ["Updated Dataset Title"]
},
{
"fieldName": "author",
"oldValue": ["John Doe"],
"newValue": ["John Doe", "Jane Smith"]
}
]
},
{
"blockName": "socialscience",
"changed": [
{
"fieldName": "studyDesignType",
"oldValue": ["design type 1","design type 2"],
"newValue": ["design type 1a", "design type 1b", "design type 1c"]
}
]
}
],
"fileChanges": [
{
"fileName": "data.csv",
"changes": [
{
"fieldName": "filePath",
"oldValue": "/oldpath/data_v1.csv",
"newValue": "/newpath/data_v2.csv"
}
]
},
{
"fileName": "readme.txt",
"changes": [
{
"fieldName": "description",
"oldValue": "Basic dataset info",
"newValue": "Updated dataset info with more details"
}
]
}
]
}
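One way to see why array values are more flexible for "multiple" type fields: with arrays, the UI can work out which individual values were added or removed without parsing a delimited string. The sketch below is only an illustration of that idea; `ArrayFieldChange` and `valueDelta` are hypothetical names, not part of any existing client code.

```typescript
// Shape of one entry in a "changed" list, as proposed in the example above.
interface ArrayFieldChange {
  fieldName: string;
  oldValue: string[];
  newValue: string[];
}

// Work out which individual values were added or removed within one field.
function valueDelta(change: ArrayFieldChange): { added: string[]; removed: string[] } {
  const oldSet = new Set(change.oldValue);
  const newSet = new Set(change.newValue);
  return {
    added: change.newValue.filter((value) => !oldSet.has(value)),
    removed: change.oldValue.filter((value) => !newSet.has(value)),
  };
}

// Usage with the "author" change from the example.
const authorDelta = valueDelta({
  fieldName: "author",
  oldValue: ["John Doe"],
  newValue: ["John Doe", "Jane Smith"],
});
console.log(authorDelta); // { added: [ 'Jane Smith' ], removed: [] }
```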
Sorry, I realized there is some file information missing from the JSON example I sent you; here is an updated example. I have added fields to the file elements and also included a "filesReplaced" array. Other changes:
{
"oldVersion": {
"versionNumber": "1.0",
"lastUpdatedDate": "2023-01-15T08:00:00Z"
},
"newVersion": {
"versionNumber": "1.1",
"lastUpdatedDate": "2024-01-20T08:00:00Z"
},
"metadataChanges": [
{
"blockName": "citation",
"changed": [
{
"fieldName": "title",
"oldValue": ["Initial Dataset Title"],
"newValue": ["Updated Dataset Title"]
},
{
"fieldName": "author",
"oldValue": ["John Doe"],
"newValue": ["John Doe", "Jane Smith"]
}
]
},
{
"blockName": "socialscience",
"changed": [
{
"fieldName": "studyDesignType",
"oldValue": ["design type 1", "design type 2"],
"newValue": ["design type 1a", "design type 1b", "design type 1c"]
}
]
}
],
"filesAdded": [
{
"fileName": "teacher_survey.tab",
"md5": "1234567890",
"type": "Tab-Delimited",
"fileId": 3,
"tags": ["Documentation"],
"description": "my file description",
"isRestricted": false
},
{
"fileName": "biomedical.json",
"md5": "1234567890",
"type": "JSON",
"fileId": 4,
"tags": ["Documentation", "Data"],
"description": "my json file description",
"isRestricted": true
}
],
"filesReplaced": [
{
"oldFile": {
"fileName": "teacher_survey.tab",
"md5": "1234567890",
"type": "Tab-Delimited",
"fileId": 3,
"tags": ["Documentation", "Data"],
"description": "my json file description",
"isRestricted": false
},
"newFile": {
"fileName": "biomedical.json",
"md5": "1234567890",
"type": "JSON",
"fileId": 4,
"tags": ["Documentation", "Data"],
"description": "my json file description",
"isRestricted": true
}
},
{
"oldFile": {
"fileName": "test1.json",
"md5": "1234567890",
"type": "JSON",
"fileId": 3,
"isRestricted": false
},
"newFile": {
"fileName": "test2.json",
"md5": "1234567890",
"type": "JSON",
"fileId": 4,
"isRestricted": true
}
}
],
"filesChanged": [
{
"fileName": "data.csv",
"md5": "1234567890",
"fileId": 1,
"changes": [
{
"fieldName": "filePath",
"oldValue": "/oldpath/data_v1.csv",
"newValue": "/newpath/data_v2.csv"
}
]
},
{
"fileName": "readme.txt",
"md5": "1234567890",
"fileId": 2,
"changes": [
{
"fieldName": "description",
"oldValue": "Basic dataset info",
"newValue": "Updated dataset info with more details"
}
]
}
],
"TermsOfAccess": {
"changed": [
{
"fieldName": "dataAccessPlace",
"oldValue": "",
"newValue": "Somewhere"
}
]
}
}
Here is an example of the latest JSON format:
{
"status": "OK",
"data": {
"oldVersion": {
"versionNumber": "1.0",
"lastUpdatedDate": "2024-10-24T15:17:11Z"
},
"newVersion": {
"versionNumber": "DRAFT",
"lastUpdatedDate": "2024-10-24T15:17:16Z"
},
"metadataChanges": [
{
"blockName": "Citation Metadata",
"changed": [
{
"fieldName": "Author",
"oldValue": "Finch, Fiona; (Birds Inc.)",
"newValue": "Finch, Fiona; (Birds Inc.); Poe, Edgar Allen; (Baltimore Poets); Mulligan, Hercules; (Sons of Liberty)"
},
{
"fieldName": "Subject",
"oldValue": "Medicine, Health and Life Sciences",
"newValue": "Medicine, Health and Life Sciences; Astronomy and Astrophysics; Other"
},
{
"fieldName": "Producer",
"oldValue": "",
"newValue": "Allen, Irwin; (MGM); Spielberg, Stephen; (ILM)"
}
]
},
{
"blockName": "Life Sciences Metadata",
"changed": [
{
"fieldName": "Design Type",
"oldValue": "",
"newValue": "Parallel Group Design; Nested Case Control Design"
}
]
}
],
"filesAdded": [
{
"fileName": "test.tab",
"filePath": "data/subdir1",
"MD5": "77c7f03a7d7772907b43f0b322cef723",
"type": "text/tab-separated-values",
"fileId": 42,
"description": "my description",
"isRestricted": false,
"categories": [
"Data"
],
"tags": [
"Survey"
]
}
],
"filesRemoved": [
{
"fileName": "dataverseproject_logo.jpg",
"filePath": "data/subdir1",
"MD5": "c1edbefa86a55c5037873370ae7fd7b6",
"type": "image/jpeg",
"fileId": 40,
"description": "my description",
"isRestricted": false,
"categories": [
"Data"
]
}
],
"filesReplaced": [
{
"oldFile": {
"fileName": "favicon-16x16.png",
"filePath": "data/subdir1",
"MD5": "d3c852e7ecb92fd105ba4018116a9be8",
"type": "image/png",
"fileId": 41,
"description": "my description",
"isRestricted": false,
"categories": [
"Data"
]
},
"newFile": {
"fileName": "favicon-32x32.png",
"filePath": "data/subdir1",
"MD5": "c931f7add8b6a1f9a691046b77c231fa",
"type": "image/png",
"fileId": 43,
"description": "my description",
"isRestricted": false,
"categories": [
"Data"
]
}
}
],
"fileChanges": [
{
"fileName": "dataverse-icon-1200.png",
"MD5": "a23eb44803d9127bc6e055f77b869816",
"fileId": 39,
"changed": [
{
"fieldName": "isRestricted",
"oldValue": "false",
"newValue": "true"
}
]
}
],
"TermsOfAccess": {
"changed": [
{
"fieldName": "Data Access Place",
"oldValue": "",
"newValue": "Somewhere"
}
]
}
}
}
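For the SPA side, the "data" object above could be typed roughly as follows. This is a sketch inferred from this single example, not a confirmed contract: the interface names are hypothetical, and every section is marked optional because the thread does not say which keys are always present. The `summarize` helper just counts changes per section.

```typescript
// Types inferred from the example response; names are hypothetical.
interface FieldChange { fieldName: string; oldValue: string; newValue: string }
interface VersionInfo { versionNumber: string; lastUpdatedDate: string }
interface FileSummary {
  fileName: string;
  fileId: number;
  filePath?: string;
  MD5?: string;
  type?: string;
  description?: string;
  isRestricted?: boolean;
  categories?: string[];
  tags?: string[];
}
interface VersionDiff {
  oldVersion: VersionInfo;
  newVersion: VersionInfo;
  // Assumption: any section may be absent when it has no changes.
  metadataChanges?: { blockName: string; changed: FieldChange[] }[];
  filesAdded?: FileSummary[];
  filesRemoved?: FileSummary[];
  filesReplaced?: { oldFile: FileSummary; newFile: FileSummary }[];
  fileChanges?: { fileName: string; fileId: number; MD5?: string; changed: FieldChange[] }[];
  TermsOfAccess?: { changed: FieldChange[] };
}
interface ChangeCounts {
  metadataFields: number;
  filesAdded: number;
  filesRemoved: number;
  filesReplaced: number;
  filesChanged: number;
  termsFields: number;
}

// Count the changes in each section, treating missing sections as empty.
function summarize(diff: VersionDiff): ChangeCounts {
  const metadataFields = (diff.metadataChanges ?? [])
    .reduce((total, block) => total + block.changed.length, 0);
  return {
    metadataFields,
    filesAdded: diff.filesAdded?.length ?? 0,
    filesRemoved: diff.filesRemoved?.length ?? 0,
    filesReplaced: diff.filesReplaced?.length ?? 0,
    filesChanged: diff.fileChanges?.length ?? 0,
    termsFields: diff.TermsOfAccess?.changed.length ?? 0,
  };
}
```

Applied to the example response, this would report four changed metadata fields (three in "Citation Metadata", one in "Life Sciences Metadata") and one entry in each of the file sections and in "TermsOfAccess".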
Overview of the Feature Request: Need an API endpoint that compares two dataset versions and returns a list of the differences between them. This is needed to support the SPA Dataset Page.
What kind of user is the feature intended for? (Example users roles: API User, Curator, Depositor, Guest, Superuser, Sysadmin): API User
What inspired the request?: https://github.com/IQSS/dataverse-client-javascript/issues/197 https://github.com/IQSS/dataverse-frontend/issues/511
What existing behavior do you want changed?: None
Any brand new behavior do you want to add to Dataverse?: New Dataverse API endpoint
Any open or closed issues related to this feature request?
Are you thinking about creating a pull request for this feature?
Help is always welcome, is this feature something you or your organization plan to implement?