Version each survey independently

Watts-Lab / surveys

Library of surveys for deliberation experiments

MIT License


Version each survey independently #102

Closed JamesPHoughton closed 1 year ago

JamesPHoughton commented 1 year ago

As a data consumer, I want to know that all of the data generated by MySpecialSurvey is commensurable (i.e., was generated by the same version of the survey). Currently, we assign a version to the survey repo as a whole, but not to individual surveys. This means that if we make changes to MyDumbSurvey, the version number also changes for MySpecialSurvey.

Currently, the survey and scoring function in deliberation-empirica produce results that look like:

{
    "surveySource": "@watts-lab/surveys",
    "surveyVersion": "1.5.2",
    "surveyName": "listeningQualityPartner",
    "responses": {
        "tryToUnderstand": 9,
        "askedQuestions": 9,
        "encouragedClarification": 9,
        "expressedInterest": 9,
        "listenedAttentively": 9,
        "paidAttention": 9,
        "gaveSpace": 9,
        "undividedAttention": 9,
        "positiveAtmosphere": 9,
        "allowedExpression": 9
    },
    "result": {
        "rawScore": 90,
        "normScore": 1
    },
    "secondsElapsed": 0,
    "playerId": "01GKM7G4RJPEKMR209R8QTGW69",
    "gameId": "01GKM1SWT851YM0YMRYT2C2PD2",
    "exitStep": "noExitStep"
}

where surveyVersion is the repo version.

Instead, we want to have a unique version (or hash) for each unique survey that only changes when that survey itself changes - specifically the survey specification MySpecialSurvey.json and the scoring function MySpecialSurvey.score.js.

We may be able to just add something to the SurveyFactory function that hashes the surveyJson and the score.js function and adds the hashes to this data object (here: https://github.com/Watts-Lab/surveys/blob/main/src/surveyFactory.jsx#L43).

Otherwise, we can use a pre-commit hook that checks for changes to those files (or just hashes the files) and saves that version identifier in such a way that it can be included in the data dump.

markwhiting commented 1 year ago

Cool.

Another thought is to use the blob URL for those files (blob URLs change only when the file changes, not on new commits to the repo). However, it would mean that there is no version information in the survey file itself, so external users might have a harder time integrating version information. Also, blob URLs can be a little finicky with private repos, so if we ever had a private survey that we wanted to do this with, we might need a different policy there.
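For reference, Git itself identifies file content by a blob hash that changes only when the content changes, which is the property described above. Below is a minimal sketch of computing that identifier in Node, assuming Git's default SHA-1 object format. The helper name `gitBlobHash` is hypothetical and not part of the surveys repo, and whether this is the identifier that ends up in a given GitHub URL is not confirmed here.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: reproduces what `git hash-object <file>` prints
// for a repository using Git's default SHA-1 object format.
function gitBlobHash(content) {
  const body = Buffer.isBuffer(content) ? content : Buffer.from(content, "utf8");
  // Git hashes the header "blob <byte length>\0" followed by the raw bytes.
  const header = Buffer.from(`blob ${body.length}\0`, "utf8");
  return createHash("sha1").update(header).update(body).digest("hex");
}
```

Because the hash depends only on the file's bytes, it can be recomputed offline from a checked-out copy of the survey file.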

JamesPHoughton commented 1 year ago

That's a good point: we don't only want to know that different data objects come from the same survey object, we also want to be able to trace (and for a human to read) which version of the survey the data came from. Right now, you can sort of use the package version to look up the code that was used to generate the survey. That lets you do the back-tracing and interpretation, although it isn't super straightforward and it doesn't let you group data by the version of the individual survey.

We could store both - the package version, and a hash of the individual survey implementation - to meet the survey grouping and documentation requirements. It may make it easier if we make an official GitHub "release" for each package version, so that you can easily pull up the code by version (instead of by commit).
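As a rough illustration of the grouping requirement, here is a sketch that buckets result records by survey name plus a per-survey hash. `surveyName` and `surveyVersion` appear in the existing output; `surveyHash` is a hypothetical field name for the per-survey hash, and `groupBySurveyImplementation` is a made-up helper.

```javascript
// Group result records by survey name plus per-survey hash, so each
// group holds only data produced by one implementation of one survey.
// `surveyHash` is a hypothetical field, not part of the current output.
function groupBySurveyImplementation(records) {
  const groups = new Map();
  for (const record of records) {
    const key = `${record.surveyName}@${record.surveyHash}`;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(record);
  }
  return groups;
}
```

Records from different package versions land in the same group as long as the survey's own hash is unchanged, while `surveyVersion` stays on each record for looking up the code.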

JamesPHoughton commented 1 year ago

Blob URL has the advantage that it is unique to the file version and takes you right to the file, which is super convenient. At the moment, though, I don't know how we'd get the blob URL to include in the packaged library. I'm sure there must be a way for the GH action that builds the library for npmjs to get it and include it somewhere. Might be tricky to debug, though...

What I was thinking about earlier was to hash the survey json object after it has been read into the SurveyFactory. There are a few problems I'm recognizing in how we would implement that approach:

- This happens at run-time (client side), and so would be run every time data was saved.
- JS object key order isn't canonical, so functionally isomorphic survey definitions (same content, different key order) could yield different hashes.
- It probably adds a runtime dependency, as JS has no built-in hash function (?!)
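One possible way around the key-order problem is to serialize the object in a canonical form before hashing. The sketch below sorts object keys recursively and then hashes the result with SHA-256. The helper names `canonicalize` and `hashSurveyJson` are hypothetical, and this is only one of several reasonable canonicalization choices.

```javascript
const { createHash } = require("node:crypto");

// Recursively rebuild a value with object keys in sorted order, so that
// two definitions differing only in key order serialize identically.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize); // array order is meaningful
  if (value !== null && typeof value === "object") {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

// Hypothetical helper: SHA-256 of the canonical serialization.
function hashSurveyJson(surveyJson) {
  return createHash("sha256")
    .update(JSON.stringify(canonicalize(surveyJson)))
    .digest("hex");
}
```

Node's built-in `crypto` module provides SHA-256, so no extra dependency is needed if this runs at build time. In browsers, the Web Crypto API (`crypto.subtle.digest`) offers the same digest asynchronously.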

Another way to do it would be to hash the .json file and .score.js file during the pre-build stage for rollup. Save these in a separate .json file in each survey directory, and import the hashes in index.js and pass them along to the surveyFactory.

- This adds an additional file to each folder, and complexity to the build script and index imports.
- It means that if you have the hash, you can search the git history directly with git log -S "hashvalue" -p to find the exact version of the survey, without necessarily needing to know the package version number.
- The hashing library can be a dev dependency (probably?) and hashes can be generated/checked offline, especially if we use a standard hashing library.
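The pre-build approach could look roughly like the sketch below, which hashes the two files for one survey and writes the result next to them. It assumes the `<name>.json` and `<name>.score.js` naming described earlier in the thread. The output file name `<name>.hashes.json`, the field names, and the helper `writeSurveyHashes` are all hypothetical, not the repo's actual layout.

```javascript
const { createHash } = require("node:crypto");
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical pre-build step: hash the survey spec and scoring file,
// then save both hashes next to them for index.js to import.
function writeSurveyHashes(surveyDir, surveyName) {
  const sha256 = (file) =>
    createHash("sha256")
      .update(fs.readFileSync(path.join(surveyDir, file)))
      .digest("hex");

  const hashes = {
    surveyHash: sha256(`${surveyName}.json`),
    scoreHash: sha256(`${surveyName}.score.js`),
  };
  // The output file name is an assumption, not the repo's actual layout.
  const outFile = path.join(surveyDir, `${surveyName}.hashes.json`);
  fs.writeFileSync(outFile, JSON.stringify(hashes, null, 2) + "\n");
  return hashes;
}
```

Hashing raw file bytes means whitespace-only edits change the hash, which is one reason consistent formatting matters for this approach.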

markwhiting commented 1 year ago

OK.

Idea: what if we have a test that checks that the version number (committed in code) is a particular hash of a particular file, and then offer a simple script to get that hash and write it in on each survey manifest file.

So like the repo has a script called make_survey_version.sh and you need to run that before committing, or a test will fail if you've made any changes to a survey.

I am not sure if that could be run by an action or a pre-commit hook, but if so that would be even better.

I think this does a few things that are slightly nicer than adding another file: it ensures that the version information is literally in the survey at all times, and it ensures that nothing can ever be committed that does not have accurate version numbers.

One shortcoming is that you basically want to take the hash when the version information is not present in the file, so that the hash doesn't depend on itself. Though we could also chain hashes, i.e., include the previous version's number when computing the next version's number. Either way, I think a script could work around this, e.g., by just removing that line from the JSON before calculating the hash.
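The "remove the version before hashing" workaround can be sketched as follows, operating on the parsed manifest rather than on raw lines. The key name `version` and the three helper names are hypothetical. This is an illustration of the idea, not the script or test the repo uses.

```javascript
const { createHash } = require("node:crypto");

// Hash a survey manifest with its own version field removed, so the
// stored hash never depends on itself. `versionKey` is a hypothetical name.
function computeSurveyVersion(manifest, versionKey = "version") {
  const { [versionKey]: _ignored, ...rest } = manifest;
  return createHash("sha256").update(JSON.stringify(rest)).digest("hex");
}

// Roughly what a make_survey_version script would do: write the hash in.
function stampSurveyVersion(manifest, versionKey = "version") {
  return { ...manifest, [versionKey]: computeSurveyVersion(manifest, versionKey) };
}

// Roughly what the test would check: the committed version matches the content.
function isSurveyVersionCurrent(manifest, versionKey = "version") {
  return manifest[versionKey] === computeSurveyVersion(manifest, versionKey);
}
```

Because this hashes the parsed object, whitespace changes in the file do not affect the result, although key order still does.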

markwhiting commented 1 year ago

After looking at the code (I probably should have done that before commenting), I think we get just about everything from the solution you've implemented.

JamesPHoughton commented 1 year ago

what if we have a test that checks that the version number (committed in code) is a particular hash of a particular file, and then offer a simple script to get that hash and write it in on each survey manifest file.

I like the idea of a script checking that the hash isn't out of date in the repo. The build script that packages the library for npmjs will compute new hashes, so we know that the hashes are correct in the library as it's used in experiments. But it would be good to have confidence that the hash is up to date all the time in the commit history, so that if you try to identify a particular version from the hash, you can't have errors. Maybe this is a good thing to check with a pre-commit hook. I'll have to look into adding that. Looks like husky is the way to go.

There is also an argument for manually versioning each survey (in addition to including a hash), which is that you could use semantic versioning for each change. A change to the Major version would mean that you shouldn't interpret the survey in the same way (a breaking change). A Minor version bump could be something that the survey author believes shouldn't change the interpretation of the survey (e.g., a formatting change in the survey display), but that you still want to document. A Patch version could be something that doesn't change anything about the way the survey is presented to the user, but may change how it is written, e.g., a dev comment or stylistic changes to the file (such as changing line spacing in the JSON file) that give a different hash but make no change to how the survey is perceived.

In the long run, I don't know how frequently that sort of thing would be used, especially if we enforce the use of "prettier" to get consistent code styling.
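To make the semantic-versioning idea concrete, here is a small sketch that classifies the difference between two per-survey versions and treats only a Major change as breaking comparability. The helper names are hypothetical, and the commensurability rule is just the interpretation described above, not something the library enforces.

```javascript
// Parse "MAJOR.MINOR.PATCH" into numbers; throws on anything else.
function parseVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not a MAJOR.MINOR.PATCH version: ${version}`);
  return match.slice(1).map(Number);
}

// Name the most significant component that differs between two versions.
function changeLevel(a, b) {
  const [aMajor, aMinor, aPatch] = parseVersion(a);
  const [bMajor, bMinor, bPatch] = parseVersion(b);
  if (aMajor !== bMajor) return "major";
  if (aMinor !== bMinor) return "minor";
  if (aPatch !== bPatch) return "patch";
  return "none";
}

// Under the scheme described above, only a Major change breaks comparability.
function isCommensurable(a, b) {
  return changeLevel(a, b) !== "major";
}
```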

markwhiting commented 1 year ago

Interesting. I think for safety, it might be better if we assume all changes are non-breaking and, if we need to, flag versions earlier than some version as wrong. That is, results compiled from any version of the survey and its scoring function should be commensurable. If, say, a newer version of RME comes out that adds another dimension of measure, instead of making RME v2, I'd vote to make a totally new thing, e.g., a new survey named RME_V2, so that we don't have the threat of it being considered commensurable.

In any case, all this sounds really nice!