NYUCCL / psiTurk

An open platform for science on Amazon Mechanical Turk.
https://psiturk.org
MIT License

linking data to code used to obtain the data using git hashes #509

Open gureckis opened 3 years ago

gureckis commented 3 years ago

psiturk stores the codeversion variable in a column of the table named by assignments_table_name. However, if the experimenter is using git, a more natural target is the current git hash of the experiment's code base. This would tie the data collected from an experiment to the exact code written to collect it. I have implemented this in other projects using the python-git-info package (https://pypi.org/project/python-git-info/), but there are several ways to pull this off.

I guess the issue would be that not everyone uses git to manage their experiment project folder, but anyone using Heroku now would. It also makes sense to add this feature and encourage it, because the provenance of data is very important. Linking the data from a subject to the code used to run that subject is, well, the gold standard in replicability.

My specific proposal is a major version change, though, because it would add a new metadata column to the table named by assignments_table_name that tracks the git info for the experiment code base. I guess if no git repo is found or anything weird happens (see "the pack of snakes" at https://pypi.org/project/python-git-info/), then it should default to some "unknown" value.
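A minimal sketch of the lookup-with-fallback behaviour described above, assuming the experiment folder is a git working tree and the git executable is on the PATH. It shells out to `git rev-parse HEAD` instead of using python-git-info; the function name and the "unknown" fallback value are illustrative and not part of psiturk.

```python
import subprocess

UNKNOWN = "unknown"  # hypothetical fallback value, not a psiturk constant


def current_git_hash(project_dir="."):
    """Return the commit hash of HEAD for project_dir, or UNKNOWN on any failure."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=project_dir,
            capture_output=True,
            text=True,
            check=True,   # raise CalledProcessError if git exits non-zero
            timeout=5,    # do not let a stuck git call block server startup
        )
    except (OSError, subprocess.SubprocessError):
        # Covers: git not installed, directory missing, not a repository, timeout.
        return UNKNOWN
    return result.stdout.strip() or UNKNOWN


if __name__ == "__main__":
    print(current_git_hash("."))
```

The value could be read once at server start and written next to each assignment row, so a lookup failure never blocks data collection.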

deargle commented 3 years ago

Hrm, very often I push minor bug fixes within a codeversion, and I wouldn't want those to reset the random assignment distribution or the campaign stuff. What if we just added a column for the git hash that didn't also replace the codeversion functionality?

(Sent by email in reply to Todd Gureckis's post of Thu, May 20, 2021, 12:40 AM.)
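One way to picture the suggestion above is a separate git_hash column that sits alongside codeversion, with condition balancing still keyed on codeversion alone. The table here is a simplified, hypothetical stand-in for the assignments table, not psiturk's actual schema, and the hashes are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE assignments (
        uniqueid    TEXT PRIMARY KEY,
        cond        INTEGER,
        codeversion TEXT,
        git_hash    TEXT DEFAULT 'unknown'  -- provenance only
    )
    """
)
rows = [
    ("w1:a1", 0, "1.0", "aaa111"),
    ("w2:a2", 1, "1.0", "aaa111"),
    ("w3:a3", 0, "1.0", "bbb222"),  # same codeversion, collected after a bug-fix commit
]
conn.executemany("INSERT INTO assignments VALUES (?, ?, ?, ?)", rows)


def condition_counts(connection, codeversion):
    """Count participants per condition, grouping by codeversion and ignoring git_hash."""
    cursor = connection.execute(
        "SELECT cond, COUNT(*) FROM assignments "
        "WHERE codeversion = ? GROUP BY cond ORDER BY cond",
        (codeversion,),
    )
    return dict(cursor.fetchall())


counts = condition_counts(conn, "1.0")
print(counts)
```

Because the balancing query never looks at git_hash, pushing a bug fix changes the recorded hash without resetting the counts for codeversion 1.0.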

jacob-lee commented 3 years ago

I agree with Dave.

Also, versioning is the responsibility of the developer or researcher. Introducing and enforcing an opinionated versioning procedure because some people might fail to make good use of versioning just hurts everyone else who does version carefully but does it another way.

Also, imo it is a mistake long term to hardcode integration of a specific versioning system into a code-base like this; these are separate concerns (and imagine being stuck with svn).

jmuchovej commented 2 years ago

Could a hybrid of these two be used? (e.g., track the git sha and codeversion?)

While I learned (painfully) in my first use of psiTurk that codeversion matters, I think having the git sha would have been quite helpful since I could use it as a fallback.

Another alternative (albeit "very advanced" – read as: I highly doubt many will use this, myself included) could be to use the git tagging system, git tag <codeversion-equivalent> as a way to track versions independent of manual updates to config.txt.
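The tag-based alternative could look roughly like the sketch below, which asks git for the nearest tag via `git describe --tags --always --dirty` and otherwise falls back to a manually maintained value. The function name and the fallback argument are hypothetical; psiturk itself reads codeversion from config.txt.

```python
import subprocess


def codeversion_from_tag(project_dir=".", fallback="unknown"):
    """Describe HEAD relative to the nearest git tag, or return fallback on failure.

    --always falls back to an abbreviated commit hash when no tag exists;
    --dirty appends "-dirty" when the working tree has uncommitted changes.
    """
    try:
        result = subprocess.run(
            ["git", "describe", "--tags", "--always", "--dirty"],
            cwd=project_dir,
            capture_output=True,
            text=True,
            check=True,
            timeout=5,
        )
    except (OSError, subprocess.SubprocessError):
        # git missing, directory missing, not a repository, or timeout
        return fallback
    return result.stdout.strip() or fallback


if __name__ == "__main__":
    # "1.0" stands in for whatever codeversion config.txt currently holds.
    print(codeversion_from_tag(".", fallback="1.0"))
```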

jacob-lee commented 2 years ago

I see a couple issues.

We want the psiturk version recorded. But that shouldn't be in the config file. Perhaps it's recorded in some of the JSON somewhere...

We want the task version recorded. But should that be in the psiturk config? I'm not so sure it should be. Like any other package, it should have its own version number info and requirements specs (including, crucially, the version of psiturk).

More generally, tasks are painfully entangled with the psiturk app itself. The tasks should be importable as a python module, or something similar.

These are serious issues that have caused many headaches for me, managing experiments over multiple years.

(Sent by email in reply to John Muchovej's comment of Tue, Mar 1, 2022, 12:07 PM.)
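As a rough illustration of recording the two versions separately, the sketch below reads the installed psiturk version from package metadata and takes the task version from the task itself. The record layout, function names, and the example task version are assumptions for illustration, not an existing psiturk feature.

```python
from importlib import metadata


def installed_version(package, default="unknown"):
    """Return the installed version of a package, or default if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return default


def provenance_record(task_version, git_hash="unknown"):
    """Bundle framework and task versions so they can be stored with the data."""
    return {
        "psiturk_version": installed_version("psiturk"),  # version of the framework
        "task_version": task_version,                     # version owned by the task
        "git_hash": git_hash,                             # optional extra provenance
    }


if __name__ == "__main__":
    # "0.3.0" is a made-up task version.
    print(provenance_record(task_version="0.3.0"))
```

Keeping the task version out of the shared config file means a task can declare it the way any other package does, for example as a module-level version string.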

jmuchovej commented 2 years ago

Hmm... this seems to be different from what I was referring to.

Are you saying that every codeversion should have an analog of conda/pip freeze? That would be cool, but intuitively psiTurk doesn't feel like "the right tool" for that. (e.g., Docker or Singularity seem better suited as "freezers".) Maybe psiTurk manages/enforces that? Though... I could see how that could turn into an adoption/usage nightmare, given the primary users of psiTurk (grad students, per @deargle's mention in discussions: https://github.com/NYUCCL/psiTurk/discussions/542#discussioncomment-2270270).

I think doing a conda/pip freeze equivalent would be cool, but it seems beside the point that @gureckis was making?

Also, I'm curious what you mean by:

> The tasks should be importable as a python module, or something similar.

I think I do something similar to what you're referring to. I have experiments/{{ codeversion }}/task.js and the like (I actually store a config.txt for each, but have to manually symlink this myself). Getting this setup to work nicely required a few custom_code.route(...)s and passing the codeversion around to any $.ajax(...) calls. I think I only found it "easy" to debug because I had ascended the jQuery learning curve and have a pretty lengthy web background. I'm not sure it's safe to assume many users would have the requisite knowledge for this. 😕 (Let alone the time/patience to acquire it.)
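A sketch of the lookup that a per-codeversion layout like this implies: given a codeversion, find the matching file under experiments/ and refuse anything that escapes that folder. The helper name is hypothetical, and wiring it into a custom route is left out so the example stays free of web-framework dependencies.

```python
from pathlib import Path


def versioned_task_file(experiments_dir, codeversion, filename="task.js"):
    """Return the path experiments_dir/<codeversion>/<filename>.

    Raises ValueError if the resolved path falls outside experiments_dir
    (e.g. a codeversion of "../.."), and FileNotFoundError if no such file exists.
    """
    base = Path(experiments_dir).resolve()
    candidate = (base / codeversion / filename).resolve()
    if base not in candidate.parents:
        raise ValueError(f"{codeversion!r} escapes {base}")
    if not candidate.is_file():
        raise FileNotFoundError(candidate)
    return candidate
```

A route handler could call this with the codeversion sent by the client and serve the returned file, which keeps each version's task.js side by side on disk.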

jacob-lee commented 2 years ago

A task will consist of a set of templates, resource files, some JavaScript (e.g. task.js), and custom routes, etc. In some cases they may depend on a set of database tables not provided by psiturk.

Psiturk itself has a number of templates, routes, js etc independent of a particular task. It is the framework intended to make running web experiments easier.

A custom task may go through several iterations. To differentiate the data output of one version from another, one should use some sort of text string to identify the versions (for example, using a semantic versioning scheme). That code version string needs to get embedded in the data somehow. Currently there is a non-task-specific config file that, along with database secrets, specifies the code version that gets associated with the data. It is misplaced; in fact, when it comes to condition balancing, it is treated as if it were a study descriptor!

Study. Psiturk version. Task version.

If I'm developing a psiturk task derived from the psiturk example, the code base for that task includes, and must include, elements of the psiturk application itself. They are entangled. The only way to disentangle them currently is to write the code separately and use scripts that copy the task-specific files into the appropriate directories.

Instead, task development ought to be clearly distinct from the application, and its requirements (e.g. a specific version of psiturk) should be clearly specified, e.g. in a requirements.txt file.

The problem with using a git hash or some such for versioning is that a hash difference doesn't tell you how different the code is. You can't distinguish removing a space from a file from a complete rewrite by how different the hashes are. Fundamentally it's not semantic. That's why we use version numbers, use releases, and keep a change log.
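To make the point concrete, the sketch below hashes three made-up snippets with SHA-1 and counts how many hex digits differ, then contrasts that with reading the size of a change off a semantic version string. The hashes are plain SHA-1 digests of the text, not real git object ids, and the helper names are illustrative.

```python
import hashlib


def sha1_hex(text):
    """SHA-1 hex digest of a string (plain digest, without git's object header)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def differing_digits(a, b):
    """Number of positions at which two equal-length hex digests differ."""
    return sum(x != y for x, y in zip(a, b))


def bump_kind(old, new):
    """Classify a MAJOR.MINOR.PATCH change as 'major', 'minor', 'patch', or 'none'."""
    old_parts = [int(p) for p in old.split(".")]
    new_parts = [int(p) for p in new.split(".")]
    for label, a, b in zip(("major", "minor", "patch"), old_parts, new_parts):
        if a != b:
            return label
    return "none"


original = "function run() { start(); }\n"
whitespace_edit = "function run() { start();  }\n"       # one extra space
rewrite = "class Experiment { constructor() {} }\n"      # entirely different code

small_change = differing_digits(sha1_hex(original), sha1_hex(whitespace_edit))
big_change = differing_digits(sha1_hex(original), sha1_hex(rewrite))
print(small_change, big_change)        # both are large and say nothing about edit size
print(bump_kind("1.2.3", "1.2.4"))     # the version string states the size of the change
print(bump_kind("1.2.3", "2.0.0"))
```

Both digit counts come out as a large fraction of the 40 hex digits, so neither reveals which edit was trivial; the version string carries that information only because a person chose it.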

In the early days of psiturk, there was the promise of plug-and-play tasks. What I quickly learned was that there was no dependency management. Tasks that ran on one version of psiturk didn't run on later ones.

(Sent by email in reply to John Muchovej's comment of Tue, Mar 1, 2022, 2:01 PM.)

jmuchovej commented 2 years ago

Ahh. I see now. Hmm... so, that seems like it would require psiTurk to be more opinionated than it currently is to achieve that. Much like how Ruby on Rails forces particular application structures.

That seems sensible, but also presents a bit of a problem with non-MTurk crowdsourcing (e.g., Prolific). I'm not sure that there's a way to prevent experiment deployment without a version update on any platform except for those with an API. 😕

Also, re: psiTurk versioning: that seems like something a pip freeze, or better yet its conda equivalent (conda env export), should do, no? (I prefer conda here explicitly because it's a dependency manager; older versions of pip, at least, aren't.) I know that psiTurk doesn't have a conda package of late, but I've been looking into getting that back up and running.

I guess I'm at a loss why a psiTurk version would/should change over the course of a research paper/project, if the researcher is doing proper dependency management. (Lol, big assumption here, but I think it makes sense, unless psiTurk runs into a critical bug, but that shouldn't be a patch update, I think?)

I definitely understand specifying psiTurk versions to encourage reproducibility, but I'm not sure it makes sense to do snapshots at a task level (strictly on the basis that tasks are part of projects). Allowing for different psiTurk versions across tasks seems redundant and not in a good/useful way. (Though I'm still fresh to the land of psiTurk, so I probably haven't run into the scenario(s) to justify this.)

deargle commented 2 years ago

> I guess I'm at a loss why a psiTurk version would/should change over the course of a research paper/project

My psiturk version often changes, usually because I'm developing and releasing new features to make data collection easier. Like dashboard stuff. In hindsight, it might have been better to make the dashboard a psiturk plugin.

> unless psiTurk runs into a critical bug, but that shouldn't be a patch update, I think?

Huge mistakes have been made with semver (by me!). Won't happen again, fingers crossed.

> specifying psiTurk versions to encourage reproducibility

Backwards compatibility for running old psiturk tasks has gotten somewhat easier with later releases, but there's still the problem of later versions adding the mode column to the database. psiTurk does not gracefully handle that column being missing, although I did write a psiturk.db.migrate_db function, callable via the cli, which could in theory add that column if it is missing.
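For readers unfamiliar with this kind of migration, here is a generic sketch of adding a column only when it is absent, using SQLite. It is not the implementation of psiturk.db.migrate_db; the table name, column type, and helper name are assumptions.

```python
import sqlite3


def ensure_column(conn, table, column, decl="TEXT"):
    """Add column to table if it is not already there; return True if it was added.

    table and column are interpolated into SQL, so pass only trusted names.
    """
    existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    if column in existing:
        return False
    conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}" {decl}')
    conn.commit()
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Stand-in for a table created by an older release, before "mode" existed.
    conn.execute("CREATE TABLE assignments (uniqueid TEXT, codeversion TEXT)")
    print(ensure_column(conn, "assignments", "mode"))  # adds the column
    print(ensure_column(conn, "assignments", "mode"))  # second call is a no-op
```

Rows written before the migration end up with NULL in the new column, so downstream code still has to decide how to treat a missing mode.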

jacob-lee commented 2 years ago

Sometimes studies last years. Sometimes we come back and want to run the same experiment from five years ago, with a few tweaks. In the meantime, the world has changed. The world has switched from Python 2 to 3; security bugs have been found and patched, etc. In some cases the old version of psiturk won't even work any more (e.g. because it assumed the existence of the psiturk server, or because of outdated SSL libraries). So you need to, and should, use updated versions of psiturk. But that's not always easy (e.g. the mode column issue Dave spoke about). You end up having to update task code (and sometimes you end up messing something up along the way). The cleaner the separation between what a user has to develop and psiturk itself, the easier that is to manage.

In general, again, I'd like a much cleaner separation between the psiturk application and the tasks people develop, with a standard way of importing them in. Ends up not being that opinionated, really; being able to develop your own routes, javascript, etc. already affords a tremendous degree of freedom.

jacob-lee commented 2 years ago

ad.html is (or was) a good example of the problem. Every study has to modify the template, but it contains javascript code necessary for psiturk to function.

deargle commented 2 years ago

Oh my heck, ad.html, which we had to change the filename of (to pub.html) and css classes within (from .ad to .not-an-ad) so that it wouldn't trigger ad blockers.

edit: although strictly speaking, ad.html is only necessary if running on mturk. Not a hard requirement for psiturk if running in lab (anything besides 'live' or 'sandbox') modes.