I'm just going to do this myself then.
I just wanted to outline my use cases as you're about to touch this, and I'm a real user :)
Local sequence
This is a variant of local battery, as I don't want to randomise the order of my tasks (#134). Using the shell script in #132 to work around this issue, my studies run smoothly, with the slight problem that I have to ^C the script to move to the next experiment/survey. This isn't an issue in the lab.
Online sequence
I'm about to start developing this use case. The limitations of local sequence will become an issue here for obvious reasons. I evaluated Docker, and it's a little over-specified for my requirements. It looks fairly simple to push JSON results to a web app, and this is the approach I plan to investigate when porting my local sequence studies to run unattended online.
There's also some glue code to write to neatly organise task data by participant. I know that writing the JSON to a database would be neatest, but I expect I'll start by writing CSV files (experiments) and JSON files (surveys) to a per-participant directory. This will allow me to use the same analysis code (R) that I use to analyse locally collected data.
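To make that concrete, here is a minimal sketch in Python/Flask of what such glue could look like: a tiny endpoint that accepts posted JSON results and files them under a per-participant directory. This is purely illustrative and not expfactory code; the `participant_id`/`exp_id` fields, the `data/` layout, and the filenames are all my assumptions.

```python
# Minimal sketch (not expfactory code): accept posted JSON results and
# file them under a per-participant directory. Field names, the data/
# layout, and the filename scheme are assumptions for this example.
import json
from pathlib import Path

from flask import Flask, request, jsonify

app = Flask(__name__)
DATA_DIR = Path("data")

@app.route("/results", methods=["POST"])
def save_results():
    payload = request.get_json(force=True)
    participant = str(payload.get("participant_id", "unknown"))
    task = payload.get("exp_id", "task")

    # One directory per participant, e.g. data/7/stroop.json
    outdir = DATA_DIR / participant
    outdir.mkdir(parents=True, exist_ok=True)
    with open(outdir / f"{task}.json", "w") as fh:
        json.dump(payload, fh, indent=2)

    return jsonify({"status": "ok"})

if __name__ == "__main__":
    app.run(port=5000)
```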
I'd be very grateful if you're able to incorporate online sequence into your work, and I'm happy to help. I will be working on a solution regardless, with the aim of having something working by December 2017-January 2018.
hey @earcanal! Docker (or Singularity) would mean that you wouldn't need to think about these installation/dependency details - you would just run one command and the software/databases would be "frozen" in a sense, for you to just get up and running. I work on quite a few open source projects so I can't make promises with regard to time, but I've definitely started on this and will put together some examples when they are ready!
hey @earcanal - how / when do you input the participant ID for both online and local use cases? If you had to serve your own MySQL and just configure it with the local / online application, would that work for you?
Awesome question! I don't, so I have to manually convert the auto-generated UIDs to participant numbers as part of my manual (horrible, proprietary) file renaming process! Almost anything would be better than this.
As an experimenter, I would like each study to automatically generate a participant_id (without me having to do anything), starting from 1 and auto-incremented for each new participant. I would like to specify a study_id (string) when configuring the study. I would like participant_id and study_id columns in the data tables (database or files) for all study objects (experiments and tasks).
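A minimal sketch of what that could look like, using SQLite for brevity (MySQL, as mentioned above, would be analogous); the table layout and every name other than participant_id and study_id are assumptions for illustration, not an expfactory design.

```python
# Sketch of the requested schema (illustrative, not expfactory code):
# an auto-incrementing participant_id plus a study_id string that is
# stamped onto every row of result data.
import sqlite3

conn = sqlite3.connect("study.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS participants (
    participant_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- starts at 1
    study_id       TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS results (
    result_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    participant_id INTEGER NOT NULL REFERENCES participants(participant_id),
    study_id       TEXT NOT NULL,
    exp_id         TEXT NOT NULL,   -- experiment or survey name
    data           TEXT NOT NULL    -- raw JSON blob
);
""")

# Registering a new participant hands back the next sequential ID.
cur = conn.execute("INSERT INTO participants (study_id) VALUES (?)", ("my-study",))
print("new participant_id:", cur.lastrowid)
conn.commit()
```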
Thanks, this is great! And yes, that's exactly what we used to do back in the days when I was an RA in a psychology lab. I'll make some default study_id that you can edit when you build your battery (container).
User Story Mapping and Writing Effective Use Cases are both great books :)
EDIT
The following was written anticipating a collaboration, but that doesn't seem to have worked out, so I'm rolling this on my own using more reproducible methods.
@teonbrooks this is mostly for you! I'm going to put some notes here for you to see when you get back, and for discussion.
Right now expfactory has two use cases. The first is the Docker (online client) use case, which is pretty much useless for anyone but Poldracklab, because we can't safely store others' data, nor can we store credentials. The other use case (which likely does have users) is for local labs to run things themselves, meaning generating an experiment on the fly, or generating a static battery to serve. With these goals in mind, I want to first propose the following functionality for expfactory-python:
Experiments
What is an experiment?
An experiment is a GitHub repo. It is static, meaning that it can be run if you start a web browser in the present working directory, and its exp_id corresponds to its organization/folder name. We can think of the organization/repo name as akin to a registry in Docker. If the user doesn't specify one, the "default" is assumed to be `expfactory-experiments`, so asking for `stroop` is really the same as asking for `expfactory-experiments/stroop`. Thus, to ask for a fork of that (@teonbrooks), I would ask for `teonbrooks/stroop`.
How do we summarize experiments?
Given that each folder is an experiment, we simply have one base folder under the expfactory organization (likely `expfactory/experiments`) that is cloned and contains all metadata for official experiments. We will want this to happen in an automated fashion: each experiment has its own repo, and once it passes testing, a PR is automatically sent to the metadata folder to update the "current version", add a branch, etc. The metadata folder would be a lot like the Docker standard library - a central place to look up where the actual experiments are, their versions, etc.
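As a purely hypothetical illustration of what one entry in that metadata folder might record (none of these field names come from expfactory, and the version value is made up), it could be as simple as:

```python
# Hypothetical metadata entry for one official experiment.
# Field names and values are illustrative only.
stroop_entry = {
    "exp_id": "stroop",
    "organization": "expfactory-experiments",            # default "registry"
    "repo": "https://github.com/expfactory-experiments/stroop",
    "current_version": "v1.0",                            # bumped by automated PRs
    "known_forks": ["teonbrooks/stroop"],
}
```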
How does a user interact with experiments?
The workflow will be similar to now:
However, instead of generating a temporary folder on the fly, what we would want to do is caching. A user will have an `expfactory` folder in their `$HOME` where experiments can be cached, and this essentially means GitHub repos. For example, the stroop experiment would be stored under that base (see the sketch below). This is an organizational approach similar to GoLang, so that if I wanted to work on the code for teonbrooks/stroop, I would do that via this base. This means that, for running experiments, the `exp_id` now becomes both the user/organization name AND the experiment name. It also means that we have to ask a little bit more of the user at run time to generate / preview an experiment, if they don't want the default (expfactory) version. Then, we would have commands in the expfactory client that make it act just like a package manager, but for GitHub and experiments.

So this is the first part of the proposal: `$HOME/expfactory` as a local cache. Thoughts?

I will put my other suggested modifications in different issues, to be specific.
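A minimal sketch of how that cache could be laid out and resolved, assuming a GoLang-style `<organization>/<exp_id>` hierarchy under `$HOME/expfactory` (the exact layout and the helper below are my guess at the idea, not settled expfactory behaviour):

```python
# Assumed cache layout (illustrative only):
#
#   $HOME/expfactory/
#       expfactory-experiments/
#           stroop/        <- clone of the default (expfactory) version
#       teonbrooks/
#           stroop/        <- a fork, referenced as teonbrooks/stroop
#
# Resolving an exp_id against that cache might then look like:
from pathlib import Path

def cache_path(exp_id: str, home: Path = Path.home()) -> Path:
    """Map 'stroop' or 'teonbrooks/stroop' to a folder in the local cache."""
    org, _, name = exp_id.rpartition("/")
    org = org or "expfactory-experiments"   # the assumed default organization
    return home / "expfactory" / org / name

print(cache_path("stroop"))             # -> ~/expfactory/expfactory-experiments/stroop
print(cache_path("teonbrooks/stroop"))  # -> ~/expfactory/teonbrooks/stroop
```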