FelixHenninger / lab.js

Online research made easy
https://lab.js.org/

reproducible experiments? #9

Open vsoch opened 6 years ago

vsoch commented 6 years ago

hey @FelixHenninger ! I stumbled on your page for labjs and it's great! I am finishing up a project that was started years ago, the Experiment Factory, and its focus isn't around making experiments, but deploying them in a reproducible way (meaning linux containers). A lot of the experiments are from jspsych, or phaser (a game framework) or just simple form --> submit types, and all are simple html/css/javascript. I don't know a lot about labjs, but it seems like the experiments are created and hosted from your server? I am wondering if there is a way to "freeze" or "export" an experiment, then we could add them to the Experiment Factory library. If the user wanted hosting (and not the container) it would link back to you, and if not, they could use the traditional expfactory way of generating the experiment in the container (and then being able to give the container to others to reproduce exactly, cite in papers, etc). Depending on the way labjs works, we could also do some kind of reverse import - meaning an experiment factory library experiment could be imported into labjs. I can offer to help on both these fronts.

Let me know your thoughts! Looking forward to chatting. And your UI is beautiful by the way! I love the marigold yellow.

FelixHenninger commented 6 years ago

Hi Vanessa,

thank you so much for reaching out! I've seen your project (and your work in general) and love it; in fact, I think The Experiment Factory would combine really well with lab.js. We've tried not to reinvent the wheel regarding server-side data collection; in fact we are already fairly close to what you're describing: Studies can be exported to zip archives that include all study code and (optionally) a PHP backend for data storage on our users' servers. We don't provide a default data collection framework (or, for that matter, any server-side logic in the builder UI; the study preview is served by a service worker on the client side).

Integration with expfactory would be absolutely terrific! We've been in touch with @TeonBrooks, and we have an experimental export to his expfactory-server package (the export options can be accessed in the UI from the drop-down menu next to the save button). I've read (and very much enjoyed) your articles, but I haven't setup or used expfactory myself -- in principle though, it should be absolutely no problem to create an export option that includes, e.g. a Dockerfile and/or a config.json and a folder structure for static media. If you could walk me through the necessary steps, I would be thrilled to do the work on our side (or of course, if you'd like to contribute, please be very warmly invited!). I looked at the expfactory docs a while back, and I'm especially unsure about how data are collected (e.g. is there a standard HTTP API?), that's something I would love to learn more about.

If you can spare a moment, I'd love to chat, or if there is an example I could copy, I would be happy to see how far I can get.

Thanks again for getting in touch, and all the best,

-Felix

PS: Thanks for your super-kind words regarding the UI! Bootstrap does the heavy lifting for us; the yellow was just an attempt to make the standard JS yellow nicer. I never made the association with marigold, but it fits perfectly; I guess we now have an official project flower 🌼!

vsoch commented 6 years ago

hey @FelixHenninger woohoo! You can get an example running quickly with Docker (are you familiar with Docker?) Try any of the following:

docker run -p 80:80 vanessa/expfactory-experiments start
docker run -p 80:80 vanessa/expfactory-games start
docker run -p 80:80 vanessa/expfactory-surveys start

The infrastructure provided with the examples above is ready to go with a containerized web application (meaning you don't need to install dependencies on your host, other than have Docker), a full nginx web server to go along to serve the flask application with gunicorn. And importantly, a ton of experiments / games / surveys! The idea with the commands above is that I (some researcher) have generated a docker container with my particular experiments. I use the container to do my own work, then publish and provide others with the container so they can reproduce it. The generation of the container looks something like the following:

  1. There is a container with the expfactory software that is used to list available experiments in the library, which is just a static repository pointing to individual experiment repositories, and the repository itself tests the contributions and then automatically produces the web interface you see and an API to access it programmatically. For example, try:
docker run vanessa/expfactory-builder list

and you'll see them!

  2. With this builder image, a user can then generate a custom Dockerfile, and the Dockerfile is the build recipe for the container (an encapsulated environment with all dependencies and experiments!). Generating the Dockerfile for the container is just one line of code to select experiments from the library:
docker run -v $PWD:/data vanessa/expfactory-builder build test-task tower-of-london

and then you build it, which is usually just:

docker build -t vanessa/my-awesome-experiments .

And if you use Github, you can just have an automated build from the repository with the Dockerfile, and don't need to do that.

For the above three examples, that is how a second researcher would then run / use the containers on Docker hub. We map port 80 so that the container web server appears on the local machine. Depending on the database type, the user might give other arguments to start the container, or map a folder to the host that has some saved data or logs.

> Integration with expfactory would be absolutely terrific! We've been in touch with @teonbrooks, and we have an experimental export to his expfactory-server package (the export options can be accessed in the UI from the drop-down menu next to the save button).

Teon actually arrived just as I was leaving, and had a different project (OpenEXP) that I think he wanted to expand upon, but we never managed to combine the two, so I can't comment on expfactory-server, other than it looks like it's a bit of code to start a nodejs web server. I think @teonbrooks has some things in the works (I can't comment) and has been super busy with Mozilla (congrats again Teon!)

> I've read your articles, but I haven't setup or used expfactory myself --

Haha, no worries, this is actually probably a good thing because it would be confusing. The expfactory paper talks about a different Dockerized web application that my old lab ultimately wasn't able to open up to the public.

> in principle though, it should be absolutely no problem to create an export option that includes, e.g. a Dockerfile and/or a config.json and a folder structure for static media. If you could walk me through the necessary steps, I would be thrilled to do the work on our side (or of course, if you'd like to contribute, please be very warmly invited!).

Woohoo! You don't need a Dockerfile or anything, the experiments that plug into expfactory are just static html/css with a config.json and an index.html. The tweak that would need to be done is how they save data and move to the next endpoint: whatever data (json object) you have in the browser should be POSTed to the endpoint /save when the experiment finishes (expfactory takes care of csrf and all that), and the page then navigates to /next on a successful POST. If it's the case that you have a main index.html that must conform to some other standard, then we can figure out how to transform the two. It could be as simple as a replacement, or even having a separate expfactory.html that Expfactory looks for first (instead of index.html). If you have an experiment export I could look at, this would be a good start!
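
The save-then-advance flow described above could be sketched roughly as follows. This is only an illustration under assumptions: `saveThenNext` and its injected `post` and `navigate` parameters are hypothetical names, and the JSON body shape is a guess rather than something expfactory prescribes.

```javascript
// Hypothetical helper: POST the results to /save, then move on to /next.
// `post` and `navigate` are injected so the flow can be exercised outside a browser.
async function saveThenNext(data, post, navigate) {
  const response = await post('/save', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' }, // assumed content type
    body: JSON.stringify({ data: data }),
  });
  // Only advance when the server reports success (a 2xx status).
  if (response && response.ok) {
    navigate('/next');
    return true;
  }
  return false;
}
```

In a browser this might be called as `saveThenNext(results, (url, opts) => fetch(url, opts), (url) => { window.location = url; })`.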

For what an "Expfactory experiment" is, take a look at any of the experiments here --> to get a sense - they are super simple! Some are traditional surveys that POST a form when finished, others are more substantial phaser (webGL) games, and the third set are jspsych that maintain a data structure throughout the experiment (and again POST) I tried to make it as lenient as possible in terms of "what the experiment has to look like" to support any kind of web based thing. What I haven't developed yet (that I intend to when users ask for it) is how to customize experiments / variables on build. For reproducibility, it's important that most customization happens at this time (and not runtime) so paradigms can be reproduced exactly as intended by the creator.

I wrote up a super detailed walkthrough of steps to add an experiment --> https://expfactory.github.io/expfactory/contribute#contribute-an-experiment. If you have a static experiment repository that I can do a PR to, I would love to give it a first go! It would come down to:

That's pretty much it, once tests pass and I merge, it plugs in automatically to all the things I mentioned above.

> I looked at the expfactory docs a while back, and I'm especially unsure about how data are collected (e.g. is there a standard HTTP API?), that's something I would love to learn more about.

Results are saved with just a simple POST to the server, and participant sessions maintained via the browser. On the back end, the particulars of the database (or just saving to flat files) are determined by the user, with the simplest option as default (e.g., save to local filesystem with default study id of "expfactory"). I am working on another branch to add more substantial databases, and might even have some "quick" deployment options to cloud servers / places, but likely I'd want to get user feedback first.

I think that's the most of it? If you want to find a repository with an example experiment, I can do a PR to generate a config.json, and then make a proposal for how to go about the main index.html.

FelixHenninger commented 6 years ago

Wow, thanks so much @vsoch for taking the time to respond in such detail! I had been looking through your new documentation (which is awesome), and together with your super-helpful explanations this shouldn't be hard at all! I'll give it a go tomorrow (it's getting late around here and I have to stop myself), excited to make this happen.

I take it that the POST endpoint writes directly to a file, so I'd probably best send CSV-formatted data in the data field? Is there something that should happen if the POST isn't successful? (A retry, I guess?)

We (that is, our awesome RAs) have been building a set of example studies, and hopefully there are a few that aren't already in your repository -- I'd be more than happy to contribute to your collection!

Thanks again, and the very best regards halfway around the world!

vsoch commented 6 years ago

hey @FelixHenninger ! The post can take whatever you give it, I would say it's good practice to put an unknown data structure under a consistent "known", for example let's say I have serialized csv:

csv_data="FIELD1,FIELD2\narg1,arg2\n"

It would be good practice (at least for this simple strategy) to post it like this:

$.ajax({
    type: "POST",
    url: '/save',
    data: { "data": csv_data },
    success: function(){ document.location = "/next" },
    dataType: "application/json",
    error: function(err) { jsPsych.data.localSave('antisaccade_results.csv', 'csv')}
});

but my preference (because I like python where the JSON maps to a dictionary) and because a lot of this data isn't relational, is json:

json_data = {"field1": "arg1", "field2": "arg2"}
$.ajax({
    type: "POST",
    url: '/save',
    data: { "data": json_data },
    success: function(){ document.location = "/next" },
    dataType: "application/json",
    error: function(err) { jsPsych.data.localSave('antisaccade_results.csv', 'csv')}
});

and you will notice that given success, I navigate to /next. I think for now a retry isn't necessary - it's still too early in development to implement scaling. But I like that you are thinking about this! I can tell you with experience in making web application type things that use APIs, retrying (with exponential backoff) is really important.
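
For reference, retrying with exponential backoff could look something like the sketch below. It is not part of expfactory or lab.js; `backoffDelay` and `postWithRetry` are hypothetical names, and the attempt count and base delay are arbitrary assumptions.

```javascript
// Delay before retry number `attempt` (0-based): base, 2*base, 4*base, ...
function backoffDelay(attempt, baseMs) {
  return baseMs * Math.pow(2, attempt);
}

// Hypothetical helper: call `send` until it resolves to a response with ok === true,
// waiting longer after each failure. `sleep` is injectable so tests need not wait.
async function postWithRetry(
  send,
  maxAttempts = 4,
  baseMs = 500,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let lastError = new Error('no attempts made');
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const response = await send();
      if (response && response.ok) return response;
      lastError = new Error('HTTP status ' + (response && response.status));
    } catch (err) {
      lastError = err; // network-level failure
    }
    // Wait before the next attempt, but not after the final one.
    if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt, baseMs));
  }
  throw lastError;
}
```

A caller would pass something like `() => fetch('/save', options)` as `send`, and fall back to a local save if the returned promise rejects.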

Given any error, what I'm doing is then saving a local result file (and you can do this however you like, I am using csv for jspsych). It's a silly strategy, but it ensures that a static thing will run and output a result if it's served on Github pages (without a server) or plugged into one. There are also a lot of labs that don't want to use a server, but just want to go to the static github pages experiment and run it, and save to the browser.

The POST (eventually) writes to a file (for a filesystem save) or to a database field as a string (for sqlite, postgres, or mysql) and either way the content is provided as is. Well, with some added checks that you aren't trying to do anything malicious - and I'm going to work a bit more on that too.

Awesome!! No worries about the time difference, it's a weekend! I'll be around and look forward to helping with some experiments (and I'd love to demo the RAs thus far!). It will be awesome to have a nice registry for these things, and an easy plug in to lab.js to make new experiments.

FelixHenninger commented 6 years ago

Hej Vanessa,

just a quick heads-up: I'm working on this, and almost there (it took a few more steps because I wanted to also capture the metadata you collect in config.json, which will also help with implementing Teon's great study metadata initiative) -- I can test all individual parts in isolation (e.g. POSTing data, etc.), but I'd like to see the whole thing running; is there a way to generate an experiments container from local studies (as opposed to those in the central repo)? (if I'm missing something, or if there's a better place I could start investigating, please do let me know, I love your excellent explanations, but I wouldn't want to keep you from more important things)

-Felix

vsoch commented 6 years ago

hey @FelixHenninger ! I think the original config.json captured most of the metadata, with some great additions by Teon for instructions that I've added to the Experiment Factory too. To answer your question, yes! The install command can be a folder, a config.json or a github repository. I just didn't properly write any sort of documentation. I just created an issue and will write it up promptly - I'm debugging some Nvidia / docker / singularity container stuffs at the moment. Stay tuned!

FelixHenninger commented 6 years ago

Ah, perfect, I'll figure it out from there. The docker-based mechanism is just fantastic -- thanks so much!

vsoch commented 6 years ago

okay, the new builder is pushed, code is ready, as are the docs! Give it a go!

https://expfactory.github.io/expfactory/generate#local-experiment-selection

Let me know if something isn't clear! Hopefully the walkthrough is clear enough, but I go back and forth and change things, so I may not have articulated every point well.

vsoch commented 6 years ago

and just a heads up I'm going to go for a run, so likely back in a few hours! Post here if any issues arise and I can help.

vsoch commented 6 years ago

hey @FelixHenninger ! Just wanted to check in to see if you needed any help? We just added ordering to the portal, and I'm working on a PR that will allow for a "headless" installation for users to login before taking the experiments. If you need any help with the experiment testing please don't hesitate to ask for it!

FelixHenninger commented 6 years ago

Hej Vanessa,

thanks a lot for checking in! This is entirely my bad -- I've been swamped with work over the past 10 days, and have been sitting on commits. There's some repo housekeeping/rebasing that needs to be done, but I'll push as quickly as I get around to it, hopefully over the next days, in the worst case it will take another week.

Sorry for the delay, I haven't forgotten this!

-Felix

FelixHenninger commented 6 years ago

Hi Vanessa,

thanks so much for your patience with this issue! I've been working on this on and off over the last two weeks, and export for expfactory is now included in the builder interface. The integration has been super-nice -- the new expfactory is a real joy to use!

If you'd like to try it, you can select the corresponding export option in the toolbar's i/o dropdown. Here's a basic stroop task you can test it with; if you'd rather skip the build step, I've exported the same study for expfactory, so the file structure corresponds to that of your other studies.

So far, the integration works in that the studies can be included in experiment images via the command line, as you describe (thanks a lot for the pointers!). However, I haven't been able to get the data storage to work (even though it works fine for the included studies): When pushing the data to /save, the server responds with an error code 400, Bad Request. I'm guessing that this is in part because I'm using fetch instead of jQuery's $.ajax, but I haven't been able to figure out the root cause.

A minimal example that replicates the issue is the following:

fetch('/save', {
  method: 'post',
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    data: 'foo',
  })
}).then(
  response => console.log(response)
)

To trigger data transmission in a lab.js study without running through an entire study, you can manually run study.options.datastore.transmit('/save').then(response => console.log(response)), which uses very similar code.

Looking at the corresponding view in the expfactory code, it seems that you only return the response codes 200 or 403 -- as I'm seeing 400, it seems like the error might be coming from flask rather than your code?

I'd love to get this to work -- if you have any hints or ideas, I would really appreciate a pointer. Thanks again for your super-generous support!

All the very best,

-Felix

vsoch commented 6 years ago

I'm glad you hit this bug, because I was having trouble with it! I was having trouble with how to respond (from the server) using flask to then trigger the correct part of the ajax logic. It seemed to, no matter what i returned, call the success function, so I did a check for a status code in both blocks. I am finishing up some things but will take a look at what you shared later today - really excited to see this integration!

FelixHenninger commented 6 years ago

Wow, thanks for your instant response! I'm glad I'm not only being a burden here :-) .

With regard to the success handler being called regardless of the response, I gather you've also been working with fetch for data transmission? Because that's something I (also?) grappled with: It took me a while to figure out that in the fetch API, the success handler is called (the promise resolves) not based on the error code, but on the network transmission being completed. For expfactory, I've been using the following logic to move to the next page:

fetch('/save', {
  /* ... options as above ... */
}).then((response) => {
  if (response && response.ok) {
    window.location = '/next'
  }
})

Along the same lines, I've been wondering what to do if this process fails, apart from retrying -- the jspsych-based studies currently offer the data csv for download, but I'm not sure about handing participants their entire data. I haven't had a good idea for this, though.
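
Whatever the fallback turns out to be, it helps to have one place where both network failures and HTTP error statuses surface. A minimal sketch, with `requireOk` as a hypothetical helper name:

```javascript
// fetch() resolves for any completed HTTP exchange, including 4xx/5xx responses,
// so convert unsuccessful statuses into thrown errors.
function requireOk(response) {
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  return response;
}

// Usage sketch (browser only):
// fetch('/save', options)
//   .then(requireOk)
//   .then(() => { window.location = '/next'; })
//   .catch((err) => { /* retry, or fall back to a local save */ });
```

With this in the chain, the single `.catch` handles a dropped connection and a 400 or 403 from the server alike.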

Ok, now on to an epic rebase to get all of these changes into the repo!

vsoch commented 6 years ago

okay! I promise to be able to test soon, at least in the next few days / this week!

FelixHenninger commented 6 years ago

Oh, I wouldn't want to add to your stress in the holiday season, especially after this has taken me so long. Please take your time, I'll keep investigating also. Thanks so much for all of your support thus far!

vsoch commented 6 years ago

ok will do :) thanks @FelixHenninger :snowflake: :snowman: :christmas_tree: :deer:

vsoch commented 6 years ago

oh wow this is a beautiful task!

image

If you look at the image above, I think I see the issue (two parts). The first is a lack of jquery - the cookie to validate the requests (csrf_token) is added via ajaxSetup, and this is done with jquery (which it doesn't look like is included in the page). I think there must be a vanilla javascript way to do this - do you know?

The second issue (the reason the data post doesn't work) is entirely that - the server isn't getting the token and it's giving you a "sorry, you don't have permissions!" response!

So if we can figure out a javascript way to add the csrf_token to any general POST (and I would fix this on the expfactory server) I think that will make everything good to go! I've started a little expfactory-labjs repository:

https://github.com/expfactory/expfactory-labjs

and am making a nice walk through (and when it's done will link it from the main docs) so others can easily make and then use experiments. I'm super excited about this!

FelixHenninger commented 6 years ago

Ah, I see, thanks so much for debugging this! I hadn't spotted the token -- I'll try to add it to the request headers, though I'd rather not include jQuery as a dependency unless it's necessary for other purposes; the csrfToken variable should be accessible globally regardless (I think?).

Thanks so much for preparing the fantastic installation walk-through! It's totally awesome. I'll add a section to our docs, and look through things in detail over the coming weeks.

Wow, I'm excited to see this happen, thanks again for making it possible!

vsoch commented 6 years ago

Ah you shouldn't need to! The addition is added by the experiment factory software - your experiment is basically rendered with some additional code. So if we are able to figure out just a javascript way of adding that header, it will be sent automatically when you POST (even without anything hard coded in your experiments). That's why I was using ajax before send / setup, so that any experiment could be plugged in with a general ajax post and not need to worry about the server authentication.

I'm pooped too - it's 8:30pm here and I should really stop working! I'll take a look at this with fresh eyes tomorrow and I think with some google searching it should be relatively easy to find a solution!

vsoch commented 6 years ago

here is what the template looks like so it makes sense :)

{% include experiment %}

<script type="text/javascript">
    var csrf_token = "{{ csrf_token() }}";

    $.ajaxSetup({
        beforeSend: function(xhr, settings) {
            if (!/^(GET|HEAD|OPTIONS|TRACE)$/i.test(settings.type) && !this.crossDomain) {
                xhr.setRequestHeader("X-CSRFToken", csrf_token);
            }
        }
    });
</script>

vsoch commented 6 years ago

Woo I just got it! I've been struggling with this all morning, haha. I'll share what I got working and then we can talk about the best way to implement. This is the fetch call that was successful:

fetch('/save', {
    method: 'POST',
    credentials: 'same-origin',
    redirect: 'follow',
    agent: null,
    headers: {
        "Content-Type": "text/plain",
        "X-CSRFToken": csrf_token
    },
    timeout: 5000
  }).then(function(response) {
    console.log(response);
  })

The key to that would be to get the csrf_token variable in there. I had just manually entered it, but the page will render it automatically if you set it like:


fetch('/save', {
    method: 'POST',
    credentials: 'same-origin',
    redirect: 'follow',
    agent: null,
    headers: {
        "Content-Type": "text/plain",
        "X-CSRFToken": "{{ csrf_token() }}",
    },
    timeout: 5000
  }).then(function(response) {
    console.log(response);
  }).then(function(data) {
    console.log(data);
});

Note that `{{ csrf_token() }}` is Flask shorthand for rendering something in a template (jinja2 syntax), and the () means we are calling a function, the one that generates the token. Since this is something that needs to be rendered, it would need to be part of the main index.html file.

So to proceed, here is what I'm thinking:

  1. Can we move the logic / wherever the csrf_token variable is defined into the index? the chunk can still stay in script.js for example, but we would want the variable defined in index.html
  2. When that is done, then applying the logic above to the transmit object should work!

Let me know if this is possible and if I can help! After that I'll make a demo container for you to try out with the updated stroop.

FelixHenninger commented 6 years ago

Wow, thanks so much for solving this! I'll see if I can adopt your modifications tomorrow -- since csrf_token is a global variable, I think it should be possible to inject it into the request headers when the data are sent. That way, leaving the code as it is at the bottom of index.html ought to work, though I'll have to check whether the value is accessible when the library is set up.
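
One way to sidestep the timing question is to read the global when the data are sent rather than when the library is set up. The following is a sketch only: `buildSaveHeaders` is a hypothetical name, and the content type is an assumption.

```javascript
// Build request headers, adding the CSRF token only if the page defined one.
// `scope` defaults to the global object; the `csrf_token` name follows the
// injected template shown earlier in this thread.
function buildSaveHeaders(scope = globalThis) {
  const headers = { 'Content-Type': 'application/json' }; // assumed content type
  if (typeof scope.csrf_token === 'string' && scope.csrf_token.length > 0) {
    headers['X-CSRFToken'] = scope.csrf_token;
  }
  return headers;
}
```

Because the lookup happens at call time, it should not matter that the template script sits at the bottom of index.html, as long as it has run before the first transmission.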

It's a shame that fetch support across browsers is still a bit patchy -- otherwise, would it have been an option to define a wrapper function around the fetch call at the bottom of index.html that we could call from the library, e.g. expfactory_save(data)? That way, data processing would fall entirely into the domain of the expfactory, and you would be free to adapt it later without the libraries having to change their integration.

vsoch commented 6 years ago

yes I would have preferred too a function akin to "ajaxSetup" that would allow me to (separately from your or any code that uses fetch) prepare the headers in advance / separately.
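
A rough fetch counterpart to `$.ajaxSetup` can be built by wrapping the fetch function itself. This is a sketch under assumptions, not something expfactory ships: `withCsrf` is a hypothetical name, and it only handles string URLs with plain-object headers.

```javascript
// Hypothetical equivalent of $.ajaxSetup for fetch: wrap a fetch function so that
// non-safe, same-origin requests carry the CSRF header.
function withCsrf(fetchImpl, token) {
  const safeMethod = /^(GET|HEAD|OPTIONS|TRACE)$/i;
  return function (url, options = {}) {
    const method = options.method || 'GET';
    // Treat only path-relative URLs ("/save") as same-origin, to avoid leaking the token.
    const sameOrigin =
      typeof url === 'string' && url.startsWith('/') && !url.startsWith('//');
    if (safeMethod.test(method) || !sameOrigin) {
      return fetchImpl(url, options);
    }
    const headers = Object.assign({}, options.headers, { 'X-CSRFToken': token });
    // Default to sending cookies, but let the caller's own options win.
    return fetchImpl(
      url,
      Object.assign({ credentials: 'same-origin' }, options, { headers: headers })
    );
  };
}

// In the injected template this could be applied as:
// window.fetch = withCsrf(window.fetch.bind(window), csrf_token);
```

Experiments that call `fetch('/save', ...)` would then pick up the header without any expfactory-specific code of their own, much like the jQuery setup does for `$.ajax`.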

I was also avoiding having some custom function expfactory_save() because I wanted it to be the case that any general web-based experiment could plug in, and minimally the endpoint (/save) doesn't exist and then the data is saved locally.

But now I'm thinking about it - and maybe it would be reasonable to give experiment creators an option to use a set of functions to save, and then not think about it. Actually I really like the idea! I could even possibly do something like check if the function is in the window first.
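
The "check if the function is in the window" idea might look like this on the experiment side. It is a sketch only: `saveResults` and `localFallback` are hypothetical names, while `expfactory_save` is the name floated earlier in this thread and does not exist yet.

```javascript
// Hypothetical: use a platform-provided save function when the page defines one,
// and otherwise fall back to the experiment's own local handling.
function saveResults(data, scope, localFallback) {
  if (scope && typeof scope.expfactory_save === 'function') {
    return scope.expfactory_save(data);
  }
  return localFallback(data);
}

// Usage sketch (browser only):
// saveResults(results, window, (data) => offerDownload(data));
```

This keeps the experiment free of hard dependencies: served from static hosting, the fallback runs; served from a container that injects the function, the data go to the server.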

I noticed that you are really good with object oriented javascript - could I ask for your help on this? Even a general skeleton to get me started for what the data structure would look like would be so helpful! And in that you can embed the list of "what I think are important functions / what I'd want to use."

FelixHenninger commented 6 years ago

Hej Vanessa,

a very happy new year! I just saw that you've been prolific over the holidays. I'm sorry I couldn't keep up -- I've been writing and preparing teaching, and have had to pace myself with development.

The X-CSRFToken header is now included in the data transmissions (or is it X-CSRF-Token? Both seem to work, but the latter spelling is used by nginx when it sends data), so the POST request to /save works, and the study moves on to the next task.

There's still an error because jQuery is missing -- would it be possible to not include the call to $.ajaxSetup when the study template is lab.js? Otherwise, might it be a solution if the injected code could check for the presence of the $ variable before calling the ajaxSetup method?
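
The presence check could be wrapped up along these lines. This is a sketch of the idea, not the actual expfactory template; `setupJqueryCsrf` is a hypothetical name, and the `beforeSend` body mirrors the template posted earlier.

```javascript
// Register the CSRF header with jQuery only when jQuery is actually loaded.
// Returns true if the hook was installed. `jq` would be `window.$` in the page.
function setupJqueryCsrf(jq, token) {
  if (!jq || typeof jq.ajaxSetup !== 'function') {
    return false; // no jQuery on this page (e.g. a lab.js study)
  }
  jq.ajaxSetup({
    beforeSend: function (xhr, settings) {
      if (!/^(GET|HEAD|OPTIONS|TRACE)$/i.test(settings.type) && !this.crossDomain) {
        xhr.setRequestHeader('X-CSRFToken', token);
      }
    },
  });
  return true;
}

// In the injected template: setupJqueryCsrf(window.$, csrf_token);
```

Reading `window.$` rather than the bare `$` avoids a ReferenceError when the variable was never declared.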

Finally, although the transmission is working now, the all-important data is not being saved ;-) . This is probably a mistake on my part -- I'm sending raw JSON data in the POST request to /save, with the information included in the data key (of the JSON object); jsPsych form-encodes the participant's data and sends it as the content of the data URL parameter. I'm guessing that this is not entirely intentional, but simply the default behavior of the jQuery.ajax helper function. I can change that, but I'd like to make sure that's something you want?
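
To make the difference concrete, here are the two request bodies side by side. This is an illustrative sketch with made-up sample data; which of the two the /save endpoint actually expects is exactly the open question.

```javascript
// Made-up sample results, for illustration only.
const results = [{ trial: 1, correct: true }];

// Variant 1: raw JSON body, sent with Content-Type: application/json.
const jsonBody = JSON.stringify({ data: results });

// Variant 2: form-encoded body, sent with
// Content-Type: application/x-www-form-urlencoded; the serialized results
// travel as the value of a single `data` field.
const formBody = new URLSearchParams({
  data: JSON.stringify(results),
}).toString();

console.log(jsonBody);
console.log(formBody);
```

A server that reads form fields will not see anything in the first variant, which would explain a request that succeeds while no data are stored.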

I'd be super-happy to help with whatever object-oriented plans you have, are you thinking about data formats or a helper library for expfactory?

Ok, that's all I can think of for now -- I hope this makes sense, and I'm looking forward to hearing from you!

Best,

-Felix

PS: Loved your JOSS paper! That's a great way to publish scientific software, and one I hadn't really considered before.

vsoch commented 6 years ago

hey @FelixHenninger ! Totally okay to take time during the holidays :) I think with your help I know exactly what to do to finish up this first go - and I definitely will want your feedback.

For the object oriented functions (to answer your question) I was thinking that likely the first relative need is a helper library to give to third parties to easily plug into expfactory. I'm actually still undecided about this, because I want to make it so that an external experiment software doesn't have dependencies other than submission to a particular endpoint. If you were able to add grabbing the token (and I can test this and see if it works ok!) it is definitely easy and do-able to add the check for jquery (and I will test this!) and we might not even need the extra library. But maybe it could be an option (but not required). TBA!

Anyway, I don't have an update at this moment, but wanted to let you know that I'm going to be a bit longer this time in getting back to you, I'm flying out to get a surgery in under a week and likely will be in some state of sleeping until late in the month, lol. It's also good so that I can pause on adding to expfactory until the initial JOSS review finishes up (it's really a fantastic avenue for publishing scientific software - an invaluable resource with very good review process too!)

So! TLDR - expect to hear from me end of January, and we will add the LabJS integration, and then decide on the next cool and fun thing to do next :) (still okay to say) Happy 2018!

FelixHenninger commented 6 years ago

> I want to make it so that an external experiment software doesn't have dependencies other than submission to a particular endpoint. If you were able to add grabbing the token (and I can test this and see if it works ok!) it is definitely easy and do-able to add the check for jquery (and I will test this!) and we might not even need the extra library.

Yeah, that makes total sense -- I forgot to write it up, but I played with the idea of maybe setting the token as a cookie (I think Django does it that way); that way, the token would not need to be injected into the page source (the obvious downside being the need for cookies). But let's discuss that another day!
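
For the cookie variant, the client side would only need to read the value back out. A sketch, assuming the server were to set such a cookie (expfactory does not do this at the time of writing); `readCookie` is a hypothetical helper, and `csrftoken` is Django's default cookie name, used here purely as an example.

```javascript
// Read one cookie value from a cookie string such as document.cookie.
// Returns null when the cookie is absent.
function readCookie(name, cookieString) {
  const parts = cookieString ? cookieString.split(';') : [];
  for (const part of parts) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf('=');
    if (eq > 0 && trimmed.slice(0, eq) === name) {
      return decodeURIComponent(trimmed.slice(eq + 1));
    }
  }
  return null;
}

// Usage sketch (browser only):
// const token = readCookie('csrftoken', document.cookie);
```

The token could then go into the X-CSRFToken header on each POST, with no templating of the page source required.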

> I'm going to be a bit longer this time in getting back to you, I'm flying out to get a surgery in under a week and likely will be in some state of sleeping until late in the month, lol.

Wow, that sounds like a big thing! I sincerely hope all is well, and am very much looking forward to having you around again. For the meantime, my heartfelt best wishes go out to you -- if it's any use, please know that there's someone here rooting for you from afar.