expfactory / expfactory-python

python module for managing experiment factory javascript experiment files, batteries to deploy them to (eg, psiturk), and virtual machines to host the compilation of those things.
http://expfactory.readthedocs.org/
MIT License
4 stars 11 forks source link

Battery folder cloned regardless of --battery argument #132

Closed earcanal closed 6 years ago

earcanal commented 7 years ago

[Edited]

According to the local battery instructions, it should be possible to avoid cloning the battery from github by manually cloning and giving an argument to --battery. The example below seems to demonstrate that vm.py clones the battery regardless of whether a local clone has been specified with --battery. This slows down local experiments and will fail if there's no network.

$ expfactory --run --experiments zoning_out_task --battery expfactory-battery --folder expfactory-experiments
Generating custom battery selecting from experiments for maximum of 99999 minutes, please wait...
^CTraceback (most recent call last):
  File "/usr/local/bin/expfactory", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/expfactory/scripts.py", line 117, in main
    time=args.time)
  File "/usr/local/lib/python2.7/dist-packages/expfactory/views.py", line 119, in run_battery
    time=time)
  File "/usr/local/lib/python2.7/dist-packages/expfactory/battery.py", line 89, in generate_local
    add_surveys=False)
  File "/usr/local/lib/python2.7/dist-packages/expfactory/battery.py", line 29, in generate_base
    tmpdir = custom_battery_download()
  File "/usr/local/lib/python2.7/dist-packages/expfactory/vm.py", line 37, in custom_battery_download
    download_repo(repo,"%s/%s/" %(tmpdir,repo))
  File "/usr/local/lib/python2.7/dist-packages/expfactory/vm.py", line 23, in download_repo
    return Repo.clone_from("https://github.com/expfactory/expfactory-%s" %(repo_type), destination)
  File "/home/paul/.local/lib/python2.7/site-packages/git/repo/base.py", line 942, in clone_from
    return cls._clone(git, url, to_path, GitCmdObjectDB, progress, **kwargs)
  File "/home/paul/.local/lib/python2.7/site-packages/git/repo/base.py", line 895, in _clone
    (stdout, stderr) = proc.communicate()
  File "/usr/lib/python2.7/subprocess.py", line 800, in communicate
    return self._communicate(input)
  File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate
    stdout, stderr = self._communicate_with_poll(input)
  File "/usr/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
    ready = poller.poll()
KeyboardInterrupt
vsoch commented 7 years ago

hey @earcanal ! I'm definitely glad to help with this. Are you providing an argument for both --experiments and --battery? If either one isn't specified, that would cause the download to happen. If not, could you provide the command you are doing so that I can reproduce the error and fix it up?

earcanal commented 7 years ago

Yes. Markdown error was cropping command line in issue (now fixed).

vsoch commented 7 years ago

ah, ok, so the bug is that the software is wanting to also download surveys and games, which were added later than just experiments:

    if experiment_repo == None or battery_repo == None or survey_repo == None or game_repo == None:
        tmpdir = custom_battery_download()
        if experiment_repo == None:
            experiment_repo = "%s/experiments" %(tmpdir)     
        if battery_repo == None:
            battery_repo = "%s/battery" %(tmpdir)     
        if survey_repo == None:
            survey_repo = "%s/surveys" %(tmpdir)     
        if game_repo == None:
            game_repo = "%s/games" %(tmpdir)  

What is your time frame for this? I would like to refactor things a bit to be more modular, but that would take some time. I can provide a quick fix if you need asap.

earcanal commented 7 years ago

Yeah, that's the conclusion I came to, but didn't want to patch as I've only had a quick poke around the code. I'm likely to need to run an experiment in a location with unreliable network connectivity in 7 days.

vsoch commented 7 years ago

hey @earcanal - let's get you something working asap! Here is a patched version to try - I made it so the default doesn't use surveys or games.

https://github.com/vsoch/expfactory-python/tree/patch/surveys-games

git checkout -b patch/surveys-games https://github.com/vsoch/expfactory-python
cd expfactory-python
python setup.py install
expfactory --run --experiment stroop

Note that this is a branch on my fork, not the main one here. I put it together pretty quickly, and I've added the start of a better logger for errors. I'd like to be making the entire software better soon - let me know if this works for you, and I can tweak as necessary!

earcanal commented 7 years ago

Hi @vsoch. Thanks for rapidly :) looking at this. Here's what I did to install

git clone https://github.com/vsoch/expfactory-python.git
git checkout -b patch/surveys-games
git branch --set-upstream-to=origin/patch/surveys-games patch/surveys-games
Branch patch/surveys-games set up to track remote branch patch/surveys-games from origin.
git pull
Updating c4fe469..00f4ced
...
python setup.py install

The following command line appears to use local repos (experiment starts instantly), although some DEBUG confirming that would be useful. This will really speed up my develop/test cycle as I don't have to wait for the clone every time I test an experiment. Thanks!

$ expfactory --run --experiments zoning_out_task --battery expfactory-battery --folder expfactory-experiments
DEBUG Generating custom battery selecting from experiments for maximum of 99999 minutes, please wait...
Found 59 valid experiments
Preview experiment at localhost:9218
Created new window in existing browser session.

Were you suggesting above that --experiment shouldn't pull a remote repo? It does

$ expfactory --run --experiment stroop
No battery folder specified. Will pull latest battery from expfactory-battery repo
No experiments, games, or surveys folder specified. Will pull latest from expfactory-experiments repo
DEBUG Generating custom battery selecting from experiments for maximum of 99999 minutes, please wait...
^C^CTraceback (most recent call last):
  File "/usr/local/bin/expfactory", line 9, in <module>
    load_entry_point('expfactory==2.5.47', 'console_scripts', 'expfactory')()
  File "/usr/local/lib/python2.7/dist-packages/expfactory-2.5.47-py2.7.egg/expfactory/scripts.py", line 168, in main
    time=args.time)
  File "/usr/local/lib/python2.7/dist-packages/expfactory-2.5.47-py2.7.egg/expfactory/views.py", line 141, in run_battery
    time=time)
  File "/usr/local/lib/python2.7/dist-packages/expfactory-2.5.47-py2.7.egg/expfactory/battery.py", line 138, in generate_local
    add_surveys=False)
  File "/usr/local/lib/python2.7/dist-packages/expfactory-2.5.47-py2.7.egg/expfactory/battery.py", line 77, in generate_base
    tmpdir = custom_battery_download(repos=download_repos)
  File "/usr/local/lib/python2.7/dist-packages/expfactory-2.5.47-py2.7.egg/expfactory/vm.py", line 70, in custom_battery_download
    download_repo(repo,"%s/%s/" %(tmpdir,repo))
  File "/usr/local/lib/python2.7/dist-packages/expfactory-2.5.47-py2.7.egg/expfactory/vm.py", line 56, in download_repo
    return Repo.clone_from("https://github.com/expfactory/expfactory-%s" %(repo_type), destination)
  File "/usr/local/lib/python2.7/dist-packages/GitPython-2.1.1-py2.7.egg/git/repo/base.py", line 925, in clone_from
    return cls._clone(git, url, to_path, GitCmdObjectDB, progress, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/GitPython-2.1.1-py2.7.egg/git/repo/base.py", line 878, in _clone
    (stdout, stderr) = proc.communicate()
  File "/usr/lib/python2.7/subprocess.py", line 800, in communicate
    return self._communicate(input)
  File "/usr/lib/python2.7/subprocess.py", line 1417, in _communicate
    stdout, stderr = self._communicate_with_poll(input)
  File "/usr/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
    ready = poller.poll()
KeyboardInterrupt
vsoch commented 7 years ago

you still need to provide the battery folder with the experiment to not download.

earcanal commented 7 years ago

I think there may be a regression with that patch. I got this today when I came to run a survey

$ expfactory --run --battery /home/paul/src/expfactory-battery --folder /home/paul/src/expfactory-surveys --survey five_facet_mindfulness_survey
WARNING Currently only support running one survey, will run first in list.
DEBUG Deploying survey five_facet_mindfulness_survey
ERROR Invalid survey five_facet_mindfulness_survey not found in surveys repo!
Please specify list of comma separated experiments with --experiments

You'll want to test further with the following patch which got my surveys working again:

$ diff battery.py battery.py.orig 
57c57
<                   add_experiments=True,add_surveys=True,add_games=False,battery_repo=None,warning=True):
---
>                   add_experiments=True,add_surveys=False,add_games=False,battery_repo=None,warning=True):
vsoch commented 7 years ago

yes, I didn't know you were using surveys, so I disabled adding them (which was being done by default) so the repo wouldn't be downloaded. The fix would be to have this also determined by a variable from the command line - or your fix works too! Thanks for the update!

earcanal commented 7 years ago

I'm not sure if this is another regression, but expfactory seems to intermittently run experiments in a different order from that specified, e.g.

expfactory --run --battery expfactory-battery --folder expfactory-experiments --experiments test_task,attention_network_task

wrong_order

I don't know enough about how expfactory is built to revert it so I can test it this issue is also in the previous version. This is what my current script looks like.

#!/usr/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'expfactory==2.5.47','console_scripts','expfactory'
__requires__ = 'expfactory==2.5.47'
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.exit(
        load_entry_point('expfactory==2.5.47', 'console_scripts', 'expfactory')()
    )

FWIW, here's the script I'm using for a study which takes into consideration various issues (e.g. expfactory/expfactory-surveys#63) I'm currently working around.

#!/bin/bash

# FIXME: --folder _requires_ a full pathname for surveys (not for experiments) - https://github.com/expfactory/expfactory-surveys/issues/63
SURVEY='expfactory --run --battery ./expfactory-battery --folder /path/to/my/expfactory-surveys --survey'
EXPERIMENT='expfactory --run --battery ./expfactory-battery --folder expfactory-experiments --experiments'

# expfactory only supports one survey at a time :(
# I've also seen expfactory _intermittently_ run a battery in a different order to that specified with --experiments, so I feel more confident running everything one at a time
# This isn't ideal, as it requires a manual ^c to move onto the next item
$SURVEY simple_demographics_survey_2 && $SURVEY five_facet_mindfulness_survey && $EXPERIMENT attention_network_task_2 && $EXPERIMENT zoning_out_task && $EXPERIMENT attention_network_task_3 && $SURVEY state_mindfulness_survey
vsoch commented 7 years ago

hey @earcanal ! This is expected behavior - the experiments are presented randomly so you don't get order effects in your results.

earcanal commented 7 years ago

Thanks. What do think of expfactory/expfactory-python#134? If I get a moment I'll implement and submit a pull request (unless you get to it first).

vsoch commented 7 years ago

hey @earcanal - that sounds good! I'm slowly working on a better version of expfactory, one that uses more of a "package manager" model, but for experiments (see the development branch). It might be a few months, but I promise a lot of these annoyances will be fixed up. I'm no longer a grad student so I work on this in random off time, but it's important to me, so I'll definitely be on it! :O) I really appreciate the feedback!

earcanal commented 6 years ago

Obsoleted by new version of The Experiment Factory.