devinat1 commented 4 months ago

I have installed the packages in requirements.txt, but additional packages are needed, including django, djaa_list_filter, and others. Please update requirements.txt so that it lists everything needed to run the project.

danyaljj commented 4 months ago

I don't think we need django really. Turkle (which we clone internally) needs it but Turkle should bring in its own dependencies and requirements.txt. If that's not the case, feel free to elaborate (or even better, send a PR).

devinat1 commented 4 months ago

As a user, I am unable to run ./1_run_website.sh as specified in the README for setup. A minimal setup for the project to get things running would be appreciated. I wish to understand how this project works by running it.

danyaljj commented 4 months ago

Thanks! We'll look into this. CC: @klxu03

klxu03 commented 4 months ago

@devinat1 did you initially pip install, and then cd into src before running 1_run_website.sh? It's important that you run the script from inside src (cd src, then ./1_run_website.sh), rather than running it as src/1_run_website.sh from the repository root.

I just ran a GitHub action that tries to set up the project from scratch: https://github.com/JHU-CLSP/turking-bench/actions/runs/8511605319/job/23311554869 and it seems to work

Here is the script that triggers the GitHub action: https://github.com/JHU-CLSP/turking-bench/blob/main/.github/workflows/tap_test.yml

It effectively runs pip install -r requirements.txt, and then runs this set of commands (the same ones as in the README):

  # Runs a set of commands using the runner's shell

  - name: run Django/Turkle server
    run: |
      echo 'Moving to src directory'
      cd src
      echo 'Clone Turkle'
      ./1_run_website.sh & sleep 30
      echo 'Generate the input files'
      python 2_generate_input_csv.py
      echo 'Upload the tasks'
      python 3_upload_tasks.py
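
The fixed `sleep 30` in the step above is a guess at how long the server needs to come up. A minimal sketch of an alternative is to poll until the port accepts connections. `wait_for_port` is a hypothetical helper, not part of the repository, and port 8000 (Django's default for runserver) is an assumption, since the port the script actually uses is not shown in this thread:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once host:port accepts a TCP connection, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port("127.0.0.1", 8000)` before the upload step would return as soon as the server is reachable, and a `False` result would point at the server startup rather than at the upload script.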

I was wondering if you'd be willing to provide an error log. Are you trying to run this on a server that doesn't have access to peripherals like a monitor and mouse?

devinat1 commented 4 months ago

Yes, I am running within src. My commands are as follows, starting from the root directory:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd src
./1_run_website.sh

Then I get the following:

(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$ ./1_run_website.sh 
Using poetry and python3
Directory Turkle exists, running pre-existing server
Exception in thread django-main-thread:
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bond/.local/lib/python3.10/site-packages/django/utils/autoreload.py", line 64, in wrapper
    fn(*args, **kwargs)
  File "/home/bond/.local/lib/python3.10/site-packages/django/core/management/commands/runserver.py", line 125, in inner_run
    autoreload.raise_last_exception()
  File "/home/bond/.local/lib/python3.10/site-packages/django/utils/autoreload.py", line 87, in raise_last_exception
    raise _exception[1]
  File "/home/bond/.local/lib/python3.10/site-packages/django/core/management/__init__.py", line 398, in execute
    autoreload.check_errors(django.setup)()
  File "/home/bond/.local/lib/python3.10/site-packages/django/utils/autoreload.py", line 64, in wrapper
    fn(*args, **kwargs)
  File "/home/bond/.local/lib/python3.10/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/bond/.local/lib/python3.10/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/home/bond/.local/lib/python3.10/site-packages/django/apps/config.py", line 228, in create
    import_module(entry)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'guardian'

My theory is that it is because I am doing a pip install within a venv.
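
One way to test that theory is to ask the interpreter directly. In the traceback above, Django is loaded from /home/bond/.local/lib/python3.10/site-packages, which is the per-user site directory rather than the venv. The following is a small sketch that reports which interpreter is running and which of the modules named in this thread it cannot find (`missing_modules` is a hypothetical helper, not part of the repository):

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# sys.prefix differs from sys.base_prefix when a venv is active
print("interpreter:", sys.executable)
print("inside venv:", sys.prefix != sys.base_prefix)
print("missing:", missing_modules(["django", "guardian", "djaa_list_filter"]))
```

For the result to be meaningful, it would need to be run with the same python3 that the script ends up invoking.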

devinat1 commented 4 months ago

I tried installing without a venv now, and I am additionally getting the following, which may or may not be related:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyterlab-server 2.25.2 requires requests>=2.31, but you have requests 2.28.2 which is incompatible.
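
The conflict pip reports is a plain version comparison: the installed requests 2.28.2 is below the 2.31 minimum that jupyterlab-server 2.25.2 declares. A rough sketch of that check follows. It uses a simplified numeric comparison rather than pip's full version rules, and `version_tuple` and `meets_minimum` are hypothetical helpers:

```python
def version_tuple(version: str) -> tuple:
    """Convert a dotted release string such as '2.28.2' to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """True if installed >= minimum, comparing numeric components only."""
    a, b = version_tuple(installed), version_tuple(minimum)
    # Pad the shorter tuple with zeros so '2.31' compares like '2.31.0'
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(meets_minimum("2.28.2", "2.31"))  # False: 2.28.2 is older than 2.31
```

The installed version can be read at runtime with `importlib.metadata.version("requests")` from the standard library.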

devinat1 commented 4 months ago

I can confirm now that I am getting the same error with or without a venv.

klxu03 commented 4 months ago

@devinat1 can you try rm -rf Turkle and then re-running the script? The message "Directory Turkle exists, running pre-existing server" shouldn't appear on an initial install.

devinat1 commented 4 months ago

When I delete Turkle and rerun, now I get:

(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$ ./1_run_website.sh & sleep 30
[2] 12630
Using poetry and python3
Directory Turkle exists, running pre-existing server
Exception in thread django-main-thread:
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 64, in wrapper
    fn(*args, **kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/commands/runserver.py", line 125, in inner_run
    autoreload.raise_last_exception()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 87, in raise_last_exception
    raise _exception[1]
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 394, in execute
    autoreload.check_errors(django.setup)()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 64, in wrapper
    fn(*args, **kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/config.py", line 193, in create
    import_module(entry)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$ ls
1_ia1_run_website.sh       4_run_evaluation.py  __init__.py         evaluate_model.py                   offline_baselines.py  utils
1_rockfish_run_website.sh  5_dump_features.py   amt                 evaluation                          run_single.py
1_run_website.sh           TAP_tests.py         cleanup.py          evaluation_class.py                 screenshots
2_generate_input_csv.py    TAP_tests_random.py  default_definition  img                                 test_ollama.py
3_upload_tasks.py          Turkle               dump_partition.py   lunch_test_tasks_on_amt_sandbox.py  tests.py
(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$ rm -rf Turkle/
(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$  ./1_run_website.sh & sleep 30
[3] 13223
Using poetry and python3
Cloning into 'Turkle'...
remote: Enumerating objects: 5447, done.
remote: Counting objects: 100% (869/869), done.
remote: Compressing objects: 100% (383/383), done.
remote: Total 5447 (delta 554), reused 756 (delta 471), pack-reused 4578
Receiving objects: 100% (5447/5447), 8.40 MiB | 5.54 MiB/s, done.
Resolving deltas: 100% (3511/3511), done.
/home/bond/cs590/web_agent_datasets/turking-bench/venv/bin/python3: can't open file 'manage.py': [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/home/bond/cs590/web_agent_datasets/turking-bench/src/Turkle/manage.py", line 34, in <module>
    execute_from_command_line(sys.argv)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 416, in execute
    django.setup()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/config.py", line 193, in create
    import_module(entry)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'djaa_list_filter'
Traceback (most recent call last):
  File "/home/bond/cs590/web_agent_datasets/turking-bench/src/Turkle/manage.py", line 34, in <module>
    execute_from_command_line(sys.argv)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 416, in execute
    django.setup()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/config.py", line 193, in create
    import_module(entry)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'djaa_list_filter'
/home/bond/cs590/web_agent_datasets/turking-bench/venv/bin/python3: can't open file 'manage.py': [Errno 2] No such file or directory
Exception in thread django-main-thread:
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 64, in wrapper
    fn(*args, **kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/commands/runserver.py", line 125, in inner_run
    autoreload.raise_last_exception()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 87, in raise_last_exception
    raise _exception[1]
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 394, in execute
    autoreload.check_errors(django.setup)()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 64, in wrapper
    fn(*args, **kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/config.py", line 193, in create
    import_module(entry)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'djaa_list_filter'

[1]   Done                    ./1_run_website.sh
[2]-  Done                    ./1_run_website.sh
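
The runs so far stop on a ModuleNotFoundError, first for guardian and then for djaa_list_filter. The sketch below pulls the missing module names out of such a log so they can be reviewed in one pass. `missing_from_log` is a hypothetical helper, and the mapping to PyPI distribution names is an assumption about where these modules come from, so it should be checked against Turkle's own requirements.txt:

```python
import re

# Assumed module -> PyPI distribution names; verify against Turkle's requirements.txt
LIKELY_DISTRIBUTIONS = {
    "guardian": "django-guardian",
    "djaa_list_filter": "django-admin-autocomplete-list-filter",
}

MISSING_RE = re.compile(r"ModuleNotFoundError: No module named '([^']+)'")


def missing_from_log(log_text: str) -> list:
    """Return the distinct missing module names in order of first appearance."""
    seen = []
    for name in MISSING_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


log = """ModuleNotFoundError: No module named 'guardian'
ModuleNotFoundError: No module named 'djaa_list_filter'
ModuleNotFoundError: No module named 'djaa_list_filter'
"""
for module in missing_from_log(log):
    print(module, "->", LIKELY_DISTRIBUTIONS.get(module, "unknown"))
```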

devinat1 commented 4 months ago

Here is the full log:

(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$  ./1_run_website.sh & sleep 30
[3] 13223
Using poetry and python3
Cloning into 'Turkle'...
remote: Enumerating objects: 5447, done.
remote: Counting objects: 100% (869/869), done.
remote: Compressing objects: 100% (383/383), done.
remote: Total 5447 (delta 554), reused 756 (delta 471), pack-reused 4578
Receiving objects: 100% (5447/5447), 8.40 MiB | 5.54 MiB/s, done.
Resolving deltas: 100% (3511/3511), done.
/home/bond/cs590/web_agent_datasets/turking-bench/venv/bin/python3: can't open file 'manage.py': [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/home/bond/cs590/web_agent_datasets/turking-bench/src/Turkle/manage.py", line 34, in <module>
    execute_from_command_line(sys.argv)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 416, in execute
    django.setup()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/config.py", line 193, in create
    import_module(entry)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'djaa_list_filter'
Traceback (most recent call last):
  File "/home/bond/cs590/web_agent_datasets/turking-bench/src/Turkle/manage.py", line 34, in <module>
    execute_from_command_line(sys.argv)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/__init__.py", line 416, in execute
    django.setup()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/__init__.py", line 24, in setup
    apps.populate(settings.INSTALLED_APPS)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/registry.py", line 91, in populate
    app_config = AppConfig.create(entry)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/apps/config.py", line 193, in create
    import_module(entry)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'djaa_list_filter'
/home/bond/cs590/web_agent_datasets/turking-bench/venv/bin/python3: can't open file 'manage.py': [Errno 2] No such file or directory
Exception in thread django-main-thread:
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 64, in wrapper
    fn(*args, **kwargs)
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/core/management/commands/runserver.py", line 125, in inner_run
    autoreload.raise_last_exception()
  File "/home/bond/cs590/web_agent_datasets/turking-bench/venv/lib/python3.10/site-packages/django/utils/autoreload.py", line 87, in raise_last_exception
    raise _exception[1]
(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$ ls
1_ia1_run_website.sh       4_run_evaluation.py  __init__.py         evaluate_model.py                   offline_baselines.py  utils
1_rockfish_run_website.sh  5_dump_features.py   amt                 evaluation                          run_single.py
1_run_website.sh           TAP_tests.py         cleanup.py          evaluation_class.py                 screenshots
2_generate_input_csv.py    TAP_tests_random.py  default_definition  img                                 test_ollama.py
3_upload_tasks.py          Turkle               dump_partition.py   lunch_test_tasks_on_amt_sandbox.py  tests.py
(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$ python3 2_generate_input_csv.py 
 ** Reading: ../tasks/Clarifiction/batch.csv
 ** Reading: ../tasks/Neurologic Recipe Eval/batch.csv
 ** Reading: ../tasks/Scalar Adjective Ordering/batch.csv
 ** Reading: ../tasks/Evaluate Questions Defeasibility/batch.csv
 ** Reading: ../tasks/Elicitation subj/batch.csv
 ** Reading: ../tasks/Question Gen/batch.csv
 ** Reading: ../tasks/Rewrite sentences (+ context)/batch.csv
 ** Reading: ../tasks/ATOMIC - NL Rephrase/batch.csv
 ** Reading: ../tasks/Ethics_sbic Dialogue/batch.csv
 ** Reading: ../tasks/Wiki103_Quality/batch.csv
 ** Reading: ../tasks/Essentian Terms in Questions/batch.csv
 ** Reading: ../tasks/Ethical rule-of-thumb quality/batch.csv
 ** Reading: ../tasks/ToTTo Evals/batch.csv
 ** Reading: ../tasks/ATOMIC - Event Sequences/batch.csv
 ** Reading: ../tasks/Annotate WaNLI/batch.csv
 ** Reading: ../tasks/ATOMIC Validate GL/batch.csv
 ** Reading: ../tasks/TuringAdvice/batch.csv
 ** Reading: ../tasks/Passive voice Parents 1st-2nd Person Persuasiveness Comparison/batch.csv
 ** Reading: ../tasks/Missing Adjective/batch.csv
 ** Reading: ../tasks/Triple Quality Eval/batch.csv
 ** Reading: ../tasks/Dialog Human Evaluation/batch.csv
 ** Reading: ../tasks/BiSECT Human Evaluation/batch.csv
 ** Reading: ../tasks/Story Eval [Quality]/batch.csv
 ** Reading: ../tasks/Spanish Word Alignment/batch.csv
 ** Reading: ../tasks/Visual Comet Multiple Choice Test Verify/batch.csv
 ** Reading: ../tasks/ATOMIC Neg Discriminator Eval/batch.csv
 ** Reading: ../tasks/Elicitation obj/batch.csv
 ** Reading: ../tasks/TrecQA/batch.csv
 ** Reading: ../tasks/DI Rationale Gen. evaluation/batch.csv
 ** Reading: ../tasks/PSTS-Lexsub/batch.csv
 ** Reading: ../tasks/Goal Distractor - ATOMIC base events 1/batch.csv
 ** Reading: ../tasks/IMDB Sentiment Completions/batch.csv
 ** Reading: ../tasks/Pseudoword Dataset Creation PPDB/batch.csv
 ** Reading: ../tasks/Rule-of-Thumb/batch.csv
 ** Reading: ../tasks/Opinion Mining of Spanish Customer Comments/batch.csv
 ** Reading: ../tasks/Event effect/batch.csv
 ** Reading: ../tasks/CommonsenseQA_Eval/batch.csv
 ** Reading: ../tasks/Generics Embodiment/batch.csv
 ** Reading: ../tasks/Style adaptation, pairwise, complex-simple/batch.csv
 ** Reading: ../tasks/Paraphrase Evaluation/batch.csv
 ** Reading: ../tasks/VQA Rationale Generation/batch.csv
 ** Reading: ../tasks/Automatic Detection of Generated Text/batch.csv
 ** Reading: ../tasks/Social Chem Eval/batch.csv
 ** Reading: ../tasks/Simplification-Meaning-Grammar-Simplicity/batch.csv
 ** Reading: ../tasks/Commonsense Morality - Verify Action Comprehensible/batch.csv
 ** Reading: ../tasks/ATOMIC - NL Rephrase Eval/batch.csv
 ** Reading: ../tasks/Rationale Generation/batch.csv
 ** Reading: ../tasks/Mauve human eval/batch.csv
 ** Reading: ../tasks/Typicality Judgements for Objects -disagreement run-/batch.csv
 ** Reading: ../tasks/Step 5 human performance/batch.csv
 ** Reading: ../tasks/Commonsense Morality - Class Label Collect/batch.csv
 ** Reading: ../tasks/Reddit In-group Analysis/batch.csv
 ** Reading: ../tasks/RocStories 1/batch.csv
 ** Reading: ../tasks/Dialogue rot annotation/batch.csv
 ** Reading: ../tasks/MinimalChangeInconsistentMiddles/batch.csv
 ** Reading: ../tasks/Style Adaptaion - Subjective-Objective/batch.csv
 ** Reading: ../tasks/NLI for GenGen/batch.csv
 ** Reading: ../tasks/ATOMIC - Object Rationale/batch.csv
 ** Reading: ../tasks/Scalar Adjectives Identification/batch.csv
 ** Reading: ../tasks/Mars human eval (a-b testing)/batch.csv
 ** Reading: ../tasks/Formalize sentence/batch.csv
 ** Reading: ../tasks/Recreation of the Dan Johnson/batch.csv
 ** Reading: ../tasks/Relative CommonsenseQA Explanation Pairwise Judgements Collection 3/batch.csv
 ** Reading: ../tasks/Chatbot Response Quality Evaluation/batch.csv
 ** Reading: ../tasks/O+S_Eval/batch.csv
 ** Reading: ../tasks/Evaluate Questions/batch.csv
 ** Reading: ../tasks/Generics Eval - Soft - all dataset eval/batch.csv
 ** Reading: ../tasks/Winogrande (object)/batch.csv
 ** Reading: ../tasks/neural-pop (PLAN evaluation) t5-human-test b/batch.csv
 ** Reading: ../tasks/Multimodal Prompt Caption Eval/batch.csv
 ** Reading: ../tasks/VisualCOMET Selection test/batch.csv
 ** Reading: ../tasks/Congressional Bills/batch.csv
 ** Reading: ../tasks/Full sentence style annotations/batch.csv
 ** Reading: ../tasks/Aductive Pairwise Eval/batch.csv
 ** Reading: ../tasks/PerSenT/batch.csv
 ** Reading: ../tasks/Compile list of area chairs/batch.csv
 ** Reading: ../tasks/JJ-NN HIT/batch.csv
 ** Reading: ../tasks/Generics Categories (Definitional)/batch.csv
 ** Reading: ../tasks/Lattice/batch.csv
 ** Reading: ../tasks/ANES 2008 open-ended survey/batch.csv
 ** Reading: ../tasks/Advice Gen/batch.csv
 ** Reading: ../tasks/Step 1 Generating Multi-Sentence Questions (science)/batch.csv
 ** Reading: ../tasks/COMET2020 ATOMIC Prag Neg Pilot/batch.csv
 ** Reading: ../tasks/Evaluate_e2e/batch.csv
 ** Reading: ../tasks/PSTS-Pivot/batch.csv
 ** Reading: ../tasks/Sentence Compression/batch.csv
 ** Reading: ../tasks/Commonsense Misinformation Tracking Pilot [cancer data setup]/batch.csv
 ** Reading: ../tasks/Polite versus rude/batch.csv
 ** Reading: ../tasks/Sentiment_Negative/batch.csv
 ** Reading: ../tasks/WikiHow Goal Membership/batch.csv
 ** Reading: ../tasks/Image captioning/batch.csv
 ** Reading: ../tasks/Human evaluation - gentle vs canary/batch.csv
 ** Reading: ../tasks/Science Questions test without any answers dropped/batch.csv
 ** Reading: ../tasks/Commonsense Morality - Text Label Validate-Collect/batch.csv
 ** Reading: ../tasks/Creating Sentence Paraphrases/batch.csv
 ** Reading: ../tasks/HTER/batch.csv
 ** Reading: ../tasks/Question Typing/batch.csv
 ** Reading: ../tasks/Spheres of Alturism/batch.csv
 ** Reading: ../tasks/WikiHow Step Membership/batch.csv
 ** Reading: ../tasks/BiSECT Multilingual Evaluation/batch.csv
 ** Reading: ../tasks/ESNLI Rationale Generation/batch.csv
 ** Reading: ../tasks/Step 4 Selecting good - bad answers to questions (short)/batch.csv
 ** Reading: ../tasks/HTER - longer sentences/batch.csv
 ** Reading: ../tasks/ATOMIC - Required Objects/batch.csv
 ** Reading: ../tasks/Newyorker_joke_explain_pairwise/batch.csv
 ** Reading: ../tasks/COMET2020 ATOMIC Inference Vp/batch.csv
 ** Reading: ../tasks/Essential Terms in Questions 2/batch.csv
 ** Reading: ../tasks/Evaluate the Quality of Explanations (Relative CommonsenseQA)/batch.csv
 ** Reading: ../tasks/Style adaptation, pairwise, subjective-objective/batch.csv
 ** Reading: ../tasks/Sherlock Four Sent Choice mixed/batch.csv
 ** Reading: ../tasks/Evaluate the Quality of Explanations (Relative NLI)/batch.csv
 ** Reading: ../tasks/Moral Judgement/batch.csv
 ** Reading: ../tasks/MCN - Multiple Choice testset/batch.csv
 ** Reading: ../tasks/Summarization/batch.csv
 ** Reading: ../tasks/Contextual Similarity -- Paraphrase Round/batch.csv
 ** Reading: ../tasks/Associate countries and languages with Ethnologue/batch.csv
 ** Reading: ../tasks/Generics Eval - Soft/batch.csv
 ** Reading: ../tasks/Commongen Evals (RLUE)/batch.csv
 ** Reading: ../tasks/Paraphrase Clustering with Merge/batch.csv
 ** Reading: ../tasks/Annotation subj_obj/batch.csv
 ** Reading: ../tasks/Annotation Rating/batch.csv
 ** Reading: ../tasks/PSTS-psts.c2v/batch.csv
 ** Reading: ../tasks/Text Game Eval/batch.csv
 ** Reading: ../tasks/Story Relations/batch.csv
 ** Reading: ../tasks/Reading Comprehension/batch.csv
 ** Reading: ../tasks/QuestionKeyPhrases/batch.csv
 ** Reading: ../tasks/DnD Identify Guidance/batch.csv
 ** Reading: ../tasks/Coherence Evaluation/batch.csv
 ** Reading: ../tasks/ATOMIC KB Completion/batch.csv
 ** Reading: ../tasks/Gun violence structured extraction/batch.csv
 ** Reading: ../tasks/ANLI Generation/batch.csv
 ** Reading: ../tasks/What breaks the flow - no categories/batch.csv
 ** Reading: ../tasks/InconsistentMiddles/batch.csv
 ** Reading: ../tasks/Generics Eval - Comparatives/batch.csv
 ** Reading: ../tasks/Evaluate the Quality of Explanations (Singular Acceptability NLI)/batch.csv
 ** Reading: ../tasks/Step 1 Generating Multi-Sentence Questions (CNN)/batch.csv
 ** Reading: ../tasks/Sentence Summarization/batch.csv
 ** Reading: ../tasks/NER - Scruples/batch.csv
 ** Reading: ../tasks/Winogrande plausiblity/batch.csv
 ** Reading: ../tasks/Photo Collection GVDB/batch.csv
 ** Reading: ../tasks/Script KD eval/batch.csv
 ** Reading: ../tasks/Opinion Mining of Spanish Customer Comments2/batch.csv
 ** Reading: ../tasks/Duration Range Evaluation 10_07/batch.csv
 ** Reading: ../tasks/Number of sense/batch.csv
 ** Reading: ../tasks/Commonsense Morality - Validate Instances/batch.csv
 ** Reading: ../tasks/Radiology Sentence Classification/batch.csv
 ** Reading: ../tasks/ATOMIC - Required Objects (Sequence)/batch.csv
 ** Reading: ../tasks/Evaluate free-text rationales/batch.csv
 ** Reading: ../tasks/Defeasible inference/batch.csv
 ** Reading: ../tasks/Detox/batch.csv
 ** Reading: ../tasks/Commonsense Morality-Text Label Validate-Collect-Extended/batch.csv
 ** Reading: ../tasks/Word Formality Annotation/batch.csv
 ** Reading: ../tasks/Abductive Reasoning/batch.csv
 ** Reading: ../tasks/Newyorker_joke_explain_quality/batch.csv
 ** Reading: ../tasks/Intuitive physics/batch.csv
(venv) bond@razer:~/cs590/web_agent_datasets/turking-bench/src$ python3 3_upload_tasks.py 
 -> Clarifiction
Error: failed to contact site
 -> Neurologic Recipe Eval
Error: failed to contact site
 -> Scalar Adjective Ordering
Error: failed to contact site
 -> Evaluate Questions Defeasibility
Error: failed to contact site
 -> Elicitation subj
Error: failed to contact site
 -> Question Gen
Error: failed to contact site
 -> Rewrite sentences (+ context)
Error: failed to contact site
 -> ATOMIC - NL Rephrase
Error: failed to contact site
 -> Ethics_sbic Dialogue
Error: failed to contact site
 -> Wiki103_Quality
Error: failed to contact site
 -> Essentian Terms in Questions
Error: failed to contact site
 -> Ethical rule-of-thumb quality
Error: failed to contact site
 -> ToTTo Evals
Error: failed to contact site
 -> ATOMIC - Event Sequences
Error: failed to contact site
 -> Annotate WaNLI
Error: failed to contact site
 -> ATOMIC Validate GL
Error: failed to contact site
 -> TuringAdvice
Error: failed to contact site
 -> Passive voice Parents 1st-2nd Person Persuasiveness Comparison
Error: failed to contact site
 -> Missing Adjective
Error: failed to contact site
 -> Triple Quality Eval
Error: failed to contact site
 -> Dialog Human Evaluation
Error: failed to contact site
 -> BiSECT Human Evaluation
Error: failed to contact site
 -> Story Eval [Quality]
Error: failed to contact site
 -> Spanish Word Alignment
Error: failed to contact site
 -> Visual Comet Multiple Choice Test Verify
Error: failed to contact site
 -> ATOMIC Neg Discriminator Eval
Error: failed to contact site
 -> Elicitation obj
Error: failed to contact site
 -> TrecQA
Error: failed to contact site
 -> DI Rationale Gen. evaluation
Error: failed to contact site
 -> PSTS-Lexsub
Error: failed to contact site
 -> Goal Distractor - ATOMIC base events 1
Error: failed to contact site
 -> IMDB Sentiment Completions
Error: failed to contact site
 -> Pseudoword Dataset Creation PPDB
Error: failed to contact site
 -> Rule-of-Thumb
Error: failed to contact site
 -> Opinion Mining of Spanish Customer Comments
Error: failed to contact site
 -> Event effect
Error: failed to contact site
 -> CommonsenseQA_Eval
Error: failed to contact site
 -> Generics Embodiment
Error: failed to contact site
 -> Style adaptation, pairwise, complex-simple
Error: failed to contact site
 -> Paraphrase Evaluation
Error: failed to contact site
 -> VQA Rationale Generation
Error: failed to contact site
 -> Automatic Detection of Generated Text
Error: failed to contact site
 -> Social Chem Eval
Error: failed to contact site
 -> Simplification-Meaning-Grammar-Simplicity
Error: failed to contact site
 -> Commonsense Morality - Verify Action Comprehensible
Error: failed to contact site
 -> ATOMIC - NL Rephrase Eval
Error: failed to contact site
 -> Rationale Generation
Error: failed to contact site
 -> Mauve human eval
Error: failed to contact site
 -> Typicality Judgements for Objects -disagreement run-
Error: failed to contact site
 -> Step 5 human performance
Error: failed to contact site
 -> Commonsense Morality - Class Label Collect
Error: failed to contact site
 -> Reddit In-group Analysis
Error: failed to contact site
 -> RocStories 1
Error: failed to contact site
 -> Dialogue rot annotation
Error: failed to contact site
 -> MinimalChangeInconsistentMiddles
Error: failed to contact site
 -> Style Adaptaion - Subjective-Objective
Error: failed to contact site
 -> NLI for GenGen
Error: failed to contact site
 -> ATOMIC - Object Rationale
Error: failed to contact site
 -> Scalar Adjectives Identification
Error: failed to contact site
 -> Mars human eval (a-b testing)
Error: failed to contact site
 -> Formalize sentence
Error: failed to contact site
 -> Recreation of the Dan Johnson
Error: failed to contact site
 -> Relative CommonsenseQA Explanation Pairwise Judgements Collection 3
Error: failed to contact site
 -> Chatbot Response Quality Evaluation
Error: failed to contact site
 -> O+S_Eval
Error: failed to contact site
 -> Evaluate Questions
Error: failed to contact site
 -> Generics Eval - Soft - all dataset eval
Error: failed to contact site
 -> Winogrande (object)
Error: failed to contact site
 -> neural-pop (PLAN evaluation) t5-human-test b
Error: failed to contact site
 -> Multimodal Prompt Caption Eval
Error: failed to contact site
 -> VisualCOMET Selection test
Error: failed to contact site
 -> Congressional Bills
Error: failed to contact site
 -> Full sentence style annotations
Error: failed to contact site
 -> Aductive Pairwise Eval
Error: failed to contact site
 -> PerSenT
Error: failed to contact site
 -> Compile list of area chairs
Error: failed to contact site
 -> JJ-NN HIT
Error: failed to contact site
 -> Generics Categories (Definitional)
Error: failed to contact site
 -> Lattice
Error: failed to contact site
 -> ANES 2008 open-ended survey
Error: failed to contact site
 -> Advice Gen
Error: failed to contact site
 -> Step 1 Generating Multi-Sentence Questions (science)
Error: failed to contact site
 -> COMET2020 ATOMIC Prag Neg Pilot
Error: failed to contact site
 -> Evaluate_e2e
Error: failed to contact site
 -> PSTS-Pivot
Error: failed to contact site
 -> Sentence Compression
Error: failed to contact site
 -> Commonsense Misinformation Tracking Pilot [cancer data setup]
Error: failed to contact site
 -> Polite versus rude
Error: failed to contact site
 -> Sentiment_Negative
Error: failed to contact site
 -> WikiHow Goal Membership
Error: failed to contact site
 -> Image captioning
Error: failed to contact site
 -> Human evaluation - gentle vs canary
Error: failed to contact site
 -> Science Questions test without any answers dropped
Error: failed to contact site
 -> Commonsense Morality - Text Label Validate-Collect
Error: failed to contact site
 -> Creating Sentence Paraphrases
Error: failed to contact site
 -> HTER
Error: failed to contact site
 -> Question Typing
Error: failed to contact site
 -> Spheres of Alturism
Error: failed to contact site
 -> WikiHow Step Membership
Error: failed to contact site
 -> BiSECT Multilingual Evaluation
Error: failed to contact site
 -> ESNLI Rationale Generation
Error: failed to contact site
 -> Step 4 Selecting good - bad answers to questions (short)
Error: failed to contact site
 -> HTER - longer sentences
Error: failed to contact site
 -> ATOMIC - Required Objects
Error: failed to contact site
 -> Newyorker_joke_explain_pairwise
Error: failed to contact site
 -> COMET2020 ATOMIC Inference Vp
Error: failed to contact site
 -> Essential Terms in Questions 2
Error: failed to contact site
 -> Evaluate the Quality of Explanations (Relative CommonsenseQA)
Error: failed to contact site
 -> Style adaptation, pairwise, subjective-objective
Error: failed to contact site
 -> Sherlock Four Sent Choice mixed
Error: failed to contact site
 -> Evaluate the Quality of Explanations (Relative NLI)
Error: failed to contact site
 -> Moral Judgement
Error: failed to contact site
 -> MCN - Multiple Choice testset
Error: failed to contact site
 -> Summarization
Error: failed to contact site
 -> Contextual Similarity -- Paraphrase Round
Error: failed to contact site
 -> Associate countries and languages with Ethnologue
Error: failed to contact site
 -> Generics Eval - Soft
Error: failed to contact site
 -> Commongen Evals (RLUE)
Error: failed to contact site
 -> Paraphrase Clustering with Merge
Error: failed to contact site
 -> Annotation subj_obj
Error: failed to contact site
 -> Annotation Rating
Error: failed to contact site
 -> PSTS-psts.c2v
Error: failed to contact site
 -> Text Game Eval
Error: failed to contact site
 -> Story Relations
Error: failed to contact site
 -> Reading Comprehension
Error: failed to contact site
 -> QuestionKeyPhrases
Error: failed to contact site
 -> DnD Identify Guidance
Error: failed to contact site
 -> Coherence Evaluation
Error: failed to contact site
 -> ATOMIC KB Completion
Error: failed to contact site
 -> Gun violence structured extraction
Error: failed to contact site
 -> ANLI Generation
Error: failed to contact site
 -> What breaks the flow - no categories
Error: failed to contact site
 -> InconsistentMiddles
Error: failed to contact site
 -> Generics Eval - Comparatives
Error: failed to contact site
 -> Evaluate the Quality of Explanations (Singular Acceptability NLI)
Error: failed to contact site
 -> Step 1 Generating Multi-Sentence Questions (CNN)
Error: failed to contact site
 -> Sentence Summarization
Error: failed to contact site
 -> NER - Scruples
Error: failed to contact site
 -> Winogrande plausiblity
Error: failed to contact site
 -> Photo Collection GVDB
Error: failed to contact site
 -> Script KD eval
Error: failed to contact site
 -> Opinion Mining of Spanish Customer Comments2
Error: failed to contact site
 -> Duration Range Evaluation 10_07
Error: failed to contact site
 -> Number of sense
Error: failed to contact site
 -> Commonsense Morality - Validate Instances
Error: failed to contact site
 -> Radiology Sentence Classification
Error: failed to contact site
 -> ATOMIC - Required Objects (Sequence)
Error: failed to contact site
 -> Evaluate free-text rationales
Error: failed to contact site
 -> Defeasible inference
Error: failed to contact site
 -> Detox
Error: failed to contact site
 -> Commonsense Morality-Text Label Validate-Collect-Extended
Error: failed to contact site
 -> Word Formality Annotation
Error: failed to contact site
 -> Abductive Reasoning
Error: failed to contact site
 -> Newyorker_joke_explain_quality
Error: failed to contact site
 -> Intuitive physics
Error: failed to contact site

devinat1 commented 4 months ago

Were you all able to run the entire script locally (rather than through GitHub Actions)?

devinat1 commented 4 months ago

I was able to run ./1_run_website.sh successfully after running poetry install.
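
For anyone else hitting the "Error: failed to contact site" lines in the log above: they suggest the upload step ran while the site was not reachable. A minimal sketch of a readiness check that could run before python 3_upload_tasks.py, instead of relying on a fixed sleep 30. The function name wait_for_server and the URL in the usage note are assumptions for illustration, not part of the repository.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout=60.0, interval=1.0, opener=urllib.request.urlopen):
    """Poll `url` until something answers or `timeout` seconds pass.

    Returns True if the server responded (even with an HTTP error status),
    False if it could not be contacted before the deadline.
    `opener` is injectable so the function can be exercised without a network.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Any successful response means the site is up.
            opener(url).close()
            return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused, DNS failure, etc.: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Usage might look like wait_for_server("http://localhost:8000"), assuming the site is served on Django's default development port; adjust the URL to wherever ./1_run_website.sh actually serves it. If it returns False, the server never came up, and the server's own output is the place to look for the missing package.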

JHU-CLSP / turking-bench

requirements.txt does not include packages needed to run ./1_run_website.sh #131
