facebookresearch / Mephisto

A suite of tools for managing crowdsourcing tasks from the inception through to data packaging for research use.
https://mephisto.ai/
MIT License

[Error] the runs in Mturk can not be completed #771

Closed acha21 closed 2 years ago

acha21 commented 2 years ago

Hello, I am Yeonchan Ahn in South Korea. Thank you for sharing a great framework to use Mturk.

The day before yesterday (4/26), I ran a static React-based survey using Mephisto v1.0.1 on Mturk with Heroku hobby. For about 1 to 2 hours I confirmed that data was being collected, but after some time it stopped working properly. To check whether the problem repeats, yesterday (4/27) I re-ran the same script with a small number of units (216) and saw the same behavior: I collected 171 results and failed to collect the rest. The following config is what I used yesterday.

#@package _global_
defaults:
  - /mephisto/blueprint: model_eval_static_blueprint
  - /mephisto/architect: heroku
  - /mephisto/provider: mturk
mephisto:
  blueprint:
    task_source: ${task_dir}/webapp/build/bundle.js
    link_task_source: false
    extra_source_dir: ${task_dir}/webapp/src/static
    units_per_assignment: 3
    onboarding_qualification: kgc-grounding-p2
    data_jsonl: ${task_dir}/data/processed/test/a=bi_b=b_added_eval.jsonl
    onboarding_data: ${task_dir}/task_config/onboarding.json
  task:
    task_name: kgc-conv-main-added
    task_title: "Test #2"
    task_description: "In this task, you'll be given a conversational context and multiple responses. It is your job to rate the responses."
    task_reward: 0.25
    task_tags: "conversation,dialog,button,ai,evaluation"
    maximum_units_per_worker: 50
    max_num_concurrent_units: 50

I am frustrated that I can't even figure out which part causes the problem, so I am uploading the whole log from yesterday's failed run: scripts.log

Here is another piece of information that may be useful. I am currently running a survey to evaluate my AI systems using Mephisto v1.0.1 (a216d2d6ba739aadde2cacaa906dad5e78d6dc2f). I implemented the survey UI starting from a copy of examples/static_react_task. In the UI, I implemented an onboarding step, but it serves only as a demo, which means every worker who submits any answer can participate in our main survey.

JackUrb commented 2 years ago

Hi @acha21, thanks for opening this issue and providing so much context. I'm going to take a try at digging through the logs and debugging it tomorrow, though I want to note you may have better luck isolating the problem on your side using the mephisto metrics tooling in the meantime.

JackUrb commented 2 years ago

Having gone through the logs: I recently pushed a fix to main that should resolve the bugs related to handle_updated_agent_status. The sqlite3.IntegrityError: UNIQUE constraint failed: workers.worker_name errors in the log are much stranger.

The following section should only try to create an entry in the database if it doesn't already exist: https://github.com/facebookresearch/Mephisto/blob/main/mephisto/operations/worker_pool.py#L152-L165

I'm not sure I know how to reproduce this second part given the above.
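
For illustration only, the "create an entry only if it doesn't already exist" pattern described above can be sketched with the standard sqlite3 module. This is not the Mephisto code: the table layout and the helper name get_or_create_worker are assumptions made for the example. The sketch also shows one general way such a check can still hit a UNIQUE constraint (another writer inserting the same name between the lookup and the insert), without claiming that this is what happened in the attached log.

```python
import sqlite3


def get_or_create_worker(conn: sqlite3.Connection, worker_name: str) -> int:
    """Return the id of the row for worker_name, creating it if absent.

    Hypothetical helper for illustration; not part of Mephisto.
    """
    query = "SELECT worker_id FROM workers WHERE worker_name = ?"
    row = conn.execute(query, (worker_name,)).fetchone()
    if row is not None:
        return row[0]
    try:
        cur = conn.execute(
            "INSERT INTO workers (worker_name) VALUES (?)", (worker_name,)
        )
        conn.commit()
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # Another writer inserted the same name after our SELECT;
        # discard our attempt and fetch the row that now exists.
        conn.rollback()
        return conn.execute(query, (worker_name,)).fetchone()[0]


# Assumed schema: worker_name carries a UNIQUE constraint.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workers ("
    "worker_id INTEGER PRIMARY KEY, worker_name TEXT UNIQUE)"
)

first_id = get_or_create_worker(conn, "WORKER_A")
second_id = get_or_create_worker(conn, "WORKER_A")  # reuses the existing row
```

Calling the helper twice with the same name returns the same id and leaves a single row, whereas a bare INSERT of a duplicate name raises sqlite3.IntegrityError.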

That being said, we're observing a strange slowdown issue in Mephisto on the current main branch (also reported by @Alex-Gurung) where the collection rate decreases towards zero as more data is collected. While we expect some decrease of this sort, it should never actually reach zero. As such we'll be investigating this next week.

acha21 commented 2 years ago

Thank you for your quick response.

Since I am now pressed for time, I need a quick workaround for this issue. Do you think the sqlite3.IntegrityError: UNIQUE constraint failed: workers.worker_name error in the log is related to the issue where the collection rate decreases towards zero as more data is collected? If not, I am thinking about splitting the whole RunTask into small independent pieces as a workaround, but that approach cannot enforce maximum_units_per_worker across the pieces.

Do you have any suggestions?

JackUrb commented 2 years ago

Splitting the task into multiple runs won't help (and would likely make the worker name issue worse). If you're on a tight deadline, I'd suggest relaunching periodically, every few hours.

JackUrb commented 2 years ago

Hi @acha21, this should now be fixed in #770. I'll be moving to merge it later this week, but feel free to try the branch out sooner. Let me know if your issues are resolved afterwards!

JackUrb commented 2 years ago

Closing as fixed in our most recent release (1.0.3)