run_replay.py cannot deal with github urls and gets stuck in infinite loop

klieret commented 6 months ago

python run_replay.py --traj_path=trajectories/fuchur/gpt4__klieret__swe-agent-test-repo__default_from_url__t-0.20__p-0.95__c-2.00__install-1/klieret__swe-agent-test-repo-i1.traj --data_path=trajectories/fuchur/gpt4__klieret__swe-agent-test-repo__default_from_url__t-0.20__p-0.95__c-2.00__install-1/all_preds.jsonl --config_file=config/default_from_url.yaml

INFO     💽 Loaded dataset from klieret__swe-agent-test-repo-i1.jsonl
INFO     🌱 Environment Initialized
INFO     ▶️  Beginning task 0
Traceback (most recent call last):
  File "/Users/fuchur/Documents/24/git_sync/SWE-agent/run.py", line 85, in main
    observation, info = env.reset(index)
  File "/Users/fuchur/Documents/24/git_sync/SWE-agent/sweagent/environment/swe_env.py", line 135, in reset
    self.base_commit = self.record["base_commit"]
KeyError: 'base_commit'
WARNING  ❌ Failed on klieret__swe-agent-test-repo-i1: 'base_commit'
INFO     Beginning environment shutdown...
INFO     Agent container stopped
INFO     🌱 Environment Initialized
INFO     ▶️  Beginning task 1
Traceback (most recent call last):
  File "/Users/fuchur/Documents/24/git_sync/SWE-agent/run.py", line 85, in main
    observation, info = env.reset(index)
  File "/Users/fuchur/Documents/24/git_sync/SWE-agent/sweagent/environment/swe_env.py", line 135, in reset
    self.base_commit = self.record["base_commit"]
KeyError: 'base_commit'
WARNING  ❌ Failed on klieret__swe-agent-test-repo-i1: 'base_commit'
INFO     Beginning environment shutdown...
INFO     Agent container stopped
INFO     🌱 Environment Initialized
INFO     ▶️  Beginning task 2
Traceback (most recent call last):
  File "/Users/fuchur/Documents/24/git_sync/SWE-agent/run.py", line 85, in main
    observation, info = env.reset(index)
  File "/Users/fuchur/Documents/24/git_sync/SWE-agent/sweagent/environment/swe_env.py", line 135, in reset
    self.base_commit = self.record["base_commit"]
KeyError: 'base_commit'
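
The traceback above fails on `self.record["base_commit"]`, which suggests that the task instance record built from a GitHub URL carries no `base_commit` entry. A minimal sketch for checking a `.jsonl` data file for such records before starting a replay is shown here. The helper name `find_records_missing_key` and the `instance_id` field are assumptions made for illustration; this is not part of SWE-agent.

```python
import json
from pathlib import Path


def find_records_missing_key(jsonl_path, key="base_commit"):
    """Return (line_number, instance_id) pairs for records that lack `key`.

    Hypothetical helper: assumes one JSON object per line and that records
    may carry an "instance_id" field (falls back to "<unknown>" otherwise).
    """
    missing = []
    with Path(jsonl_path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if key not in record:
                missing.append((line_number, record.get("instance_id", "<unknown>")))
    return missing
```

An empty return value means every record has the key; anything else lists the records that would trigger the `KeyError` seen in the log.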

ofirpress commented 6 months ago

I think this is a deprecated feature related to something we explored that didn't end up making it into the final model. @john-b-yang will know for sure.

klieret commented 6 months ago

Being able to do replays is quite nice for testing purposes (like testing the "open PR" mechanism etc.). If it's not used much right now, I'll just merge the fix and we can remove the whole thing later if we so choose.

ofirpress commented 6 months ago

ah ok if you found a use for this we should keep it! thanks

john-b-yang commented 6 months ago

I just cleaned up this file a bit in db344162b1d243eff87fd5fb291313839ba99124 (mainly just removed the process_synthetic_trajs function and the --action_trajs_path argument).

This is a helpful file. The main use case is creating a demonstration for a new configuration from an existing set of actions (given by the --traj_path argument). Instead of having to create it manually with python run.py --model human ..., one can just run:

# --data_path is optional: it is automatically inferred from the experiment folder if available.
# --suffix is optional.
python run_replay.py \
        --traj_path trajectories/carlosejimenez/<experiment>/pallets_flask-123.traj \
        --config_file config/<new config file name>.yaml \
        --data_path path/to/file/with/pallets_flask-123-task-instance.jsonl \
        --suffix run1

princeton-nlp / SWE-agent

run_replay.py cannot deal with github urls and gets stuck in infinite loop #47