Closed nikolajurkovic closed 1 month ago
Confirmed that this is also happening for ID 2 (ai_rd_nanogpt_chat_rl/main@feature/viv-intermediate-scoring) so it's probably a problem with most/all tasks.
It's possible that you're logging in before the task is fully set up. Many AI R&D tasks have fairly lengthy start times (5-10 minutes). Also, whether it's the first or second time for a task is irrelevant; that's a red herring.
If you see this again, please check out /agent_output for any log files and include them here. I pushed some changes yesterday that might have broken things.
Sami, I think you're right re: logging in before the task is set up.
I've modified the baseline-ops prep script to wait until TaskFamily.start() completes before doing the rest of the setup, which I think will fix quite a few things.
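For anyone hitting this before the prep-script change lands, here is a rough sketch of the "wait until setup is done" idea. It is not the actual baseline-ops code: `wait_until` and `helpers_installed` are hypothetical names, and checking that the helper commands are on PATH is only an assumed proxy for TaskFamily.start() having finished.

```python
import shutil
import time
from typing import Callable, Sequence


def wait_until(
    ready: Callable[[], bool],
    timeout_s: float = 900.0,
    interval_s: float = 5.0,
) -> bool:
    """Poll ready() until it returns True or timeout_s elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def helpers_installed(commands: Sequence[str] = ("mnote", "mclock")) -> bool:
    """Assumed readiness check: every helper command resolves on PATH."""
    return all(shutil.which(cmd) is not None for cmd in commands)


if __name__ == "__main__":
    # Default timeout is 15 minutes, above the 5-10 minute start times
    # mentioned earlier in this thread.
    if wait_until(helpers_installed):
        print("Helper commands found; setup looks complete.")
    else:
        print("Timed out waiting for helper commands.")
```

If the real setup writes a marker file or log line when start() completes, that would be a better readiness check than probing PATH.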
Running a human run for a task for the second time appears to break the agent helper files and commands.
When I ran the task ai_rd_optimize_llm_foundry/main@feature/viv-intermediate-scoring (ID 3 in the airtable) for the first time, I successfully connected to the machine, saw the welcome message, and used the helper commands like mnote, mclock, etc. However, when I set the "Send to Human Run Maker" field to blank and back, and used the provided ssh key and command to connect to the machine, there were some problems:
-bash: mclock: command not found