mit-ll / spacegym-kspdg

Non-cooperative satellite operations challenge problems implemented in the Kerbal Space Program game engine
MIT License

Output error of Example: Agent-Environment Evaluator for SciTech Challenge #18

Open henrybb0826 opened 2 weeks ago

henrybb0826 commented 2 weeks ago

Hello, I am running the "Example: Agent-Environment Evaluator for SciTech Challenge". The command python evaluate.py configs/example_eval_cfg.yaml runs to the end, but the screen then gets stuck at ~Closing KSPDG environment~ and the txt file is never written. Is there something wrong with my settings?

rallen10 commented 2 weeks ago

@henrybb0826 Thank you for creating this issue. Can you start by telling me what operating system you are using (Windows, Mac, Linux) and what version of python (3.9, 3.12)?

rallen10 commented 2 weeks ago

@henrybb0826 Two other questions

rallen10 commented 1 week ago

I suspect this is caused by an underlying problem with the julia dependencies used in the LG3 environments. Here are a few debugging steps that may help (or may help me help you):

python scripts/example_private_src_env_runner.py 2>&1 | tee ~/Desktop/output.txt
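Since the symptom here is a process that never returns, a variant of the capture step that also enforces a time limit may be useful. The following is a minimal sketch, not part of kspdg: the helper name `run_and_capture`, the log path, and the timeout value are all hypothetical and should be adjusted to your setup.

```python
import subprocess
import sys


def run_and_capture(cmd, log_path, timeout_s):
    """Run cmd, save combined stdout/stderr to log_path, and flag a hang via timeout."""
    timed_out = False
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, like 2>&1
            timeout=timeout_s,
        )
        output, returncode = proc.stdout, proc.returncode
    except subprocess.TimeoutExpired as exc:
        # The child is killed on timeout; keep whatever it printed before that.
        timed_out = True
        output, returncode = exc.output or b"", None
    text = output.decode(errors="replace")
    with open(log_path, "w", encoding="utf-8") as f:
        f.write(text)
    return {"timed_out": timed_out, "returncode": returncode, "output": text}


# Hypothetical usage (timeout chosen arbitrarily):
# result = run_and_capture(
#     [sys.executable, "scripts/example_private_src_env_runner.py"],
#     "output.txt",
#     timeout_s=600,
# )
# print("hung" if result["timed_out"] else "exit code %s" % result["returncode"])
```

If `timed_out` comes back true, the saved log shows the last lines printed before the hang, which is the part worth attaching to the issue.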

strsix commented 2 days ago

We've faced a similar problem: the episode never ends during the evaluation, the julia dependencies don't install, and the serverless tests cause a segfault.

Here are some workarounds that worked for us. Hopefully they help others who are facing the same errors.

  1. conda remove --name kspdg --all
  2. Reset "environment.yml" to its default contents (if you added or changed anything there). Then change the python version from 3 to 3.9 in "environment.yml"
  3. pip install poliastro (if you are using it within your algorithm)
  4. python install_julia_deps.py
  5. pytest tests/serverless_tests/

Steps 1 and 2 were what made things work.

rallen10 commented 1 day ago

@strsix: I don't have a good explanation for why you were getting the segfault and why your workaround fixed it, but my best guess is that it has something to do with the installation order of dependencies. That is to say, additional dependencies for agent development (e.g. poliastro in your case) need to be installed after kspdg's own dependencies (found in pyproject.toml and install_julia_deps.py).

Also, it does not seem to be strictly necessary, but I strongly recommend using juliaup to install julia on your machine before installing kspdg's environment. This helps manage different julia versions, similar to how conda helps manage different python versions on one computer.
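To see which python version and julia tooling a given shell or conda env actually picks up, a small report like the one below can be run inside the activated environment. This is only a sketch: `toolchain_report` is a hypothetical helper, and it assumes the executables are named `juliaup` and `julia` on your PATH.

```python
import shutil
import sys


def toolchain_report():
    """Report the running python version and where juliaup/julia resolve on PATH."""
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "juliaup": shutil.which("juliaup"),  # None if not found on PATH
        "julia": shutil.which("julia"),      # None if not found on PATH
    }


# Hypothetical usage:
# for name, value in toolchain_report().items():
#     print(name, "->", value)
```

A `None` entry means the tool is not visible from the current environment, which is worth fixing before running install_julia_deps.py.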

Therefore, the installation process for developing new agents to solve kspdg's challenge problems might look roughly like

# install juliaup on MacOS, for further instructions: https://github.com/JuliaLang/juliaup#installation
curl -fsSL https://install.julialang.org | sh

# create the kspdg conda environment and then clone it to repurpose it for agent development
conda env create -f environment.yml   # this creates the kspdg environment
conda create --name kspdg_agents --clone kspdg   # this creates a new env called kspdg_agents which starts as a copy of kspdg env
conda remove --name kspdg --all    # optionally delete the original kspdg env if you don't plan to use it

# install additional julia dependencies for the "adv_bots" kspdg environments (e.g. LBG1_LG3)
conda activate kspdg_agents
python install_julia_deps.py

# NOW install any additional dependencies you want to use for your kspdg agents within the kspdg_agents conda env
pip install <other-dependencies-you-want>

I haven't fully tested this process, but I will update the README install instructions once I have a chance to vet it.
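After following an install order like the one above, it can help to confirm from inside the kspdg_agents env that both kspdg and the extra agent dependencies are still importable. The sketch below uses a hypothetical helper, `missing_modules`, and assumes the import names match the package names (e.g. `kspdg`, `poliastro`); it only checks that the modules can be located, not that they run correctly.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical usage, run with the kspdg_agents env activated:
# print(missing_modules(["kspdg", "poliastro"]))
```

An empty list means every listed module was found. Any name that is returned needs to be reinstalled in that env.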