renatolfc / sched-rl-gym

MIT License
reproducibility issues on Windows #1

aceyang0114 closed this issue 2 years ago

aceyang0114 commented 2 years ago

Hi, I ran into a problem while trying to reproduce the results. The parallelworkloads module doesn't work.

Error message:
Traceback (most recent call last):
  File "C:/Users/USER/Desktop/sched-rl-gym/main.py", line 5, in <module>
    import schedgym.envs as deeprm
  File "C:\Users\USER\Desktop\sched-rl-gym\schedgym\envs\__init__.py", line 7, in <module>
    from .deeprm_env import DeepRmEnv
  File "C:\Users\USER\Desktop\sched-rl-gym\schedgym\envs\deeprm_env.py", line 13, in <module>
    from .base import BaseRmEnv
  File "C:\Users\USER\Desktop\sched-rl-gym\schedgym\envs\base.py", line 13, in <module>
    from .simulator import SimulationType, DeepRmSimulator
  File "C:\Users\USER\Desktop\sched-rl-gym\schedgym\envs\simulator.py", line 9, in <module>
    from schedgym.envs.workload import (
  File "C:\Users\USER\Desktop\sched-rl-gym\schedgym\envs\workload.py", line 12, in <module>
    from parallelworkloads.lublin99 import Lublin99
ModuleNotFoundError: No module named 'parallelworkloads.lublin99'

Process finished with exit code 1

When I check the installed parallelworkloads package directory, it contains only an __init__.py file. I then tried installing the module from GitHub, but it still doesn't work. Could you help me?

renatolfc commented 2 years ago

Hi.

First of all, did you try installing sched-rl-gym, or did you just try importing it from the GitHub repo?

You should use pip install . at the root of the repo so that dependencies are downloaded automatically.

You will have better luck installing this on a Linux system (or under WSL2), as the C/C++ code in the parallelworkloads package won't work on Windows. For Linux systems, there are prebuilt packages that pip should be able to download.
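
To check whether the compiled submodule is actually available in a given environment, something like the following may help. It is a minimal sketch that uses only the standard library; `has_module` is a hypothetical helper name and is not part of sched-rl-gym or parallelworkloads.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located on the import path."""
    try:
        # For a dotted name, find_spec imports the parent package first,
        # then looks for the submodule inside it.
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


# The submodule that failed to import in the traceback above.
print(has_module("parallelworkloads.lublin99"))
```

If this prints False even though parallelworkloads itself imports, the package was most likely installed without its compiled parts, which matches the Windows limitation described above.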

aceyang0114 commented 2 years ago

Hi, thanks for the help. I tried running the Colab tutorial and ran into a problem. When I run the import cell, it shows this error message:

AttributeError: module 'schedgym.envs.deeprm_env' has no attribute 'DEFAULT_WORKLOAD'

DEFAULT_WORKLOAD isn't defined in deeprm_env.py, so I tried using the similar structure from compact_env.py, but that doesn't work either.

Could you help me solve the problem?

renatolfc commented 2 years ago

Thanks for catching the error with the notebook tutorial. Some classes got moved around in my last refactor, and I had forgotten to update the tutorial.

I've just pushed a couple of changes that fix the issue you were having. I just tested the code in a new colab runtime and the learning function is running without issues.

If you happen to have saved a copy of the notebook, what you can do is change the notebook cell that defines the workload to read:

import schedgym.envs.base as base

SLOTS: int = 10
BACKLOG: int = 60
TIME_LIMIT: int = 50
TIME_HORIZON: int = 20

workload = base.DEFAULT_WORKLOAD
workload['new_job_rate'] = .3
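
One thing to be aware of with a cell like this: binding `base.DEFAULT_WORKLOAD` to `workload` and then assigning a key modifies the module-level object itself, assuming it is a plain mutable mapping (which the item assignment suggests). If you would rather leave the default untouched, copying it first is one option. The sketch below uses a stand-in dict, since the real contents of `DEFAULT_WORKLOAD` are not shown in this thread; the starting value is a placeholder.

```python
import copy

# Stand-in for schedgym.envs.base.DEFAULT_WORKLOAD. The real contents are
# not shown in this thread, so this key/value pair is only a placeholder.
DEFAULT_WORKLOAD = {'new_job_rate': 0.7}

# Copy first so the module-level default is left unchanged.
workload = copy.deepcopy(DEFAULT_WORKLOAD)
workload['new_job_rate'] = .3
```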

Note that you need to delete the sched-rl-gym cache in your Colab instance before re-running the notebook.

aceyang0114 commented 2 years ago

Sorry, I ran into another problem. I tried running

rewards, slowdowns, returns, means = evaluate(
    PPO, 'DeepRM-v0', 'ppo_deeprm-{}', 60
)

and got this error:

AttributeError                            Traceback (most recent call last)
<ipython-input-18-e5fc086f7e6e> in <module>()
      1 rewards, slowdowns, returns, means = evaluate(
----> 2     PPO, 'DeepRM-v0', 'ppo_deeprm-{}', 60
      3 )

2 frames
<ipython-input-15-9d1a9d77913f> in evaluate(model_class, env, template, runs, total_epochs)
      9     reward, slowdown, ret, mean = zip(
     10       *[run_baselines_model((env, model, j, False, workload))
---> 11         for j in range(runs)]
     12     )
     13     rewards.append(reward)

<ipython-input-15-9d1a9d77913f> in <listcomp>(.0)
      9     reward, slowdown, ret, mean = zip(
     10       *[run_baselines_model((env, model, j, False, workload))
---> 11         for j in range(runs)]
     12     )
     13     rewards.append(reward)

<ipython-input-14-fef81e53cd8b> in run_baselines_model(args)
     12         if done:
     13             break
---> 14     return np.sum(rewards), np.mean(env.slowdown), discount(rewards, 0.99)[0],           np.mean(rewards)

AttributeError: 'DeepRmEnv' object has no attribute 'slowdown'

Could you help me solve the problem?

renatolfc commented 2 years ago

Sure. I was just debugging that. Turns out, the slowdown attribute got moved around as well.

Anyway, I've just pushed a potential fix.

That cell should read something like the example below:

def run_baselines_model(args):
    # Assumes `np` (numpy) and the notebook's `setup_environment` and
    # `discount` helpers are already defined in earlier cells.
    env, model, i, raw, workload = args
    env = setup_environment(env, workload=workload)
    env.use_raw_state = raw
    np.random.seed(i)
    rewards = []
    state = env.reset()
    for _ in range(200):
        action, _ = model.predict(state, deterministic=True)
        state, reward, done, info = env.step(action)
        rewards.append(reward)
        if done:
            break
    # Per-job slowdown: 1 + waiting time / execution time. Jobs whose
    # start_time is -1 are counted as waiting up to the current time.
    mean_slowdown = np.mean([
        1 + 1 / j.execution_time * (
            (j.start_time - j.submission_time)
            if j.start_time != -1
            else (env.scheduler.current_time - j.submission_time)
        )
        for j in env.scheduler.all_jobs
    ])
    return (np.sum(rewards), mean_slowdown, discount(rewards, 0.99)[0],
            np.mean(rewards))
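
To make the slowdown expression in that cell easier to check by hand, here is a small standalone sketch of the same formula. The `Job` dataclass and the `mean_slowdown` function are hypothetical stand-ins for illustration; only the attribute names (`submission_time`, `start_time`, `execution_time`) and the use of -1 for `start_time` are taken from the cell above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Job:
    # Hypothetical stand-in for the scheduler's job objects.
    submission_time: int
    start_time: int       # -1 is treated as "has not started"
    execution_time: int


def mean_slowdown(jobs, current_time):
    """Average of 1 + wait / execution_time over all jobs."""
    slowdowns = []
    for j in jobs:
        if j.start_time != -1:
            wait = j.start_time - j.submission_time
        else:
            # Job has not started: count the wait up to the current time.
            wait = current_time - j.submission_time
        slowdowns.append(1 + wait / j.execution_time)
    return mean(slowdowns)


jobs = [
    Job(submission_time=0, start_time=0, execution_time=10),   # no wait
    Job(submission_time=0, start_time=5, execution_time=5),    # waited 5
    Job(submission_time=2, start_time=-1, execution_time=4),   # still waiting
]
result = mean_slowdown(jobs, current_time=10)
print(result)
```

A job that starts the moment it is submitted has a slowdown of 1, and the value grows with the time spent waiting relative to the job's own length.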

aceyang0114 commented 2 years ago

Thanks a lot!