wil3 / gymfc

A universal flight control tuning framework
http://wfk.io/neuroflight/
MIT License

Testing New Model Performance #62

Closed. DroneMesh closed this issue 4 years ago.

DroneMesh commented 4 years ago

Hi Wil,

I have set up your latest gymfc with the NF1 example and everything is working. However, previously we were able to run pi.act(False, ob=ob)[0] to test the model and do our own custom graphing, but something has changed with your new baselines repo. Can you clarify how to run the trained model manually to test it?

Thanks

wil3 commented 4 years ago

Hey @DroneMesh, could you provide me with some more information? Are you saying training with PPO1 from here isn't working anymore, or just evaluation? Have you been able to generate the checkpoints with this script? What exactly isn't working? Please provide all commands executed and their output.

DroneMesh commented 4 years ago

Hi Wil,

Using the PPO1 example, training works well and checkpoints are being generated correctly. However, I am unable to test the model and graph its performance.

def train(env, num_timesteps, seed, flight_log_dir=None, ckpt_dir=None,
          render=False, ckpt_freq=0, restore_dir=None, optim_stepsize=3e-4,
          schedule="linear", gamma=0.99, optim_epochs=10, optim_batchsize=64,
          horizon=2048):
    # ... (elided: policy_fn and flight_log are set up here) ...
    pi = pposgd_simple.learn(env, policy_fn,
            max_timesteps=num_timesteps,
            timesteps_per_actorbatch=horizon,
            clip_param=0.2, entcoeff=0.0,
            optim_epochs=optim_epochs, optim_stepsize=optim_stepsize,
            optim_batchsize=optim_batchsize,
            gamma=0.99, lam=0.95, schedule=schedule,
            flight_log=flight_log,
            ckpt_dir=ckpt_dir,
            restore_dir=restore_dir,
            save_timestep_period=ckpt_freq
            )
    env.close()

    return pi

pi = train(num_timesteps=1, seed=args.seed, env_id=args.env)

ob = env.reset()  # initial observation for the evaluation rollout
actuals = []
desireds = []
while True:
    desired = env.true_error
    actual = env.measured_error
    actuals.append(actual)
    desireds.append(desired)
    print("sp=", desired, " rate=", actual)
    action = pi.act(stochastic=False, ob=ob)[0]  ### THIS IS NO LONGER WORKING
    ob, _, done, _ = env.step(action)
    if done:
        break
plot_step_response(np.array(desireds), np.array(actuals))
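
For context, plot_step_response here is just a simple plotting helper for the step response; roughly it does something like this (matplotlib, assuming each row of the arrays holds the three roll/pitch/yaw rates):

import matplotlib.pyplot as plt
import numpy as np

def plot_step_response(desireds, actuals):
    # Overlay the desired setpoint and the measured rate for each axis.
    steps = np.arange(len(desireds))
    for i, axis in enumerate(["roll", "pitch", "yaw"]):
        plt.plot(steps, desireds[:, i], "--", label="desired " + axis)
        plt.plot(steps, actuals[:, i], label="measured " + axis)
    plt.xlabel("step")
    plt.ylabel("angular rate")
    plt.legend()
    plt.show()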

The problem now is that the pi.act function no longer works on this version of baselines. Do you know what has changed? If you have a premade script for testing your model's performance, that would help as well. I can do different variants of it and set up a pull request for the current repo.

DroneMesh commented 4 years ago

Hi Wil,

You can close this. I ended up using the latest Stable-Baselines and modified the NF1 example slightly. Once I finish and clean up the code I will open a pull request for a new NF1 example that is compatible with the latest Stable-Baselines, along with a script to test its performance by graphing motor output, target setpoints, and the error.
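
With Stable-Baselines the deterministic evaluation loop ends up looking roughly like this (the model path is a placeholder, and env is assumed to be the same gymfc environment with the true_error/measured_error attributes used in the snippet above):

import numpy as np
from stable_baselines import PPO1

# Load a model previously saved with model.save(); the path is a placeholder.
model = PPO1.load("./models/nf1_ppo1")

ob = env.reset()
actuals, desireds = [], []
while True:
    desireds.append(env.true_error)
    actuals.append(env.measured_error)
    # deterministic=True plays the role of the old pi.act(stochastic=False, ob=ob)
    action, _states = model.predict(ob, deterministic=True)
    ob, _, done, _ = env.step(action)
    if done:
        break
plot_step_response(np.array(desireds), np.array(actuals))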

wil3 commented 4 years ago

This project has only provided code and examples for OpenAI Baselines, so if you are using Stable Baselines that's probably why it didn't work. Be sure to thoroughly read https://github.com/wil3/gymfc/blob/master/CONTRIBUTING.md before opening a PR. PRs are tied to issues, so it's up to you when you want to close an issue you open.

DroneMesh commented 4 years ago

I was using your OpenAI Baselines branch but gave up on it. I am now using Stable-Baselines, which is actually working really well and has a ton more features for debugging and custom callbacks.


wil3 commented 4 years ago

Hi @DroneMesh, since this issue was about code supported in the repo, I'm going to close it; the PR you briefly mentioned is a separate issue. If you are still interested in contributing and submitting a PR, please open a feature request issue outlining the intended changes for that specific PR.

To add a policy for another trainer you'll just need a new policy here and an example of how that trainer is instantiated, like the OpenAI Baselines example. In fact, baselinespolicy.py can really be generalized to a TensorFlow checkpoint policy that is inherited, with the subclasses only differing in the tensor names for the input and output. I have a Tensorforce policy I plan to add soon too.
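
A rough sketch of what I have in mind (the class and tensor names below are only illustrative; the real names depend on how each trainer builds its graph, so inspect the checkpoint before relying on them):

import numpy as np
import tensorflow as tf

class CheckpointPolicy(object):
    """Generic TensorFlow checkpoint policy; subclasses only supply tensor names."""
    input_tensor_name = None
    output_tensor_name = None

    def __init__(self, ckpt_dir):
        self.sess = tf.Session()
        ckpt = tf.train.latest_checkpoint(ckpt_dir)
        saver = tf.train.import_meta_graph(ckpt + ".meta")
        saver.restore(self.sess, ckpt)
        graph = tf.get_default_graph()
        self.ob_ph = graph.get_tensor_by_name(self.input_tensor_name)
        self.action_op = graph.get_tensor_by_name(self.output_tensor_name)

    def action(self, ob):
        # Run a single observation through the restored graph and strip the batch dim.
        return self.sess.run(self.action_op, {self.ob_ph: ob[np.newaxis]})[0]

class BaselinesCheckpointPolicy(CheckpointPolicy):
    # Illustrative names only; check the saved graph for the actual tensor names.
    input_tensor_name = "pi/ob:0"
    output_tensor_name = "pi/action:0"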

I've also added evaluation scripts and plotters in the examples directory that may help you.