aikkala / user-in-the-box

GNU General Public License v3.0
32 stars 8 forks source link

Updated step and reset function to match new gym convention (gym>=0.26); added rendering function and new render_mode "rgb_array_list" to simulator instances #11

Closed fl0fischer closed 1 year ago

fl0fischer commented 2 years ago

Did you test that the training and evaluator works?

Both training and evaluation seem to work, however, this requires a currently unpublished version of stable-baselines3 (https://github.com/carlosluis/stable-baselines3/tree/fix_tests). I would thus suggest to continue working on this branch and merge with main branch as soon as stable-baselines3 officially supports the new gym version.

Since we're not copying the evaluator.py file into the generated simulators (should we?)

I would rather go for copying the evaluator.py file to the simulator as well, as simulators should be considered as standalone modules, right? For reasons of consistency, I think we should thus copy all relevant files (including all test and evaluation scripts) when building the generator, to avoid version mismatches. On the other hand, I really like your suggested versioning approach and will try to implement that (including the id->version renaming). Maybe we can also add an option which allows a simulator to import relevant scripts (e.g., evaluator.py) from the uitb class itself rather than from the copied simulator class files [UPDATE: just noticed that this already exists ^^], in case someone wants to evaluate a policy trained on an older version using the most recent evaluation scripts (maybe with a warning raised, if first or second version digit has changed)?

We maybe should create an optional "HumanViewable" wrapper class / parent class for vision modules, so that we can standardize what should be returned when rendering (see line 410 in simulator.py)

Yeah, I will create a draft on that.

fl0fischer commented 2 years ago

For "HumanViewable" rendering, I suggest the following structure (see last commits):

How do you think about this rendering pipeline?

aikkala commented 2 years ago

Thanks Florian, this is pretty impressive. Two notes:

1) Could you change the Simulator.version to only contain the numbers in format "x.y.z", so instead of version="uitb:simulator-v1.1.0" we'd have version="1.1.0" 2) I didn't quite get why the property fps is defined in BaseTask? Where is it called from?

Did you test these changes on any of the existing config files? I'll test them out as well, and after that I don't see why we couldn't merge them to main. How soon do you need them merged? Of course your students can just use this branch if we encounter some problems.

fl0fischer commented 2 years ago

Sure, I updated the version numbering to the format you suggested. The fps property is used in one of our tutorials, but you're right, it does not make much sense to call it from the task module. I removed that from the BaseTask and added it to the Simulator Class (returning the fps for the main camera image taken from self._camera).

On the pull request: I'll update setup.py to use the mentioned alpha version of stable-baselines3 that is compatible with gym>=0.26, until this has been merged into the official stable baselines repo. I will also run some further tests (after copying the updated classes to the provided simulators) for training and evaluation.

fl0fischer commented 2 years ago

Both training and evaluation seem to run smoothly for all four tasks (did not run training until convergence).

aikkala commented 2 years ago

Ok, sounds good! Just to confirm, the current pre-trained models won't work with the evaluator.py in this branch? I'd rather wait until the end of next week (week 44) before merging this, in case someone at UIST wants to test out the repo. Would that work for you?

fl0fischer commented 2 years ago

The current pre-trained models (i.e., the checkpoints in the simulator directory, which I did not update) should also work with the current evaluator.py. However, I'm totally fine with merging after UIST, just to make sure nothing breaks :)

fl0fischer commented 1 year ago

switch to gym(nasium) v0.28.1 and sb3 v2.0.0a5 (groundbreaking changes, requires retraining of all policies!)