-
I wonder whether torchtune can support traditional tasks such as translation, or more general text-generation tasks that have an input column and an output column. I have read the datasets doc [here](https://…
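If the built-in dataset builders don't cover this case directly, one workaround is to pre-map each row into the input/output pair shape that instruction-style datasets expect. This is only a sketch: the column names (`en`/`de`) and the prompt template are assumptions, not torchtune's actual schema.

```python
# Sketch only: map a translation-style row (source/target columns) into
# an input/output pair. The "en"/"de" column names and the prompt
# template are illustrative assumptions, not torchtune's real schema.
def to_input_output(example: dict) -> dict:
    return {
        "input": f"Translate English to German: {example['en']}",
        "output": example["de"],
    }

rows = [{"en": "Good morning", "de": "Guten Morgen"}]
pairs = [to_input_output(r) for r in rows]
```

The resulting `pairs` list has the same two-column structure that instruct-style fine-tuning pipelines typically consume.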
-
Hi, currently `reward_fn` is independent of the environment class (`mbrl.models.ModelEnv`) and accepts actions and next observations as input. In practice, more general, environment-parameter-dependent re…
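As a concrete sketch of the signature described above: a batched reward function that depends only on actions and next observations could look like the following (the quadratic cost itself is made up for illustration and is not part of mbrl).

```python
import numpy as np

# Illustrative reward_fn with the signature described above: it takes
# batched actions and next observations and returns one reward per row.
# The quadratic cost terms are placeholders, not mbrl's actual code.
def reward_fn(actions: np.ndarray, next_obs: np.ndarray) -> np.ndarray:
    state_cost = (next_obs ** 2).sum(axis=1)        # distance-from-origin penalty
    action_cost = 0.01 * (actions ** 2).sum(axis=1)  # control-effort penalty
    return -(state_cost + action_cost)               # shape: (batch,)
```

The limitation in the issue is that nothing in this signature gives the function access to environment parameters, which is what a more general, environment-aware reward would need.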
-
### Describe the feature
PPO training needs to maintain four models in memory at the same time. The original implementation keeps the reward, actor, critic, and initial models in video RAM simultaneously.
…
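One common mitigation for this four-model footprint, sketched here framework-agnostically, is to park the models that are only needed for scoring (reward and initial/reference models) off-GPU and move each one onto the device only for its forward pass. `DummyModel` below is a placeholder standing in for a real module (e.g. a `torch.nn.Module`).

```python
# Sketch of on-demand offloading: keep a model on CPU and move it to the
# accelerator only while it is needed, freeing VRAM for the actor/critic.
# DummyModel is a placeholder for a real model class with a .to() method.
class DummyModel:
    def __init__(self):
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self

    def __call__(self, batch):
        return [0.0] * len(batch)  # placeholder scores

def score_with_offload(model, batch, device="cuda"):
    model.to(device)   # load onto the GPU only for this call
    scores = model(batch)
    model.to("cpu")    # release VRAM immediately afterwards
    return scores
```

The trade-off is extra host-device transfer time per step in exchange for a smaller peak VRAM footprint.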
-
*This is an excerpt of main thread for RoboSats PRO* https://github.com/Reckless-Satoshi/robosats/issues/177#issuecomment-1289175371
### Toolbar for RoboSats PRO
A simple component with a few but…
-
Select a series of models to be used in the project. They will be fine-tuned and architecturally manipulated (e.g., replacing the last layer for the reward model), and RLHF will be performed on all of them.
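The architectural manipulation mentioned above can be sketched as follows; this is an illustrative pattern, not any specific repository's code: replace a pretrained model's final projection with a scalar value head to obtain a reward model.

```python
import torch
import torch.nn as nn

# Illustrative sketch of the last-layer replacement described above:
# wrap a pretrained trunk (final layer removed) with a new scalar head
# so the network outputs one reward per input instead of logits.
class RewardModel(nn.Module):
    def __init__(self, backbone: nn.Module, hidden_size: int):
        super().__init__()
        self.backbone = backbone                     # trunk, last layer removed
        self.value_head = nn.Linear(hidden_size, 1)  # new scalar reward head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        hidden = self.backbone(x)                    # (batch, hidden_size)
        return self.value_head(hidden).squeeze(-1)   # (batch,) scalar rewards
```

Only the head is new; the backbone keeps its pretrained weights and is fine-tuned during RLHF.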
-
Hi,
Thank you to the LeRobot community for maintaining such a fantastic codebase. My research group and I have greatly benefited from your efforts. In my current project, I am using the repository …
-
Hello, I followed the steps outlined in "InstructVideo (CVPR 2024)". I'm trying to run the evaluation step `bash configs/instructvideo/eval_generate_videos.sh`, but I encounter the error below. I checke…
-
Dear authors,
Thanks for the amazing work. Recently I followed the expert actions that I extracted from the `get_info()` function of the `AlfredThorEnv` class; however, the success rate is only slight…
-
# Why
#### As a
user of `pyCMO`
#### I want
to be able to specify different reward models for my scenarios
#### So that
I can train RL agents
# Acceptance Criteria
#### Given
we currently only expo…
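One possible shape for this feature is a registry of named reward functions that map consecutive scenario states to a scalar. Everything below is hypothetical: the callable signature, the `"score"` state key, and the registry are sketched assumptions, not pyCMO's actual API.

```python
from typing import Callable, Dict

# Hypothetical interface sketch: a reward model is any callable mapping
# (previous_state, new_state) dicts to a float. The "score" key is an
# assumption for illustration, not pyCMO's real state schema.
RewardFn = Callable[[dict, dict], float]

def score_delta(prev_state: dict, new_state: dict) -> float:
    """Reward the change in scenario score between steps."""
    return new_state.get("score", 0.0) - prev_state.get("score", 0.0)

# Users could then select a reward model by name when building a scenario.
REWARD_MODELS: Dict[str, RewardFn] = {"score_delta": score_delta}
```

Registering functions by name keeps the environment code unchanged while letting each scenario pick its own reward model.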
-
I converted a Llama model to NeMo, with model directories as below:
![image](https://github.com/NVIDIA/NeMo-Aligner/assets/6756880/2d36915a-a0ab-4c1a-8d20-0960a7948bdc)
When I tried to load it to train a…