Better TRL parser with YAML config

Correctly overrides YAML config values with command-line arguments. It does this by replacing the dataclass defaults with the values from the YAML file, so the precedence is: dataclass default < YAML config < command-line arguments.
Adds return_remaining_strings.
When return_remaining_strings is False, raises an error if the YAML file contains extra args that are not in the dataclasses.
Removes the need to have a config field in one of your dataclasses (I can revert this if the old behavior is preferred).
Simpler and cleaner than the previous YAML parsing, with no need to merge dataclasses.
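
To illustrate the precedence described above, here is a minimal standalone sketch using only argparse and dataclasses. It is not the code in this PR: TrainArgs and parse_with_config are hypothetical names, the YAML file is assumed to be already loaded into a plain dict, and returning leftover YAML keys alongside unparsed command-line strings is an assumption made for the example.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class TrainArgs:  # hypothetical dataclass, for illustration only
    learning_rate: float = 1e-4
    epochs: int = 1
    output_dir: str = "out"


def parse_with_config(cls, yaml_config, cli_args, return_remaining_strings=False):
    """Hypothetical helper: dataclass default < YAML config < command line.

    Assumes every field of `cls` has a simple type (int, float, str) and a
    default value, and that `yaml_config` is an already-loaded dict.
    """
    parser = argparse.ArgumentParser()
    for f in fields(cls):
        parser.add_argument(f"--{f.name}", type=f.type, default=f.default)

    known = {f.name for f in fields(cls)}
    extra = [k for k in yaml_config if k not in known]
    if extra and not return_remaining_strings:
        raise ValueError(f"Unknown keys in YAML config: {sorted(extra)}")

    # Parser-level defaults override the per-argument (dataclass) defaults,
    # and explicit command-line values override both.
    parser.set_defaults(**{k: v for k, v in yaml_config.items() if k in known})

    namespace, remaining = parser.parse_known_args(cli_args)
    parsed = cls(**vars(namespace))

    if return_remaining_strings:
        # Assumption: unknown YAML keys are reported with the unparsed strings.
        return parsed, remaining + extra
    if remaining:
        raise ValueError(f"Unknown command-line arguments: {remaining}")
    return parsed


# YAML sets learning_rate and epochs; the command line overrides epochs only.
args = parse_with_config(
    TrainArgs,
    yaml_config={"learning_rate": 5e-5, "epochs": 3},
    cli_args=["--epochs", "10"],
)
print(args)
```

Here output_dir keeps its dataclass default, learning_rate comes from the YAML dict, and epochs comes from the command line.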

Fixes #1733

@younesbelkada let me know if you have comments or suggestions!
