Closed thomas475 closed 4 months ago
Hi thomas,
Sorry for the confusion! In Uni-RLHF, we collected three types of feedback labels in total, with the majority being Comparative and Attribute feedback labels.
The Comparative feedback labels have already been provided in this repository. The Attribute feedback labels were only used for basic experiments in the Uni-RLHF paper, and that work was later expanded into a full paper, AlignDiff. The baseline algorithm named TD3BC+Pref in AlignDiff is the same one used in the Uni-RLHF attribute experiments, so all the Attribute source labels and processing procedures have been open-sourced here.
As for Keypoint feedback, we regret that we only collected a small portion of the labels for preliminary experiments in the appendix. We are currently expanding this into a formal engineering project, and all labels will be open-sourced at that time.
Thank you for your attention. We look forward to working together with the community to improve this.
I will close this issue and pin it to the main page until I have time to update the readme.md. Feel free to reopen it if you have more questions!
Dear authors,
thank you for the outstanding work on this project!
In your paper, you mention that the system was evaluated using comparative, attribute, and keypoint feedback. However, it seems that the labeled data for attribute and keypoint feedback is missing from the repository. Additionally, the code for handling these types of feedback for reward learning isn't included either.
Could you please provide the missing data and code? It would also be great if you could publish the code for evaluative and visual feedback, if available. This would be incredibly helpful for my project, where I am working with your system on the integration of multiple feedback types.
Many thanks!