cc @alvarobartt
Hi there @elaaaf! So we are indeed implementing the "self-curation" step defined in https://arxiv.org/pdf/2308.06259, which evaluates the quality of an instruction-completion pair, as seen in the self-curation prompt table from the paper.
Also note that the inputs of `InstructionBacktranslation` are both the `instruction` and the `generation`, which are injected into the task's prompt template as `<generated_instruction>` and `output`, respectively.
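To make those inputs and outputs concrete, here's a minimal sketch of using the task standalone, outside of a pipeline; the instruction/generation pair is made up, and it assumes the task exposes the usual `load()` / `process()` step API:

```python
from distilabel.llms import OpenAILLM
from distilabel.steps.tasks import InstructionBacktranslation

# Hypothetical standalone usage; assumes OPENAI_API_KEY is set in the environment.
instruction_backtranslation = InstructionBacktranslation(
    name="instruction_backtranslation",
    llm=OpenAILLM(model="gpt-4"),
)
instruction_backtranslation.load()

# Score a single (instruction, generation) pair; the pair below is illustrative.
result = next(
    instruction_backtranslation.process(
        [
            {
                "instruction": "What is the capital of France?",
                "generation": "The capital of France is Paris.",
            }
        ]
    )
)
# Each row is expected to come back with "score", "reason", and "model_name" added.
```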
Here's the code from the documentation, but with code comments showing the inputs and outputs of each stage, in case that makes it clearer 👍🏻
```python
from distilabel.llms import InferenceEndpointsLLM, OpenAILLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadHubDataset
from distilabel.steps.tasks import InstructionBacktranslation, TextGeneration

with Pipeline(name="self-alignment-with-instruction-backtranslation") as pipeline:
    # inputs: none
    # outputs: instruction
    load_hub_dataset = LoadHubDataset(
        name="load_dataset",
        output_mappings={"prompt": "instruction"},
    )

    # inputs: instruction
    # outputs: generation, generation_model
    text_generation = TextGeneration(
        name="text_generation",
        llm=InferenceEndpointsLLM(
            base_url="<INFERENCE_ENDPOINT_URL>",
            tokenizer_id="argilla/notus-7b-v1",
            model_display_name="argilla/notus-7b-v1",
        ),
        input_batch_size=10,
        output_mappings={"model_name": "generation_model"},
    )

    load_hub_dataset.connect(text_generation)

    # inputs: instruction, generation
    # outputs: score, reason, scoring_model
    instruction_backtranslation = InstructionBacktranslation(
        name="instruction_backtranslation",
        llm=OpenAILLM(model="gpt-4"),
        input_batch_size=10,
        output_mappings={"model_name": "scoring_model"},
    )

    text_generation.connect(instruction_backtranslation)
```
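And in case it helps, here's a hedged sketch of how that pipeline could then be run; the `repo_id`, `split`, and `generation_kwargs` below are placeholders I'm assuming, not values from the docs:

```python
# Illustrative only: dataset and generation parameters are assumptions.
distiset = pipeline.run(
    parameters={
        "load_dataset": {
            "repo_id": "HuggingFaceH4/instruction-dataset",  # hypothetical dataset
            "split": "test",
        },
        "text_generation": {
            "llm": {"generation_kwargs": {"max_new_tokens": 1024, "temperature": 0.7}},
        },
        "instruction_backtranslation": {
            "llm": {"generation_kwargs": {"max_new_tokens": 1024, "temperature": 0.7}},
        },
    },
)
```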
We are also aware that there are parts that are not covered in the distilabel implementation, so are you asking whether we could implement those? Is there anything in particular you're interested in? Just let us know and we can clarify and extend the current implementation, as yes, we're only implementing the self-curation step from this specific paper.
e.g. we implement step 2, but step 1 is missing as of the docs (see the sketch below for how it could be approximated):
1. Self-augment: Generate instructions for unlabelled data, i.e. the web corpus, to produce candidate training data of (instruction, output) pairs for instruction tuning.
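For reference, here's a rough, hypothetical sketch of how that missing self-augmentation step could be approximated with the existing steps; the prompt-wrapping convention and column names are assumptions, and it doesn't use a fine-tuned backward model as the paper does:

```python
from distilabel.llms import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadHubDataset
from distilabel.steps.tasks import TextGeneration

with Pipeline(name="self-augmentation-sketch") as pipeline:
    # Assumes a dataset whose "prompt" column already contains the unlabelled web
    # text wrapped in something like:
    # "Write the instruction that the following text would be the answer to:\n\n<text>"
    load_corpus = LoadHubDataset(
        name="load_corpus",
        output_mappings={"prompt": "instruction"},
    )

    # The model's generation is treated as the candidate instruction.
    backward_generation = TextGeneration(
        name="backward_generation",
        llm=InferenceEndpointsLLM(
            base_url="<INFERENCE_ENDPOINT_URL>",
            tokenizer_id="argilla/notus-7b-v1",
            model_display_name="argilla/notus-7b-v1",
        ),
        output_mappings={"generation": "candidate_instruction"},
    )

    load_corpus.connect(backward_generation)
```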
Thank you so much for the clarification @alvarobartt ! I don't actually need any additional parts implemented at this time. It's just that I spent a couple of hours understanding the code. It would be really helpful if you could update the docs to explicitly state that the current implementation covers only the self-curation step from the paper. It would definitely help anyone reading the page.
Fair enough @elaaaf! We'll do so, as well as implement the remaining steps before the next release, since it can also add value! Thanks for opening the issue; we'll close it once the docs are updated and the integration is extended to cover the whole paper!
I've just fixed the documentation to explain that we only implement the self-curation part @elaaaf. I'll try to find some time in the upcoming weeks to add the full reproduction instead, but for the moment the docs have been clarified! Thanks :)
Which page or section is this issue related to?
https://distilabel.argilla.io/latest/sections/papers/instruction_backtranslation
What are you documenting, or what change are you making in the documentation?
Hello, thank you for your great work. I came across this page, which implements the backtranslation paper, and I found that you use the prompt as an input where it should be the completion. Is this a bug? Or was the code only meant for validating the prompt? If so, it would be best if the documentation stated that it's only for validation.