Open Nsigma-Bill opened 1 year ago
I have a question regarding the function `preprocess` in `datasets/sft_dataset.py`. Line 51 reads:
inpt = [1] + s_tokens + t_tokens + [2]
I am a bit confused about why we add the `target(label)` information to the `input` and do not mask it. To me, this looks like label information leakage.

Could you clarify this a bit? Thanks!
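
To make the question concrete, here is a minimal sketch of the kind of masking I expected to see. It is not the repository's code: `build_example` and `IGNORE_INDEX` are hypothetical names, and I am assuming that `1` and `2` in line 51 are the BOS and EOS token ids and that the loss ignores label positions set to `-100` (the default `ignore_index` of PyTorch's `CrossEntropyLoss`).

```python
# Hypothetical sketch, not the code in datasets/sft_dataset.py.
# Assumption: -100 marks label positions that the loss should skip
# (this is the default ignore_index of torch.nn.CrossEntropyLoss).
IGNORE_INDEX = -100


def build_example(s_tokens, t_tokens, bos_id=1, eos_id=2):
    """Build input ids as in line 51, plus labels with the source part masked.

    Assumption: bos_id=1 and eos_id=2 correspond to the literal 1 and 2
    used in line 51.
    """
    # Same concatenation as line 51: BOS + source + target + EOS.
    input_ids = [bos_id] + s_tokens + t_tokens + [eos_id]
    # Mask BOS and the source tokens; keep the target tokens and EOS.
    labels = [IGNORE_INDEX] * (1 + len(s_tokens)) + t_tokens + [eos_id]
    return input_ids, labels


input_ids, labels = build_example([10, 11, 12], [20, 21])
print(input_ids)
print(labels)
```

With labels like these, the target tokens would still be part of the input sequence, but only the target positions would contribute to the loss. Is something equivalent happening elsewhere in the pipeline?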