Closed: jongjyh closed this issue 1 year ago
Hi there, you should provide evidence and be responsible for what you say. The exact usable_idxs is used to index the activations, so how come they "do not seem to match this index"?
Thanks, KL
Let's take two-fold cross-validation as an example.
First, the activations are computed and saved in the truthfulQA-mc2 order from Huggingface; call this the first set of indices.
Then the data is shuffled during training. Although the same index values are used, they now point to completely different data, so call this the second set of indices:
df = df.sample(frac=1, random_state=args.seed).reset_index(drop=True)  # shuffles the rows and resets the index, so positions no longer match the saved order
The test set is then indices 420-839, and the training and validation sets are indices 0-419. The problem is that the training activations are read from the previously saved .npy file using the original indices, which can cause issues.
For instance, suppose the 1st data point is shuffled to the 450th position under the second set of indices, so it should now be a test data point. However, when we read activations for training we still use index 1 to fetch from the .npy file (even though that question has moved to the test set) and train the probes on it; when we test, we fetch the 450th question, which is exactly the same question as row 1 of the .npy file, and that could lead to leakage. This is my understanding of the code, which may differ from the actual execution. Please correct me if I am wrong, and I will delete this question immediately.
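Here is a minimal, self-contained sketch of the mismatch I mean, using toy data and made-up variable names rather than the repo's actual code (I assume 840 questions and a 420/420 fold split, as above):

import numpy as np
import pandas as pd

# Toy stand-ins: 840 questions, and an activations array saved in the original
# (unshuffled) order, so row i corresponds to question i.
num_q = 840
df = pd.DataFrame({"question": [f"q{i}" for i in range(num_q)]})
activations = np.arange(num_q)

# Shuffle the questions, as happens before the fold split.
df = df.sample(frac=1, random_state=42).reset_index(drop=True)

# Two-fold split on the *shuffled* positions.
train_pos, test_pos = np.arange(0, 420), np.arange(420, 840)

# Probes would be trained on activations[train_pos], i.e. rows 0-419 of the file
# in its original order, while the test questions come from the shuffled df.
train_rows = set(int(r) for r in activations[train_pos])
test_questions = set(int(q[1:]) for q in df.loc[test_pos, "question"])
print("questions in both train activations and test set:", len(train_rows & test_questions))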
Thanks.
Hi, thanks for detailing the problem! I just pushed an update to this repo that sorts the loaded CSV file from the TruthfulQA repo into the same order as the Huggingface dataset, which is the order the features are saved in.
I ran some experiments and the results don't change much, perhaps because there are too few learnable parameters (~6k when K is 48) to overfit.
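Conceptually, the fix amounts to something like the following simplified sketch (the CSV path and column name here are illustrative, not the exact code in the commit): build a question-to-position map from the Huggingface split and sort the CSV rows by it, so that row i of the CSV again matches row i of the saved activations.

import pandas as pd
from datasets import load_dataset

# Order that the activations were saved in (Huggingface multiple_choice split).
hf = load_dataset("truthful_qa", "multiple_choice")["validation"]
order = {q: i for i, q in enumerate(hf["question"])}

# CSV from the TruthfulQA repo; align it to the Huggingface order.
df = pd.read_csv("TruthfulQA.csv")
df["hf_order"] = df["Question"].map(order)
df = df.sort_values("hf_order").reset_index(drop=True)  # row i now matches activations row i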
Congratulations! :)
I have a presumptuous request and hope you can help me. I tried to replicate the results from the paper, but failed to get the reported results with ITI. I basically followed the repo's instructions.
Here is what I did:
# get activations.
CUDA_VISIBLE_DEVICES=3 HF_DATASETS_OFFLINE=1 python3.8 get_activations.py llama_7B tqa_mc2
# validation (2-fold)
model="llama_7B"
head=48     # top-K heads, as in the runs below
alpha=15    # intervention strength, as in the runs below
# $true and $info hold the fine-tuned GPT-judge and GPT-info model names
CUDA_VISIBLE_DEVICES=0 python3 validate_2fold.py $model --num_heads $head --alpha $alpha --device 0 --num_fold 2 --judge_name $true --info_name $info
I got the ITI and baseline (without any intervention) results below:
Name | State | Notes | User | Tags | Created | Runtime | Sweep | activations_dataset | alpha | dataset_name | device | eval | fp16 | info_name | judge_name | model_name | num_fold | num_heads | offline | seed | use_center_of_mass | use_coef | use_prefix | use_random_dir | val_ratio | CE Loss | Info Score | KL wrt Original | MC1 Score | MC2 Score | True Score | True*Info Score
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
llama_7B_seed_42_top_48_heads_alpha_15_fp32 | finished | - | jongjyh | | 2023-07-03T07:05:57.000Z | 595 | | tqa_gen_end_q | 15 | tqa_mc2 | 0 | TRUE | FALSE | curie:ft-personal-2023-06-25-10-39-37 | curie:ft-personal-2023-06-25-11-44-57 | llama_7B | 2 | 48 | TRUE | 42 | FALSE | FALSE | FALSE | FALSE | 0.2 | 2.13329798 | 0.966953713 | 0 | 0.25582782 | 0.405372826 | 0.305992018 | 0.295880118
llama_7B_seed_42_top_48_heads_alpha_15_com_fp32 | finished | - | jongjyh | | 2023-07-02T09:23:13.000Z | 5 | | tqa_gen_end_q | 15 | tqa_mc2 | 0 | TRUE | FALSE | curie:ft-personal-2023-06-25-10-39-37 | curie:ft-personal-2023-06-25-11-44-57 | llama_7B | 2 | 48 | TRUE | 42 | TRUE | FALSE | FALSE | FALSE | 0.2 | 2.400817971 | 0.962048756 | 0.294551133 | 0.272975694 | 0.425765185 | 0.304835443 | 0.293266558
Did I miss anything?
Thanks!
Hi, here is what I get from running my code with the default hyper-parameters, averaged over seeds 1 through 5:
 | True | Info | MC1 | MC2 | CE | KL
---|---|---|---|---|---|---
w/ ITI | 0.4482981 | 0.92875617 | 0.2883893 | 0.45113669 | 2.40703174 | 0.26517357
w/o ITI | 0.31580193 | 0.96695072 | 0.25705031 | 0.40542086 | 2.16346875 | 0
From the information you gave me, it's hard to guess what you might have missed. But anyway, I hope you agree that the data leakage problem has been fixed.
Sure, thank you for the quick follow-up! It's interesting work indeed! :)
Hello, how do you get the results for w/o ITI? Do you manually set interventions = {} in the alt_tqa_evaluate function?
Also, I have another question: how do you save the new model after changing the activation directions?
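For context, here is my guess at what the baseline setup would look like; the names below are illustrative and I have not checked the exact alt_tqa_evaluate signature:

def no_op_intervention(head_output, layer_name):
    # identity: return the attention-head activations unchanged
    return head_output

interventions = {}  # empty dict, i.e. no heads are shifted, so this is the plain model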
Hello, I've recently been trying to reproduce the results from the paper, and while inspecting the code I found a potentially incorrect implementation of cross-validation. Could you please help me verify whether this issue indeed exists?
Replicating the problem:
First, you generate a random index in validate_2fold.py, and then you fetch the activation values from the saved activation file according to this newly generated index.
However, the fetched activation values do not seem to match this index, and the fetch may pull in data from the test set you just split off. I believe this might lead to data leakage.
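A quick way to check the correspondence I am worried about (the file path, column name, and seed here are my assumptions, not necessarily the repo's exact values): compare the shuffled CSV order against the Huggingface order the activations were saved in.

import pandas as pd
from datasets import load_dataset

# Order in which the activations were saved (Huggingface tqa_mc2 split).
hf_questions = load_dataset("truthful_qa", "multiple_choice")["validation"]["question"]

# Order used after the shuffle (seed 42 is just an example).
df = pd.read_csv("TruthfulQA.csv")
df = df.sample(frac=1, random_state=42).reset_index(drop=True)

# If these orders differ, an index into the saved .npy file no longer points at
# the question it was computed for.
mismatches = sum(q_csv != q_hf for q_csv, q_hf in zip(df["Question"], hf_questions))
print(f"{mismatches} positions differ between the shuffled CSV and the saved activation order")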