HossamAmer12 opened this issue 1 month ago
Hi Hossam, Thank you for your interest in our work.
I believe that you need to rebuild the KNN datastore specifically for distill-GPT. Have you done that?
Best, Uri
Thanks @urialon for getting back.
The model I was using in my previous message (sorry, I edited my post above) is the one given in the repo. Even so, the scores are different.
Based on your suggestion, I tried building the datastore myself, but every time I hit this error: UserScriptFilledDisk: User script filled the disk. Consider using Virtual Machine SKU with larger disk size.
That's the command I used for building the datastore:
MODEL=neulab/distilgpt2-finetuned-wikitext103
path_to=""
CUDA_VISIBLE_DEVICES=0 python -u run_clm.py \
--model_name_or_path ${MODEL} \
--dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
--do_eval --eval_subset train \
--output_dir $path_to/checkpoints/${MODEL} \
--dstore_dir $path_to/checkpoints/${MODEL} \
--save_knnlm_dstore --dstore_size 116988150
Does it require that much disk space?
A few questions: Do I have to specify the dstore size here? What does the dstore size indicate, the number of contexts? Also, when running kNN-LM with the given distilGPT2 model, should I use a specific temperature or lambda? I saw you post about this, and I want to make sure I can reproduce the reported scores.
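As a rough sanity check on the disk question, here is the back-of-the-envelope estimate I did (a sketch only; I am assuming one float16 key per token with the model's hidden size, 768 for distilgpt2, and one 32-bit value per token, which may not match the repo's exact memmap layout):

# Back-of-the-envelope datastore size (assumed layout, not verified against the repo):
#   - one key per token, stored as float16 of the model's hidden size (768 for distilgpt2)
#   - one value per token, stored as a 32-bit token id
dstore_size = 116_988_150        # the value passed to --dstore_size above
hidden_size = 768                # distilgpt2 hidden size

key_bytes = dstore_size * hidden_size * 2   # float16 keys
val_bytes = dstore_size * 4                 # int32 values
print(f"keys: {key_bytes / 1e9:.1f} GB, vals: {val_bytes / 1e9:.2f} GB")
# -> keys: ~179.7 GB, vals: ~0.47 GB (before any nearest-neighbor index is built on top)

If that is roughly right, the training-set datastore alone needs on the order of 180 GB of free disk, which would explain the UserScriptFilledDisk error on my VM.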
Just to update on the issue: the following command did not run into the disk-size issue:
MODEL=neulab/distilgpt2-finetuned-wikitext103
CUDA_VISIBLE_DEVICES=0 python -u run_clm.py \
--model_name_or_path ${MODEL} \
--dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
--do_eval --eval_subset validation \
--output_dir ${path}/checkpoints/${MODEL}_SAVE0 \
--dstore_dir ${path}/checkpoints/${MODEL}_SAVE0 \
--save_knnlm_dstore --dstore_size 116988150
I guess that's due to the small size of the validation split (I know that's not a realistic setup). Do you know the size of the training set and how much disk space its datastore would need?
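In the meantime, this is how I tried to estimate the training-split token count myself (a sketch; I am assuming --dstore_size should roughly equal the number of training tokens, and the exact count may differ slightly from what run_clm.py produces after grouping/chunking the texts):

from datasets import load_dataset
from transformers import AutoTokenizer

# Count tokens in the wikitext-103 training split with the model's own tokenizer.
# Note: run_clm.py concatenates and chunks the texts, so the effective datastore
# size may differ slightly from this raw count.
tokenizer = AutoTokenizer.from_pretrained("neulab/distilgpt2-finetuned-wikitext103")
train = load_dataset("wikitext", "wikitext-103-raw-v1", split="train")

texts = train["text"]
total_tokens = 0
batch = 1000
for i in range(0, len(texts), batch):
    enc = tokenizer(texts[i : i + batch])
    total_tokens += sum(len(ids) for ids in enc["input_ids"])

print(f"training split: ~{total_tokens:,} tokens")

Tokenizing the full split takes a while, but it at least gives a ballpark to compare against the 116,988,150 value I passed above.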
Hi Uri,
I tried to construct the datastore with the wikitext validation set and the given distilGPT2 model, and then ran the kNN evaluation on that same set. The final perplexity scores are not good relative to the baseline.
What could be the problem?
Even though the setup is not realistic, I expected the perplexity to be much better, given that the datastore and the eval set are identical.
This is, of course, because I still cannot use the training set for the kNN datastore due to the memory/disk problem, which I have not yet figured out.
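For my own understanding, this is how I think the --knn evaluation combines the datastore with the base LM (a sketch of the standard kNN-LM interpolation; I am assuming the repo follows it, and I do not know its default lambda/temperature for distilGPT2):

import numpy as np

def knnlm_probs(p_lm, neighbor_dists, neighbor_vals, vocab_size, lmbda=0.25, temperature=1.0):
    """Sketch of the standard kNN-LM interpolation (not the repo's actual code).

    p_lm:            (vocab_size,) next-token distribution from the base LM
    neighbor_dists:  (k,) distances from the query to the retrieved keys
    neighbor_vals:   (k,) token ids stored alongside those keys
    """
    # Softmax over negative distances: closer neighbors get more weight.
    logits = -np.asarray(neighbor_dists, dtype=np.float64) / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Scatter the neighbor weights onto the vocabulary, summing duplicates.
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, np.asarray(neighbor_vals), weights)

    # Interpolate: lambda controls how much the datastore overrides the base LM.
    return lmbda * p_knn + (1.0 - lmbda) * p_lm

If lambda is small, the retrieved neighbors barely change the final distribution even when the eval text is literally in the datastore, which might explain why I see no improvement; that is also why I asked earlier about the right lambda/temperature values.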
I kindly ask for your helpful advice.
Thanks, Hossam
I just replied to you in a different thread, let me know if anything is still unclear.
I am trying to build on your knn-transformers repo.
When I run distilGPT2 with the setup given in the repo but with the --knn flag, I get around 21.xx perplexity. This number is different from the one reported in the repository.
I am able to reproduce the other numbers (baseline + RetoMaton) for distilGPT2.
Could you please let me know if you have any clue here?