neulab / knn-transformers

PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an implementation of kNN-LM and kNN-MT
MIT License

Unsupported operand type(s) in LM evaluation #13

Closed Jing-L97 closed 5 months ago

Jing-L97 commented 5 months ago

Hi, can I ask a question about evaluating kNN-LM and RetoMaton? I used the preprocessed Wikitext-103 datastores and FAISS indexes for gpt-2 and distilgpt-2 (downloaded from the link) and hit the 'unsupported operand type(s)' error in both cases. Could you suggest some possible solutions?

Thank you very much for your kind help!

[Screenshot 2024-03-30 at 20.39.32]

urialon commented 5 months ago

Can you list the content of the directory checkpoints/$MODEL ?

Jing-L97 commented 5 months ago

Thank you very much for the quick reply! Here it is:

[Screenshot 2024-03-30 at 21.24.18]

urialon commented 5 months ago

The model name directory needs to be directly under checkpoints, without a neulab directory between them.

Jing-L97 commented 5 months ago

Thank you very much! I put the model directory directly under checkpoints, but now I get a different error: OSError: distilgpt2-finetuned-wikitext103 is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

[Screenshot 2024-03-30 at 21.46.08]

urialon commented 5 months ago

The model path is incorrect
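
The OSError quoted above names two ways a model string can resolve: as a local folder, or as an identifier on the Hub. As a rough pre-flight check, a minimal Python sketch follows. It assumes the command is run from the repository root, and `describe_model_path` is a hypothetical helper, not part of knn-transformers or transformers. It only inspects the local filesystem and never contacts the Hub.

```python
import os


def describe_model_path(name_or_path: str) -> str:
    """Say how a --model_name_or_path value would be interpreted (hypothetical helper).

    Mirrors the two cases named in the OSError message: an existing local
    folder, or otherwise a string that has to be a valid Hub identifier.
    """
    if os.path.isdir(name_or_path):
        return "local folder"
    return "hub identifier"


if __name__ == "__main__":
    # The three spellings that come up in this thread.
    for candidate in (
        "neulab/distilgpt2-finetuned-wikitext103",
        "distilgpt2-finetuned-wikitext103",
        "checkpoints/distilgpt2-finetuned-wikitext103",
    ):
        print(f"{candidate}: {describe_model_path(candidate)}")
```

If a value is reported as "hub identifier", it has to match a model name that actually exists on the Hub, which the bare name `distilgpt2-finetuned-wikitext103` in the error message did not.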

Jing-L97 commented 5 months ago

Sorry, I'm still very confused about the model path. I changed it to this, but it returned the same error as at the very beginning. Thank you very much for your kind help!

[Screenshot 2024-03-30 at 23.12.58]

urialon commented 5 months ago

This is a bit hard to debug remotely. Please send your complete directory structure, complete command line, error, and stack trace as text, without screenshots.

Jing-L97 commented 5 months ago

Hi urialon, thank you so much for your kind help! Here's the directory structure:

knn-transformers/
|-- CITATION.cff
|-- LICENSE
|-- README.md
|-- __pycache__
|   |-- knnlm.cpython-39.pyc
|   `-- retomaton.cpython-39.pyc
|-- checkpoints
|   |-- distilgpt2-finetuned-wikitext103
|   |   |-- all_results.json
|   |   |-- dstore_gpt2_None_768_keys.npy
|   |   |-- dstore_gpt2_None_768_vals.npy
|   |   |-- eval_results.json
|   |   `-- index_gpt2_None_768.indexed
|   |-- gpt2
|   |   |-- dstore_gpt2_116988150_768_vals.npy
|   |   |-- dstore_gpt2_None_768_vals.npy
|   |   |-- index_gpt2_116988150_768.indexed1
|   |   `-- index_gpt2_None_768.indexed
|   `-- neulab
|       |-- distilgpt2-finetuned-wikitext103
|       |   |-- all_results.json
|       |   |-- dstore_gpt2_None_768_keys.npy
|       |   |-- dstore_gpt2_None_768_vals.npy
|       |   |-- eval_results.json
|       |   `-- index_gpt2_None_768.indexed
|       `-- gpt2-finetuned-wikitext103
|           |-- all_results.json
|           |-- dstore_gpt2_None_768_keys.npy
|           |-- dstore_gpt2_None_768_vals.npy
|           |-- eval_results.json
|           `-- index_gpt2_None_768.indexed
|-- images
|   |-- overview.jpeg
|   |-- wiki.png
|   |-- wiki_distilgpt2.png
|   `-- wiki_gpt2.png
|-- knnlm.py
|-- load_model.py
|-- requirements.txt
|-- retomaton.py
|-- run_clm.py
`-- run_translation.py

Jing-L97 commented 5 months ago

And here's the complete command line for the RetoMaton evaluation (the datastore and the index have already been downloaded):

MODEL=neulab/distilgpt2-finetuned-wikitext103
python -u run_clm.py \
  --model_name_or_path ${MODEL} \
  --dataset_name wikitext --dataset_config_name wikitext-103-raw-v1 \
  --output_dir checkpoints/${MODEL} \
  --do_eval --eval_subset validation \
  --dstore_dir checkpoints/${MODEL} --retomaton

Jing-L97 commented 5 months ago

And here's the error and stack trace after running the commands above. Thank you so much for your kind help!

03/30/2024 19:29:38 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 4distributed training: True, 16-bits training: False
/projectnb/tin-lab/jliu8/.conda/envs/RAG/lib/python3.9/site-packages/accelerate/accelerator.py:432: FutureWarning: Passing the following arguments to Accelerator is deprecated and will be removed in version 1.0 of Accelerate: dict_keys(['dispatch_batches', 'split_batches', 'even_batches', 'use_seedable_sampler']). Please pass an accelerate.DataLoaderConfiguration instead:
dataloader_config = DataLoaderConfiguration(dispatch_batches=None, split_batches=False, even_batches=True, use_seedable_sampler=True)
  warnings.warn(
03/30/2024 19:29:44 - WARNING - accelerate.utils.other - Detected kernel version 4.18.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
03/30/2024 19:29:44 - INFO - retomaton - No member files found in checkpoints/neulab/distilgpt2-finetuned-wikitext103, not using clustering
03/30/2024 19:29:50 - INFO - knnlm - Reading datastore took 6.4537224769592285 s
03/30/2024 19:29:55 - INFO - knnlm - Moving index to GPU took 4.194196462631226 s

Traceback (most recent call last):
  File "/projectnb/tin-lab/jing/knn-transformers/run_clm.py", line 649, in <module>
    main()
  File "/projectnb/tin-lab/jing/knn-transformers/run_clm.py", line 589, in main
    knn_wrapper.break_into(model)
  File "/projectnb/tin-lab/jing/knn-transformers/knnlm.py", line 134, in break_into
    self.reconstruct_index, self.index = self.setup_faiss()
  File "/projectnb/tin-lab/jing/knn-transformers/knnlm.py", line 107, in setup_faiss
    self.vals = np.memmap(f'{keys_vals_prefix}_vals.npy', dtype=np.int32, mode='r',
  File "/projectnb/tin-lab/jliu8/.conda/envs/RAG/lib/python3.9/site-packages/numpy/core/memmap.py", line 249, in __new__
    size *= k
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
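
The TypeError in this trace is consistent with a datastore size of `None` reaching the memory-map call, and the `None` in file names such as `dstore_gpt2_None_768_vals.npy` points the same way. Below is a minimal pure-Python sketch of both effects. The helpers `vals_filename` and `element_count` are hypothetical, and the file-name pattern and the idea that the size is one entry of the requested shape are assumptions drawn from the directory listing and the trace, not from the repository code.

```python
from typing import Optional, Tuple


def vals_filename(model_type: str, dstore_size: Optional[int], dim: int) -> str:
    """Build a name shaped like the files in the directory listing (assumed pattern)."""
    return f"dstore_{model_type}_{dstore_size}_{dim}_vals.npy"


def element_count(shape: Tuple[Optional[int], ...]) -> int:
    """Multiply the shape entries together, as a size computation for a memory map would."""
    size = 1
    for k in shape:
        size *= k  # raises TypeError when k is None
    return size


if __name__ == "__main__":
    # An unset size is formatted into the file name as the literal text "None".
    print(vals_filename("gpt2", None, 768))
    print(vals_filename("gpt2", 116988150, 768))

    # With a real size the product is well defined.
    print(element_count((116988150, 1)))

    # With an unset size the multiplication fails with a TypeError.
    try:
        element_count((None, 1))
    except TypeError as err:
        print(f"TypeError: {err}")
```

Seen this way, the directory layout was not the only thing to check: whether a datastore size is passed on the command line matters too.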

Jing-L97 commented 5 months ago

Hi, I have solved this issue. It turns out that I hadn't specified the datastore size. It would be great if you could add the argument --dstore_size 116988150 to Step 4 as well. Thank you very much for your kind help!
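
The fix reported above can be sketched as a small Python wrapper that reads the datastore size from a downloaded file name and appends `--dstore_size` to the evaluation command from earlier in the thread. The helpers `infer_dstore_size` and `build_eval_args` are hypothetical, and the file-name pattern is an assumption based on names such as `dstore_gpt2_116988150_768_vals.npy` in the directory listing. All flags are the ones quoted in this thread.

```python
import re
import shlex
from pathlib import Path
from typing import List, Optional

# Assumed pattern: dstore_<model type>_<size>_<dimension>_vals.npy
VALS_PATTERN = re.compile(r"^dstore_(?P<model>.+)_(?P<size>\d+)_(?P<dim>\d+)_vals\.npy$")


def infer_dstore_size(dstore_dir: str) -> Optional[int]:
    """Return the size embedded in a *_vals.npy file name, or None if no name carries one."""
    for path in sorted(Path(dstore_dir).glob("dstore_*_vals.npy")):
        match = VALS_PATTERN.match(path.name)
        if match:  # names containing "None" instead of digits do not match
            return int(match.group("size"))
    return None


def build_eval_args(model: str, dstore_dir: str, dstore_size: int) -> List[str]:
    """Assemble the RetoMaton evaluation command from the thread, plus --dstore_size."""
    return [
        "python", "-u", "run_clm.py",
        "--model_name_or_path", model,
        "--dataset_name", "wikitext",
        "--dataset_config_name", "wikitext-103-raw-v1",
        "--output_dir", dstore_dir,
        "--do_eval", "--eval_subset", "validation",
        "--dstore_dir", dstore_dir,
        "--retomaton",
        "--dstore_size", str(dstore_size),
    ]


if __name__ == "__main__":
    model = "neulab/distilgpt2-finetuned-wikitext103"
    dstore_dir = f"checkpoints/{model}"
    # Fall back to the value quoted in the thread if no file name carries a size.
    size = infer_dstore_size(dstore_dir) or 116988150
    print(shlex.join(build_eval_args(model, dstore_dir, size)))
```

The script only prints the command, so it can be inspected before anything is run.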

urialon commented 5 months ago

Thank you for catching this!