nle18 / coref-llms


where is the requirements.txt? #2

Open wuyanbo5210 opened 6 months ago

wuyanbo5210 commented 6 months ago

It seems you forgot to include the requirements.txt file in the code repository, and the code has several issues and cannot run properly. Please fix these bugs as soon as possible so the code runs normally.

nle18 commented 6 months ago

Thanks for the reminder -- I have added the requirements.txt.

In terms of issues, it would be helpful if you pointed out the specific bugs that you are having. I just re-ran the code from my local environment and it worked as expected.

wuyanbo5210 commented 6 months ago

For example, when I run your example using Codellama, the error message shows that the model cannot be found on the Hugging Face Hub: OSError: codellama is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'

Also, the model name given in the code is different from the one mentioned in the paper. How do I switch between models with different numbers of parameters? (screenshots attached)

In addition, it seems that the CoNLL-2012 dataset produced by fast-coref's preprocessing code cannot be used directly with this code. In llm.py, a comment at line 56 marks the handling of max_generated_len as obsolete.

nle18 commented 6 months ago

For open-source models, you would need to download them and set the model_id parameter to the model folder path. For example, you can download Codellama 7B and 34B and set the model_id accordingly.
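
A minimal sketch (not part of this repo) of what that could look like: download Codellama from the Hugging Face Hub once, then point the repo's model_id argument at the resulting local folder. The Hub repo ids and the local directory below are illustrative choices, not values taken from this codebase.

```python
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the model weights once; swap in "codellama/CodeLlama-34b-hf" for the 34B variant.
local_path = snapshot_download(
    repo_id="codellama/CodeLlama-7b-hf",
    local_dir="./models/codellama-7b",  # this folder path is what you pass as model_id
)

# Sanity check: the downloaded folder should load as a causal LM.
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModelForCausalLM.from_pretrained(local_path)
```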

In terms of data processing, there are a few tweaks that we had to make to map the output of fast-coref's preprocessing to our data format. However, you should still be able to modify the codebase to fit whatever input/output format you are experimenting with.
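
For reference, here is a hypothetical sketch of reading fast-coref-style preprocessed jsonlines into a simpler per-document structure. The field names ("doc_key", "sentences", "clusters") are assumptions based on common CoNLL-2012 preprocessing output and may not match fast-coref's actual schema or the tweaks used in this repo.

```python
import json

def load_fastcoref_jsonlines(path):
    """Read fast-coref-style preprocessed jsonlines.

    Assumed (not guaranteed) schema per line:
      "doc_key":   document identifier
      "sentences": list of subtoken segments
      "clusters":  list of clusters, each a list of [start, end] subtoken spans
    """
    docs = []
    with open(path) as f:
        for line in f:
            doc = json.loads(line)
            # Flatten the segmented subtokens into one token sequence.
            subtokens = [tok for segment in doc["sentences"] for tok in segment]
            # Pull out the surface string of every mention so it can be fed
            # into whatever prompt/output format the LLM code expects.
            clusters = [
                [" ".join(subtokens[start : end + 1]) for start, end in cluster]
                for cluster in doc["clusters"]
            ]
            docs.append({"doc_key": doc.get("doc_key"),
                         "tokens": subtokens,
                         "clusters": clusters})
    return docs
```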

wuyanbo5210 commented 5 months ago

> For open-source models, you would need to download them and set the model_id parameter to the model folder path. For example, you can download Codellama 7B and 34B and set the model_id accordingly.
>
> In terms of data processing, there are a few tweaks that we had to make to map the output of fast-coref's preprocessing to our data format. However, you should still be able to modify the codebase to fit whatever input/output format you are experimenting with.

What are the data-processing tweaks? Could you explain them in more detail or provide the code?