THUDM / LongWriter

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Apache License 2.0
1.02k stars 85 forks source link

agentwriter plan.py missing file instructions.jsonl #23

Closed toninog closed 2 weeks ago

toninog commented 2 weeks ago

System Info / 系統信息

Python 3.11.4, CUDA Version: 12.4, transformers Version: 4.44.0

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

Reproduction / 复现过程

steps to reproduce

  1. cd agentwrite
  2. python3 plan.py

Traceback (most recent call last): File "~/LongWriter/agentwrite/plan.py", line 91, in with open(in_file, encoding='utf-8') as f: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ FileNotFoundError: [Errno 2] No such file or directory: 'instructions.jsonl'

Expected behavior / 期待表现

the plan to be created

bys0318 commented 2 weeks ago

instructions.jsonl has to be a file that contains all your prompts (inputs) that are ready to use agentwrite to obtain the long-output responses. Format each line as {"prompt": "xxx", ...}.

toninog commented 2 weeks ago

Thanks!

ameza13 commented 2 weeks ago

Hello @bys0318 , could you please share yourinstructions.jsonl file?

Thanks,

bys0318 commented 1 week ago

Hi, you can download the instruction data file of LongWriter-6k from hf. Delete the generated response from it, and use our agentwrite code to get the long-output responses. Feel free to optimize the pipeline for your use case!

toninog commented 1 week ago

here is my example of the instructions.jsonl

{ "prompt": "Write a 10000-word essay about the roman empire"}