OpenSPG / KAG

KAG is a logical form-guided reasoning and retrieval framework based on the OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge bases. It can effectively overcome the shortcomings of the traditional RAG vector similarity calculation model.
https://spg.openkg.cn/en-US
Apache License 2.0

Creating a task with CSV data fails after installing from the product (for general users) image #42

Closed hugoWLPeng closed 2 weeks ago

hugoWLPeng commented 2 weeks ago

An error occurs when creating a task. The task uses the data file kag\examples\supplychain\builder\data\Company.csv. The error output is as follows:

Read document
2024-11-11 18:54:37: Start reading document...
2024-11-11 18:54:37: Read document complete bytes:56

Split document
2024-11-11 18:54:37: Start split document...
2024-11-11 18:54:37: invoke chunk operator:CSVReader
2024-11-11 18:54:37: execute error:java.util.concurrent.ExecutionException: pemja.core.PythonException: <class 'KeyError'>: ('content',)
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
    at com.antgroup.openspg.builder.runner.local.LocalBuilderRunner.failFast(LocalBuilderRunner.java:149)
    at com.antgroup.openspg.builder.runner.local.LocalBuilderRunner.execute(LocalBuilderRunner.java:124)
    at com.antgroup.openspgapp.core.reasoner.service.impl.TaskRunner$AutoSchemaTask.call(TaskRunner.java:189)
    at com.antgroup.openspgapp.core.reasoner.service.impl.TaskRunner$AutoSchemaTask.call(TaskRunner.java:157)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: pemja.core.PythonException: <class 'KeyError'>: ('content',)
    at /openspg_venv/lib/python3.8/site-packages/kag/builder/component/base._handle(base.py:80)
    at /openspg_venv/lib/python3.8/site-packages/kag/builder/component/reader/csv_reader.invoke(csv_reader.py:83)
    at pemja.core.PythonInterpreter.invokeMethodOneArgString(Native Method)
    at pemja.core.PythonInterpreter.invokeMethodOneArg(PythonInterpreter.java:212)
    at pemja.core.PythonInterpreter.invokeMethod(PythonInterpreter.java:116)
    at com.antgroup.openspg.builder.core.physical.process.ParagraphSplitProcessor.readFile(ParagraphSplitProcessor.java:157)
    at com.antgroup.openspg.builder.core.physical.process.ParagraphSplitProcessor.process(ParagraphSplitProcessor.java:65)
    at com.antgroup.openspg.builder.core.runtime.impl.DefaultBuilderExecutor.processRecursively(DefaultBuilderExecutor.java:72)
    at com.antgroup.openspg.builder.core.runtime.impl.DefaultBuilderExecutor.eval(DefaultBuilderExecutor.java:62)
    at com.antgroup.openspg.builder.runner.local.LocalBuilderRunner.lambda$execute$0(LocalBuilderRunner.java:106)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
    ... 3 more

andylau-55 commented 2 weeks ago

The CSV file should have id, name, and content columns. Please refer to: https://openspg.yuque.com/ndx6g9/0.5/bv9zc3gyi98k0oyx#Vd0Wy
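A quick way to check a file before uploading it is to inspect its header row. The following is only a minimal sketch using Python's standard csv module; the helper name missing_columns and the REQUIRED_COLUMNS constant are hypothetical and are not part of KAG or OpenSPG. It assumes the three column names given above are matched exactly.

```python
import csv
import io

# Column names taken from the comment above; the constant itself is hypothetical.
REQUIRED_COLUMNS = ("id", "name", "content")


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # fieldnames is None when the input has no header row at all.
    header = reader.fieldnames or []
    present = {name.strip() for name in header}
    return [col for col in required if col not in present]
```

To check a file on disk, read it as text first, for example missing_columns(open(path, encoding="utf-8").read()). An empty list means all three columns are present; a result that includes "content" would be consistent with the KeyError: ('content',) in the report above.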

hugoWLPeng commented 2 weeks ago

Thank you, the issue was resolved after the modification. However, this setup seems to limit usage. Also, the data for this demo is provided directly in the source code. Is there any particular reason for this design choice?

andylau-55 commented 2 weeks ago

If you want to extract from free-form documents, you can use txt, md, or pdf files. The CSV in the demo is there to demonstrate building from structured data with the kag command. The product does not currently support structured construction; we plan to release that capability in the next version.

hugoWLPeng commented 2 weeks ago

Thank you.