HKUDS / LightRAG

"LightRAG: Simple and Fast Retrieval-Augmented Generation"
https://arxiv.org/abs/2410.05779
MIT License
9.62k stars 1.19k forks

extract package reference #201

Closed: solomem closed this issue 6 days ago

solomem commented 3 weeks ago

Could you please provide a reference for the package used in the extract code? Thanks.

Sorry, I meant the Amazon Textract example given:

import textract

file_path = 'TEXT.pdf'
text_content = textract.process(file_path)

rag.insert(text_content.decode('utf-8'))

Is the textract module included anywhere in the code? Many thanks.

LarFii commented 2 weeks ago

I’m not quite sure I understand what you need. Could you describe it in more detail?

WangAo-0 commented 2 weeks ago

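I'm not quite sure I understand what you need. Could you describe it in more detail?

How do I install the textract package? There seems to be a conflict. [image]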

solomem commented 2 weeks ago

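I'm not quite sure I understand what you need. Could you describe it in more detail?

How do I install the textract package? There seems to be a conflict. [image]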

I was thinking the same. The package at https://pypi.org/project/textract/#history is definitely not the one used.

minglai1994 commented 2 weeks ago

I tried pip install textract==1.5.0, and it works fine for me.
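
For anyone hitting the same install confusion, here is a minimal sketch of how the pinned install could be checked and the extraction step guarded before calling rag.insert. The helper names installed_version and extract_text are hypothetical and not part of LightRAG or textract. The sketch assumes textract.process returns bytes, as the decode call in the example above implies, and that rag is the LightRAG instance from that example.

```python
from importlib import metadata


def installed_version(package="textract"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def extract_text(file_path, process=None, encoding="utf-8"):
    """Return the text of `file_path` as a str.

    `process` defaults to textract.process, imported lazily so that a
    missing install produces a clear error. Any callable that takes a
    path and returns bytes or str can be passed instead (hypothetical
    hook, useful if textract cannot be installed).
    """
    if process is None:
        try:
            import textract  # PyPI package, e.g. pip install textract==1.5.0
        except ImportError as exc:
            raise RuntimeError(
                "textract is not installed; try: pip install textract==1.5.0"
            ) from exc
        process = textract.process
    raw = process(file_path)
    # Decode bytes output so the result can be handed to rag.insert as text.
    return raw.decode(encoding) if isinstance(raw, bytes) else raw


# Usage, assuming `rag` is set up as in the example above:
# print(installed_version())            # None means textract is not installed
# rag.insert(extract_text("TEXT.pdf"))
```

Passing a different callable as process lets the same call site work with another extractor if the textract dependencies conflict with the rest of the environment.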