microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
17.33k stars 1.65k forks

[Issue]: How to use CSV with multiple columns #1025

Open JadeXIN opened 3 weeks ago

JadeXIN commented 3 weeks ago

Describe the issue

I have looked at the source code in csv.py for loading CSV files, and it appears to handle only CSV files with source and text columns. In practice, CSV files often contain many columns. How can this type of CSV file be handled with RAG, and how should the settings.yaml file be modified?

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

PetricaR commented 3 weeks ago

virus?

JadeXIN commented 3 weeks ago

virus?

It seems you are correct... Does anyone know the solution?

natoverse commented 1 week ago

What are you looking to do with the additional columns? If you have multiple fields of text that you want as input to the graph extraction, you can concatenate them in a pre-processing script. But we don't currently support any form of metadata/column-based filtering, so I'm not sure what the use case is.
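
A minimal sketch of such a pre-processing script, using only the Python standard library. It merges several columns into a single text column and optionally carries one column over as source, matching the two columns the issue describes csv.py as handling. The function names (`concat_columns`, `preprocess_csv`) and the example columns (`id`, `title`, `body`, `author`) are hypothetical and are not part of GraphRAG; adjust them to your own data.

```python
import csv
import io


def concat_columns(rows, text_columns, source_column=None, sep="\n"):
    """Yield one dict per row with the chosen columns merged into 'text'.

    All names here are illustrative assumptions, not GraphRAG APIs.
    """
    for row in rows:
        parts = []
        for col in text_columns:
            value = (row.get(col) or "").strip()
            if value:
                # Keep the column name so its meaning survives in the text.
                parts.append(f"{col}: {value}")
        out = {"text": sep.join(parts)}
        if source_column is not None:
            out["source"] = (row.get(source_column) or "").strip()
        yield out


def preprocess_csv(in_file, out_file, text_columns, source_column=None):
    """Read a multi-column CSV and write a CSV with only source/text columns."""
    reader = csv.DictReader(in_file)
    fieldnames = ["source", "text"] if source_column is not None else ["text"]
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    for out in concat_columns(reader, text_columns, source_column):
        writer.writerow(out)


# Example with in-memory data; real use would pass open file handles
# (opened with newline="" as the csv module documentation recommends).
raw = io.StringIO("id,title,body,author\n1,Hello,First post,ann\n2,Bye,,bob\n")
result = io.StringIO()
preprocess_csv(raw, result, text_columns=["title", "body"], source_column="id")
print(result.getvalue())
```

Empty cells are skipped, and columns that are not listed (here `author`) are dropped, so the choice of which columns feed graph extraction is made in this script rather than in settings.yaml.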

github-actions[bot] commented 3 days ago

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

matbee-eth commented 2 days ago

Curious what the point of supporting CSV is if you can't select which columns to use?