microsoft / graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
https://microsoft.github.io/graphrag/
MIT License
18.74k stars, 1.83k forks

Documentation request - add links to sample CSV / step to convert text to input CSV #133

Open erjadi opened 6 months ago

erjadi commented 6 months ago

Apologies if this is just my lack of understanding, but going through the getting started tutorial, there seems to be a step missing?

We download a book from Project Gutenberg in text format, and then we start the indexer. However, the indexer expects CSV files in the input folder, and we only have the book's .txt file.

I checked the dulce.csv file that is also in the repo to transform my input into something acceptable, but I think either links to sample CSV files or a documented step for converting text to input CSV would help people who are starting out.
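For readers in the same position, here is a minimal sketch of the kind of text-to-CSV step being asked for. It is not taken from the GraphRAG docs: the function name `text_to_csv`, the `chunk_chars` parameter, and the column names `id`, `title`, and `text` are all assumptions chosen for illustration, so compare the output against dulce.csv and adjust the columns to whatever your indexer config actually expects.

```python
import csv
import re
from pathlib import Path


def text_to_csv(txt_path, csv_path, chunk_chars=2000):
    """Split a plain-text file into paragraph-aligned chunks and write one CSV row per chunk.

    The columns ("id", "title", "text") are hypothetical, not a confirmed
    GraphRAG input schema.
    """
    txt_path = Path(txt_path)
    raw = txt_path.read_text(encoding="utf-8")

    # Blank lines mark paragraph boundaries; collapse line wraps inside a paragraph.
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", raw) if p.strip()]

    chunks, current = [], ""
    for para in paragraphs:
        # Close the current chunk once adding this paragraph would exceed the budget.
        if current and len(current) + len(para) + 1 > chunk_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current} {para}".strip()
    if current:
        chunks.append(current)

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "title", "text"])
        writer.writeheader()
        for i, chunk in enumerate(chunks):
            writer.writerow({"id": i, "title": txt_path.stem, "text": chunk})
    return len(chunks)
```

Calling `text_to_csv("book.txt", "input/book.csv")` would write one row per chunk of roughly 2000 characters (both paths are placeholders). Chunking on paragraph boundaries is only one possible choice; a single paragraph longer than the budget is kept whole rather than split.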

chiragshah285 commented 4 months ago

Totally agree here, in the same boat

natoverse commented 3 months ago

For some time there was a bug in the config such that the Gutenberg txt wasn't working. This has been fixed. So I think this can probably be closed, but I'll give it a few days to see if folks chime in with a similar issue.

DOliana commented 1 week ago

I'll chime in here then @natoverse :-) A lot of the examples in the /examples folder expect a directory /examples/_sample_data to load data from. This directory does not exist, though. The code seems to expect CSV files, which I couldn't find in the repository, so I assume the data is also missing and might be the same as what was mentioned above.
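To see which examples are affected in a local checkout, a small script along these lines could help. It is only a sketch: the helper name `find_missing_sample_data` and the set of file suffixes it scans are assumptions, and it only does a plain string search for `_sample_data` rather than understanding how each example loads its data.

```python
from pathlib import Path


def find_missing_sample_data(examples_dir, marker="_sample_data"):
    """List example files that mention `marker` when that directory is absent.

    Returns an empty list if examples_dir/marker exists. The scanned suffixes
    are a guess at what the examples contain.
    """
    examples_dir = Path(examples_dir)
    if (examples_dir / marker).is_dir():
        return []

    hits = []
    for path in sorted(examples_dir.rglob("*")):
        if path.is_file() and path.suffix in {".py", ".yml", ".yaml"}:
            # Plain substring match; ignore undecodable bytes instead of failing.
            if marker in path.read_text(encoding="utf-8", errors="ignore"):
                hits.append(path.relative_to(examples_dir).as_posix())
    return hits
```

Running `find_missing_sample_data("examples")` from the repository root returns the relative paths of files that reference the missing directory, which gives a concrete list to attach to a report like this one.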