whyhow-ai / knowledge-table

Knowledge Table is an open-source package designed to simplify extracting and exploring structured data from unstructured documents.
MIT License
337 stars, 47 forks

[FEATURE] Add in RDB (probably Postgres) #22

Open tomsmoker opened 1 month ago

tomsmoker commented 1 month ago

What

Add a DB to the project to store each knowledge table.

Why

We currently store everything in memory for ease of use. That works fine, but for larger workloads or analytical use cases it would be good to have persistent backend storage.

Implementation guidance

Add a DB to the Dockerfile (probably Postgres), add an ORM (probably Tortoise), and add some migrations (probably Alembic).

tomthebuzz commented 1 month ago

Just from my experience, I would also look at SQLAlchemy, as it has more native support for Alembic. If you choose Tortoise, you will likely want to look at Aerich for the migrations part. I have found that while SQLAlchemy has a somewhat steeper learning curve, it pays off well once you need the more advanced features.
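For illustration, here is a minimal sketch of what SQLAlchemy-backed storage for a knowledge table could look like. The model names (`KnowledgeTable`, `Cell`), their columns, and the helpers `save_table` and `load_cells` are all hypothetical and not taken from the project. An in-memory SQLite URL stands in for Postgres so the snippet runs without a database server; a real setup would presumably point `create_engine` at a Postgres URL and manage the schema with Alembic rather than `create_all`.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class KnowledgeTable(Base):
    """Hypothetical model: one stored knowledge table."""

    __tablename__ = "knowledge_table"

    id = Column(Integer, primary_key=True)
    name = Column(String(255), nullable=False, unique=True)
    # Deleting a table also deletes its cells.
    cells = relationship("Cell", back_populates="table", cascade="all, delete-orphan")


class Cell(Base):
    """Hypothetical model: one extracted value for a (document, column) pair."""

    __tablename__ = "cell"

    id = Column(Integer, primary_key=True)
    table_id = Column(Integer, ForeignKey("knowledge_table.id"), nullable=False)
    document = Column(String(255), nullable=False)
    column_name = Column(String(255), nullable=False)
    value = Column(Text)
    table = relationship("KnowledgeTable", back_populates="cells")


def save_table(session, name, rows):
    """Persist a table from (document, column_name, value) tuples; return its id."""
    table = KnowledgeTable(name=name)
    for document, column_name, value in rows:
        table.cells.append(Cell(document=document, column_name=column_name, value=value))
    session.add(table)
    session.commit()
    return table.id


def load_cells(session, name):
    """Return the cells of the named table as tuples, in insertion order."""
    stmt = (
        select(Cell.document, Cell.column_name, Cell.value)
        .join(Cell.table)
        .where(KnowledgeTable.name == name)
        .order_by(Cell.id)
    )
    return [tuple(row) for row in session.execute(stmt)]


if __name__ == "__main__":
    # Assumption: SQLite in memory as a stand-in for Postgres.
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        save_table(session, "contracts", [("a.pdf", "party", "Acme Corp")])
        print(load_cells(session, "contracts"))
```

The one-row-per-cell layout is only one possible design; storing each table as a single JSON document would be simpler, but harder to query analytically.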

Just my 5 pcs.

tomsmoker commented 1 month ago

Thanks @tomthebuzz. I planned to use Aerich and Tortoise, as I've used them for a Postgres setup in the past. Another benefit of Postgres is the pgvector extension. Can you help me understand which advanced features SQLAlchemy offers? I've used it before in a Flask project, but I'm not sure I've ever used it to its full extent, and it would be great to know how it can be extended.