sigi-cli / sigi

Sigi - a tool for organizing
https://sigi-cli.org/
GNU General Public License v2.0

Storage and Data structures #14

Open stranger-danger-zamu opened 2 years ago

stranger-danger-zamu commented 2 years ago

Data Structs

You have comments about using a std::vec::Vec; from what I know, it's a perfectly serviceable stack. You could spring for std::collections::VecDeque instead so you can properly cycle (i.e. move the topmost item to the bottom) instead of swapping the first and second items.
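A minimal sketch of the cycling idea, assuming the top of the stack is the back of the deque. The `cycle` helper is a hypothetical name for illustration, not something taken from Sigi's code:

```rust
use std::collections::VecDeque;

/// Move the top item to the bottom of the stack.
/// Assumption: the top of the stack is the back of the deque.
fn cycle<T>(stack: &mut VecDeque<T>) {
    // rotate_right panics if the count exceeds the length, so guard short stacks.
    if stack.len() > 1 {
        // Rotating right by one moves the last element to the front.
        stack.rotate_right(1);
    }
}

fn main() {
    let mut stack: VecDeque<&str> = VecDeque::new();
    stack.push_back("first");
    stack.push_back("second");
    stack.push_back("third"); // top of the stack

    cycle(&mut stack);

    // "third" is now at the bottom and "second" is the new top.
    println!("{:?}", stack);
}
```

With a plain Vec the same move means removing the last item and inserting it at index 0, which shifts every other element; VecDeque is a ring buffer, so a rotation by one only moves a single element.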

You could also spring for a std::collections::BTreeMap where the key would be the position in the stack, and the BTreeMap would naturally keep the items sorted for you. Moving items within the stack becomes more expensive unless it is a limited operation (i.e. swapping the first two items, or cycling the top to the bottom, since you can just get the bottommost key and "decrement" from there). But everything is a trade-off.
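A rough sketch of the position-keyed idea, assuming signed integer keys where the largest key is the top of the stack. `PositionedStack` and its methods are hypothetical names made up for this example:

```rust
use std::collections::BTreeMap;

/// Hypothetical stack keyed by position; the largest key is the top.
struct PositionedStack {
    items: BTreeMap<i64, String>,
}

impl PositionedStack {
    fn new() -> Self {
        PositionedStack { items: BTreeMap::new() }
    }

    /// Push onto the top using one more than the current largest key.
    fn push(&mut self, item: &str) {
        let next = self.items.keys().next_back().map_or(0, |k| k + 1);
        self.items.insert(next, item.to_string());
    }

    /// Pop the item stored under the largest key.
    fn pop(&mut self) -> Option<String> {
        let top = *self.items.keys().next_back()?;
        self.items.remove(&top)
    }

    /// Cycle the top item to the bottom by re-inserting it one below
    /// the smallest remaining key (the "decrement" mentioned above).
    fn cycle(&mut self) {
        if self.items.len() < 2 {
            return;
        }
        let top = *self.items.keys().next_back().unwrap();
        let item = self.items.remove(&top).unwrap();
        let bottom = *self.items.keys().next().unwrap();
        self.items.insert(bottom - 1, item);
    }

    /// Items in order from bottom to top.
    fn to_vec(&self) -> Vec<String> {
        self.items.values().cloned().collect()
    }
}

fn main() {
    let mut stack = PositionedStack::new();
    stack.push("a");
    stack.push("b");
    stack.push("c");
    stack.cycle();
    println!("{:?}", stack.to_vec());
    println!("{:?}", stack.pop());
}
```

Cycling and pushing only touch the ends of the key range, so they stay cheap. Moving an item into the middle would require renumbering neighbouring keys, which is the extra cost noted above.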


Storage

Currently you store every item as a JSON object in a JSON array and write that to disk. There was some comment about scaling beyond 10k items (or stacks, I can't remember), but that might just be disk access. You could glue everything into one big JSON object and only read and/or write once per CLI invocation. Or you could just glue the related stacks together, since you don't need to deal with cross-stack transfers outside of related stacks.

For anything more robust, you are probably best off using SQLite. Note that you don't have to do anything fancy for the schema; you could just use a single key-value table where the key would be "equivalent" to the file path and the value would be the content. You do get transactions, so concurrent access is viable if you set journal_mode to WAL.

booniepepper commented 2 years ago

Thanks again for the interest in my little project!

For data structures and storage, I've been playing around with ways to represent a stack-based database over in my https://github.com/hiljusti/kamajii project (proofs of concept so far, although several of them already do the core push/pop/list functionality). The idea I'm arriving at over there is to have a daemon running that can serve as both a persistence layer and a memory cache. Using SQLite is something I've definitely considered, but I have a broader vision for what a stack-based database paradigm can do. (Although Sigi is definitely my first candidate for a client.)

Once I make more progress over there, I think instead of loading entire stacks into memory and manipulating them, I'll do simple transactions for actions like create, fire-and-forget for something like next that needs to deeply traverse, and use IO streams for anything large like list.

This should also open up the possibility of decentralized data. (E.g. I could run a server for my lists, and use Sigi from my laptop, smartphone, other devices, etc. and have them all sync.)

I'm open to feedback here though, I'm not sure if I want to close just yet.

stranger-danger-zamu commented 2 years ago

I think that having a data persistence interface would be great for Sigi (the organization tool). It'd allow you to separate the data persistence mechanics from the Sigi UI code. The interface for the backends could either expose a repository's stacks and have Sigi operate on the interfaced stacks (e.g. item = pop_from($stack), push_onto($target_stack, item)) or expose higher-level APIs which operate on the repository and ignore the internals of the backend (e.g. pop_push(from=$stack, to=$target_stack)).
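As a sketch of what such an interface might look like in Rust, here is a trait covering both styles. The `Backend` trait, the `MemoryBackend` type, and the method signatures are assumptions made up for illustration (only the names pop_from, push_onto and pop_push come from the comment above); none of this is Sigi's actual API:

```rust
use std::collections::HashMap;

/// Hypothetical persistence interface; not part of Sigi.
trait Backend {
    /// Low-level style: operate on one named stack at a time.
    fn push_onto(&mut self, stack: &str, item: String);
    fn pop_from(&mut self, stack: &str) -> Option<String>;

    /// Higher-level style: move the top item between stacks.
    /// This default composes the low-level calls; a backend with
    /// transactions could override it to make the move atomic.
    /// Returns false when the source stack has nothing to move.
    fn pop_push(&mut self, from: &str, to: &str) -> bool {
        match self.pop_from(from) {
            Some(item) => {
                self.push_onto(to, item);
                true
            }
            None => false,
        }
    }
}

/// In-memory backend, standing in for a file, SQLite, or daemon backend.
#[derive(Default)]
struct MemoryBackend {
    stacks: HashMap<String, Vec<String>>,
}

impl Backend for MemoryBackend {
    fn push_onto(&mut self, stack: &str, item: String) {
        self.stacks.entry(stack.to_string()).or_default().push(item);
    }

    fn pop_from(&mut self, stack: &str) -> Option<String> {
        self.stacks.get_mut(stack)?.pop()
    }
}

fn main() {
    let mut backend = MemoryBackend::default();
    backend.push_onto("todo", "write docs".to_string());
    let moved = backend.pop_push("todo", "done");
    println!("moved: {}, done: {:?}", moved, backend.pop_from("done"));
}
```

The UI code would depend only on the trait, so swapping the in-memory backend for another one would not touch the command handling.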

Supporting other backends such as the local file system or SQLite would be super helpful, since sometimes a user doesn't want to run a server or daemon. Or they are already running other servers and would like to reduce overhead in constrained environments (e.g. reusing a Redis instance on a Raspberry Pi rather than starting another process).

On the other hand, I totally get just sticking to your stack-based database idea. Have you looked at Redis for the server? I'm pretty sure it has most of the functionality you want, and if you want custom operations you can implement them via Lua scripts.