Add optional recursive query look up

michaelStillwell commented 1 year ago

Changes

Added new optional flag to cli.rs --recursive
Updated generate_live to look for the new flag and use read_query_modules_recursive is set to true
Added new read_query_modules_recursive function as a clone of read_query_modules to look for directories and if they are found to run find_queries to locate all of the files
Added new find_queries function to locate all of the files in a directory and if a directory is found, to call itself with that path

This was something I came across when I started using cornucopia for my personal project. I wanted to try to keep my query files as close to where it was going to be used but I wasn't able to find a way to do that as it is. So I made some changes to look into the directories that are found in the given path so you can have a structure like this:

├── src
│   ├── product
│   │   ├── mod.rs
│   │   ├── product.sql
│   ├── user
│   │   ├── mod.rs
│   │   ├── user.sql
│   ├── main.rs

instead of:

├── queries
│   ├── product.sql
│   ├── user.sql
├── src
│   ├── product
│   │   ├── mod.rs
│   ├── user
│   │   ├── mod.rs
│   ├── main.rs

I'm not sure if this was wanted or needed but it fits my use case and works for me so I thought I would try to share it.

jacobsvante commented 1 year ago

I personally think this should be the default (and only way) to read the queries directory.

LouisGariepy commented 1 year ago

I'm on board with recursive query look up.

Before talking about the implementation, I think we should decide if we're keeping the non-recursive version.

There two minor cases where recursive lookup would be unwanted.

Inside your designated query folder, you also have non-cornucopia SQL files in subdirectories. (i.e. picking up sql files that shouldn't be picked up).
Inside your designated query folder, you have large and deep subdirectories. (i.e. tanking performance traversing useless stuff).

I'm not too concerned about these points since its unlikely that a user would have this setup in the first place, but its not entirely impossible that the user would have "regular" SQL files and cornucopia files located in children of the same folder (src/, for example).

A more pressing issue is, how to handle query module name conflicts? With a single directory approach, all files necessarily have distinct names (that's how filesystems work), but with a recursive approach, its possible that two different modules would end up having the same name. For example:

foo/
|--- bar/
|    |--- my_queries.sql
|
|--- baz/
|    |--- qux/
|         |--- my_queries.sql

One way to solve this would be to use the diverging parent directory's name for disambiguating, which would yield bar_my_queries and baz_my_queries. Its kind of tricky to find the diverging parent directory though, especially when the conflicting files could be deeply nested at different levels, and even trickier if there are more than two conflicting files. At least I don't know such an algorithm off the top of my head, but I doubt it'd be nice and simple. Perhaps Path/PathBuf have some built-in methods for this, I really don't know.

Another potential solution is to simply give an error when a duplicate is found. Might feel restrictive and arbitrary for users, but if the error message is good, its probably the easiest solution.

I'd like to resolve these questions before moving forward with this PR, so please let me know what you think :smiley:

jacobsvante commented 1 year ago

@LouisGariepy What about creating sub-modules to reflect the real directory structure that the query files reside in?

I.e. foo/bar/my_queries.sql would appear as:

mod foo {
    mod bar {
        mod my_queries {
            ...
        }
    }
}

michaelStillwell commented 1 year ago

Personally I think using sub-modules would be a solid option for resolving duplicate files. For the issue of "normal" sql files, we could use the validation of searching for --! in the file to determine if the file is a cornucopia file or not. Another idea would be maybe requiring the sql files to have .c.sql extension or something similar to mark it as a cornucopia sql file, but I'm not sure how that would go with backwards compatibility -- legacy sql files would have to changed to be recognized.

LouisGariepy commented 1 year ago

Submodules could be a solution. The work to enable this falls a bit outside of this PR though (since its also closely related to #176 and #186), but with query submodules implemented I'd be in favor of completely replacing the current query lookup with the recursive version.

we could use the validation of searching for --! in the file to determine if the file is a cornucopia file or not. Another idea would be maybe requiring the sql files to have .c.sql extension or something similar to mark it as a cornucopia sql file

I'm not sure which way is best yet, but these are conceivable solutions.

jacobsvante commented 1 year ago

Submodules could be a solution. The work to enable this falls a bit outside of this PR though (since its also closely related to #176 and #186), but with query submodules implemented I'd be in favor of completely replacing the current query lookup with the recursive version.

Sounds like a reasonable approach!

we could use the validation of searching for --! in the file to determine if the file is a cornucopia file or not. Another idea would be maybe requiring the sql files to have .c.sql extension or something similar to mark it as a cornucopia sql file

I'm not sure which way is best yet, but these are conceivable solutions.

At first I wanted to vote for going the --! route, but now that I think about it I can see a point in that (most) cornucopia SQL files are not actually valid SQL because of the contained variables. .c.sql would still be opened as an SQL file. Maybe .cornucopia or .copiasql makes more sense? Then one can associate it with the preferred language mode in the code editor (at least VS code allows us to save the setting).

cornucopia-rs / cornucopia

Add optional recursive query look up #183

Changes