coryodaniel / arbor

Ecto elixir adjacency list and tree traversal. Supports Ecto versions 2 and 3.
MIT License
239 stars 26 forks source link

Support dynamic switching of schema prefixes #23

Open devonestes opened 5 years ago

devonestes commented 5 years ago

Howdy!

So right now when we're using Arbor, the fragment/1 call in ancestors/1 and descendants/1 are essentially hard coded at compile time because of the restrictions of fragment/1.

However, it would be really nice if they could use the prefix in the given's struct's :__meta__.prefix field so we're always searching for ancestors and descendants in the same schema as the given struct.

Since all of these strings used in fragment/1 need to be compiled and not interpolated, my initial thought is to allow users to configure a number of prefixes that they want to compile functions for, sort of like this:

for prefix <- opts[:prefixes] do
  def descendants(%{__meta__: %{prefix: unquote(prefix)}} = struct, depth \\ 2_147_483_647) do
    from(
      t in unquote(definition),
      where:
        t.unquote(opts[:primary_key]) in fragment(
          unquote("""
          WITH RECURSIVE #{opts[:tree_name]} AS (
            SELECT #{opts[:primary_key]},
                    0 AS depth
            FROM #{prefix}.#{opts[:table_name]}
            WHERE #{opts[:foreign_key]} = ?
          UNION ALL
            SELECT #{opts[:table_name]}.#{opts[:primary_key]},
                    #{opts[:tree_name]}.depth + 1
            FROM #{prefix}.#{opts[:table_name]}
              JOIN #{opts[:tree_name]}
              ON #{opts[:table_name]}.#{opts[:foreign_key]} = #{opts[:tree_name]}.#{
            opts[:primary_key]
          }
            WHERE #{opts[:tree_name]}.depth + 1 < ?
          )
          SELECT #{opts[:primary_key]} FROM #{opts[:tree_name]}
          """),
          type(^struct.unquote(opts[:primary_key]), unquote(opts[:foreign_key_type])),
          type(^depth, :integer)
        )
    )
  end
end

Then there could be at the end the same definition that you have now with no pattern matching.

So, sound like something that makes sense to you? If so, I'll send along a PR.

coryodaniel commented 5 years ago

fragment/1 does support positional interpolation. I am not positive if postgres supports a relation name/prefix in a prepared statement.

coryodaniel commented 5 years ago

It is not supported by postgres to do that. I'm going to put a comment on slack to get some opinions on how best to handle this from the Ecto team.

Ref: https://github.com/coryodaniel/arbor/issues/12

devonestes commented 5 years ago

@coryodaniel Yeah, I realized while I was implementing the solution I came up with that interpolation wouldn't work. I opened up #24 with a solution that's working for us in production. It requires folks to designate the prefixes that they're going to compile functions for in their configuration, so it's not totally dynamic, but it's at least solving the current issue that I encountered at work which led me down this path.

If there's a different way that the Ecto team can think of to solve this that didn't require listing your prefixes at compile time that would be great!

coryodaniel commented 5 years ago

In your app do you have s fixed number of prefixes with the same schema?

I’m working on a multi-tenant app (prefix per customer) now that is going to need this library soon. I’d like to figure out if there is a way to configure this at run time.

I wonder if there is a route down the raw SQL path without exposing injection opportunities.

Thanks,

Cory O’Daniel

On Feb 28, 2019, at 2:49 AM, Devon Estes notifications@github.com wrote:

@coryodaniel Yeah, I realized while I was implementing the solution I came up with that interpolation wouldn't work. I opened up #24 with a solution that's working for us in production. It requires folks to designate the prefixes that they're going to compile functions for in their configuration, so it's not totally dynamic, but it's at least solving the current issue that I encountered at work which led me down this path.

If there's a different way that the Ecto team can think of to solve this that didn't require listing your prefixes at compile time that would be great!

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

coryodaniel commented 5 years ago

So I might have an idea how this could work.

If the fragment’s string was built at run time, the SQL identifiers could be interpolated into the elixir string and the arguments (record ID) could still use prepared statement placeholders.

This would require the tree functions not being macros, and calling a function on the module to get arbor options at runtime to interpolate into the elixir string.

If we wanted to keep the functions as macros we could add a shadow function that does the elixir interpolation and return it to the macro so the macro doesn’t have to fetch the options from some place on each call.

Update: this doesn't work. Some magic macro stuff happens in Ecto to see if a fragment's string has been interpolated at runtime, and emits a SQL injection warning. :(

coryodaniel commented 5 years ago

The issue is only with the fragment functions, correct? roots, parent, children, and siblings work as expected?

coryodaniel commented 5 years ago

Was trying to be slick... This didnt work. :( Invalid syntax using a query as a prepared statement I suppose. Happens @ (?)

        table_name_with_prefix = Arbor.CTE.prefixed_table_name(
          struct.__meta__, unquote(opts[:source])
        )

        local_opts = unquote(opts)

        query1 = from(table_name_with_prefix,
          select: fragment("?, ?, 0 AS depth", ^local_opts[:primary_key], ^local_opts[:foreign_key]),
          where: ^[
            {local_opts[:primary_key], struct.unquote(opts[:primary_key])}
          ]
        )

        query2 = from(table_name_with_prefix,
          select: fragment("?.?, ?.?, ?.depth + 1",
            ^table_name_with_prefix, ^local_opts[:primary_key],
            ^table_name_with_prefix, ^local_opts[:foreign_key],
            ^local_opts[:tree_name]
          ),
          join: ^local_opts[:tree_name],
          on: fragment("? = ?", ^"#{local_opts[:tree_name]}.#{local_opts[:foreign_key]}", ^"#{table_name_with_prefix}.#{local_opts[:primary_key]}")
        )

        recursive_union = union_all(query1, ^query2)

        from(t in unquote(definition),
          join:
            g in fragment(
              unquote("""
              (
                WITH RECURSIVE #{opts[:tree_name]} AS (?)
                SELECT *
                FROM #{opts[:tree_name]}
              )
              """),
              ^recursive_union
            ),
          on: t.unquote(opts[:primary_key]) == g.unquote(opts[:foreign_key])
        )
devonestes commented 5 years ago

Yes, we do have a fixed number of schemas, which is why #24 works for us.

That PR most likely won't work for a multi-tenant app since (in my experience) most of those tenants are generated dynamically during the execution of the application.

The only way I know of to do this dynamically at runtime is to use something like Ecto.Adapters.SQL.query/4, but that won't give users the composibility that's so nice with the current implementation.

The issue isn't just with the fragments, actually, because when you do from(t in unquote(definition), #...) that will set the prefix if there's a @schema_prefix defined on that schema, and once it's set it cannot be overridden later on. The prefix needs to be set at the first time you use from or join if there's a @schema_prefix on the schema.

coryodaniel commented 5 years ago

Yeah I was down the raw query route, but didnt want to give up composing.

Are you using a fork in your code now? I'd hate to block you.

I'll probably end up merging by the end of this weekend. I'm going to take a swing to see if I can work something crazy up with windows since it seems like thats supported by ecto.

If not I'll merge and see if I can add CTE support to ecto and circle back to the runtime solution later.

I was able to get all of the functions working at runtime except the CTE ones!

devonestes commented 5 years ago

Yeah, we're using that fork for now, so no worries about blocking.

Window functions might work here - good thinking!

coryodaniel commented 5 years ago

https://github.com/elixir-ecto/ecto/pull/2757

CTE support is coming!

I’m going to start a refactor using this branch. Macro API will remain the same.

coryodaniel commented 5 years ago

CTE has landed in dev: https://github.com/elixir-ecto/ecto/pull/2757

glennr commented 4 years ago

Hi folks. Bumping this discussion as I'm interested in how to get this library working with Triplex?

coryodaniel commented 4 years ago

I started a branch to refactor using CTEs, but have had zero time outside the office for work lately.

Note, I’ve only refactored two functions. I can try to finish it up this weekend, depends how babynaps go!

https://github.com/coryodaniel/arbor/pull/31

sveredyuk commented 2 years ago

+1 for this issue. Can I help somehow? My app is SaaS PG schema-based I need this too much