Closed zachdaniel closed 6 years ago
@zachdaniel I am back to thinking that the best solution here is still to use database views. You can get all of the guarantees, including checks on insert/update.
Creating a view is very straight-forward. The only annoying part is that a view is not automatically updated when you add columns to the underlying table, so we need a mechanism to make it easier to update. That can be as simple as putting the views in a separate file that is required on the migration. Some guidance around here is definitely necessary.
@josevalim We use views extensively to share logic between a legacy Rails app and the newer Elixir app, and to handle complex queries that aren't possible or easy with Ecto. All migrations are done on the Rails side (because the app is older and the schema_migrations table is incompatible), but that means we've been able to use scenic to handle view migrations.
Scenic does precisely what you've mentioned here:
I'm sure this could be done manually in Ecto, or with a package, but it would be nice to have support for view creation built in.
As you mention, the view isn't updated when columns are added, which bites people new to views every single time. I'm not sure if there could be a mechanism to detect when table changes would effect a view, but an automated warning would be useful.
The problem with GROUP BY
makes views a deal breaker for us.
@zachdaniel which problem? That you can't define a view with group_by? That shouldn't apply to this issue though, this issue is exclusively about supporting where
.
I mentioned it in a comment above:
There are also some limitations when it comes to using a table vs a view. With a table, postgres knows about primary keys and allows you to only group by the primary key, even when you want additional rows that come with the record.
e.g
SELECT
posts.id, posts.name, posts.author_id, COUNT(comments.id)
AS comment_count
FROM posts
JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id
That above query would work with a table, but it would not work with a view, even if the definition for that view was just SELECT * FROM posts
Basically postgres does you favors when it knows that your grouping by a primary key, but you lose that ability when it's a view.
This issue isn't really the only deal breaker, but it's an example of how views don't serve as perfect stand-ins for tables.
Basically postgres does you favors when it knows that your grouping by a primary key, but you lose that ability when it's a view.
I would not use something so implicit anyway, explicit is always better and that doesn't look very ANSI SQL'y...
But the point is that your queries become resilient to the addition of new columns. If you have a query that is selecting all of the columns of one table, and additionally grouped up data on related rows(a pattern we use all the time), then with a view you have to use GROUP BY every, fieldz, ever
which has actual performance impacts, and requires you to change every query when you add a column.
which has actual performance impacts, and requires you to change every query when you add a column.
Which in my cases would only be in one location. I use a set of builder modules for building queries (I so cannot wait for named joins whooo!).
Well in our case it would be in considerably more places than one location :)
Well in our case it would be in considerably more places than one location :)
Sounds like a redesign is really in order... ^.^;
/me cannot stand to scatter things around all over the place, prefers centralization and module/function APIs
This issue is that redesign. A central set of query logic would have given us no benefit before. Now that we have a requirement for something like this, I would rather have it implemented as an ORM feature than build an interface on top of the existing query language. We have over 70k LoC and it's easy to say that a redesign is necessary without context.
Additionally, @OvermindDL1 by my reading of the ANSI SQL standard for group by + the postgres documentation, it conforms to the standard just fine.
I've been bitten quite a few times by the default_scope
in rails, and although I know the intention here isn't to fully replicate it; but if we provide it people will end up misusing it as such.
If materialized views aren't sufficient, what about doing it the simple way and wrapping the Repo
module or using a macro? That way you could even pattern-match the query if needed, apply your filters, then delegate.
Have you implemented this as a plugin, @zachdaniel?
Creating a mixin module (mixed-in after Ecto.Schema) overriding the function definition of __MODULE__.__schema__(:query)
might solve this problem for you.
@isaacsanders I tried originally, but the problem is that not enough ecto internals use __schema__(:query)
. That is bypassed often.
What could potentially get us most of the way there (I know a widely used open source library needs to consider more than one use case), would be the ability to modify the query for associations. What that would mean for us, is that we can get the usage of assoc
back. Currently we manually write out all the association joins fully, because we need to make sure that the joined table honors our system that we have set up to give each schema its own base query. The system we have is simple, just a behaviour that says that they define a .query()
. If we could define a custom query that would be used by assoc/2
in queries like so:
many_to_many :tags, MyApp.Tag, join_through: "posts_tags", query: MyApp.Tag.query()
Then all we have to remember is to ensure that the from
of our queries, or any schemas explicitly use the .query()
function.
So
from post in MyApp.Post.query(),
join: post_tags in "post_tags",
on: post_tags.post_id == post.id,
join: tag in ^MyApp.Tag.query(),
on: post_tags.tag_id == tag.id
Would become:
from post in MyApp.Post.query(),
join: tag in assoc(post, :tags)
That would be very nice for us!
@zachdaniel we would be happy with supporting this. The only limitation is that the given query
can only have where
clauses and the same source as the one given to the association. Aything else should fail. Would you be interested in tackling this with our guidance?
It's not clear to me, though, if the MyApp.Tag.query()
function would be called at runtime or compile-time. As it is right now, I would expect it to be called at compile-time looking at this code.
@michalmuskala it would be compile-time, yes.
In that case, couldn't we do:
many_to_many :tags, MyApp.Tag.query(), join_through: "posts_tags"
We could extract the struct name from the query.
@michalmuskala oh, actually yes. The way we have structured associations internally is already meant to support things like that. :)
@josevalim the limitation of only allowing where clauses is fine (especially in the short term). I do think that adapaters that support subqueries shouldn't really have an issue supporting joins to associations that have a select statement, but I can advocate for that at a later time :D
Oh, and yes I'm very interested in taking this on!
How do you generally test equality of queries generated via from
? They have information on line numbers and things like that. I could easily nil those out, but that seems a little less than idea. Alternatives could be inspecting and asserting equal, or just asserting the fields about it that I care are equal, but I wanted to get your thoughts on it. I'm looking for other examples currently.
@zachdaniel get the underlying code expressions and call Macro.to_string/1
on it.
@josevalim for specifically this one, I defined the queries in a separate module and then used them in that in the schema, and then when fetching the assoc
I can confirm they are equal via ==
since they both come from the same place, which works for now.
We are closing this as we cannot agree on supporting it. There are also alternatives, such as using views, and #2472 brings this feature to associations, which simplifies things.
The start of the discussion is in the google group here: https://groups.google.com/forum/#!topic/elixir-ecto/dLsumjvNJ2I
The proposal for the interface:
Updates would need to be made to joins, from, and other places to ensure that these where clauses are always added.
I think the way forward there is to first modify everything to use
__schema__(:query)
, and then set it up such that this change alters that query.