Open bcksl opened 1 year ago
manage_relationship
that kind of thing. So if you had a polymorphic belongs to it would add one belongs to under the hood for each relationship, and synthesize the operations on top of it. Then we may not need to add any additional relationship types, but instead increase the functionality available to things like calculations and add changes to handle union inputs.There's already short_name which could be used for this purpose, however I'm okay with storing the module name in there as long as we...
Although stringifying the module name benefits from guaranteed global uniqueness, I would prefer to decouple the module name from the stored type string. This has benefits for module refactoring, connecting legacy data, hinting for the migration generator etc.
I can't think of any way that someone could execute arbitrary code via this mechanism but hopefully 1 and 2 above make this impossible.
I don't see any data or RCE vulns either way, since this section of the query is built entirely by Ash, and isn't really exposed.
The one place I would see potential issues is if we're directly atomizing a stringified module name from the database, which is a bad idea.
Assuming we support the current functionality of filters etc. later these would simply be duplicated for every subquery.
If we were to add a types
filter for polymorphic relationships, this would similarly only narrow the set of types from the relationship as an intersection operation (and/or subtraction operation, if we allow not :type
), but still wouldn't allow specifying additional arbitrary types, so no data safety issue.
So if you had a polymorphic belongs to it would add one belongs to under the hood for each relationship, and synthesize the operations on top of it.
This is essentially the alternative solution from the end of the original post. It's workable—and it's actually not mutually exclusive to the *_type
solution, whichever road we decide to go down first.
I will say that one major advantage we get from using *_type
is removing a lot of migrations that would be necessary with a logical monomorphic relationship for each type.
From a performance perspective, I suppose that a new union calculation type could be optimized by datalayers such as AshPostgres essentially the same way that a new relationship type could. Since I believe many calculations can be turned into subqueries by AshPostgres, maybe this will work even for chains of polymorphic relationships?
From an ergonomics perspective I have some concerns. I suppose we could add sugar on top of the proposed improvements to alleviate some/all of these, but would you mind writing up how you envision the resulting DSL to eventually look for the end user?
I will say that one major advantage we get from using *_type is removing a lot of migrations that would be necessary with a logical monomorphic relationship for each type.
True, but at the same time we haven't really talked about the guarantees that people would lose without doing it that way (AFAIK there is no way to make a foreign key setup to handle this).
From a performance perspective, I suppose that a new union calculation type could be optimized by datalayers such as AshPostgres essentially the same way that a new relationship type could. Since I believe many calculations can be turned into subqueries by AshPostgres, maybe this will work even for chains of polymorphic relationships?
Yeah, still would need to figure out the details, but we have the capability to do things like that.
From an ergonomics perspective I have some concerns. I suppose we could add sugar on top of the proposed improvements to alleviate some/all of these, but would you mind writing up how you envision the resulting DSL to eventually look for the end user?
It could look the same as we have it currently TBH
relationships do
polymorphic_belongs_to :.... # expands to multiple regular belongs_to relationships under the hood, that we synthesize.
end
Another note:
If we integrate a union
operation into Ash core/the data layer concept, we could leverage that under the hood potentially?
This is a very sticky problem. I'm down to make it work, but its going to take quite some time to iron out all the details.
Agreed, for SQL datalayers specifically, we have much stronger data integrity checks with the column-per-type strategy using foreign key
and the additional "exactly one"/"at most one" constraints outlined at the bottom of the original post.
Regardless of whether we decide to promote poly_belongs_to
to a new relationship type or as belongs_to
under the covers, there are tradeoffs.
Considering that these are the only two standard strategies for polymorphism in relational design that don't require the addition of intermediary tables, would you find it reasonable to eventually implement both and allow the user to choose based on the tradeoffs?
For the column-per-type strategy, do you foresee any issues with reifying the list of types in time to perform the transformation to multiple belongs_to
relationships if we decide to add auto-discovery of some kind and the concrete types are not specified directly in the relationship definition?
I could see a case for being able to choose between the two, yeah. If we do auto discovery, that would likely cause compile time issues, yeah. We'd need the user to keep a module somewhere separate that describes the options, the way we do Ash.Type.Enum
and Ash.Type.NewType
today.
I'm 100% on board with adding a union
calculation type, regardless of whether it is the foundation of some/all of the polymorphic relationships or not.
As you folks are moving towards rewriting the Ash engine using Reactor, I suspect adding the *_type
attribute-based polymorphic relationship strategies as well as a bunch of other things on the roadmap, such as parameterized relationships will become significantly more straightforward. The same applies to potentially making the column-per-type polymorphism strategy a native relationship type in the future.
In the meantime, if you think it might be reasonable to add a union
calculation, we could try to build the column-per-type polymorphic relationships based on that. I guess the union calculation needs to support adding tags with the appropriate value for each subquery?
FWIW, we have the Ash.Type.Union
already, and building a calculation that produces a union in that way is actually relatively straightforward today.
defmodule MyApp.UnionOfThings do
use Ash.Type.NewType, subtype_of: :union, constraints: [
types: [
one: [
type: ResourceOne,
tag: :type,
tag_value: "one"
],
two: [
type: ResourceTwo,
tag: :type,
tag_value: "two"
]
]
]
end
defmodule UnionCalculation do
use Ash.Calculation
@relationships [:one, :two]
def load(query, _, _) do
@relationships
end
def calculate(records, _, _) do
records
|> Enum.map(fn record ->
Enum.find_value(@relationships, fn relationship ->
case Map.get(record, relationship) do
nil -> nil
value -> %Ash.Union{type: relationship, value: value}
end
end)
end)
end
end
There is a small bit more you typically want to do if you are using AshGraphql
, but otherwise that right there is essentially enough.
Great, super straightforward. Even fleshing that out, we basically just have:
defmodule Ash.Calculation.BelongsToUnion do
@opts_schema [
[
name: :relationships
type: {:list, :atom}
required: true
]
]
@impl Ash.Calculation
def init(opts \\ []), do: Spark.OptionsHelpers.validate(opts, @opts_schema)
@impl Ash.Calculation
def load(_query, opts, _context) do
opts[:relationships]
end
@impl Ash.Calculation
def calculate(records, opts, _args) do
records
|> Enum.map(fn record ->
Enum.find_value(opts[:relationships], fn relationship ->
if Map.get(record, relationship) do
%Ash.Union{type: relationship, value: value}
end
end)
end)
end
end
poly_belongs_to
can desugar into:
Ash.Calculation.BelongsToUnion
;Ash.Resource.Relationships.BelongsTo
for each of the types specified;Ash.Type.NewType, subtype_of: :union
.Even ergonomics-wise the user can specify before?/1
for any transformers that would like to add/manipulate poly_*
relationships, which was one of my major concerns for doing things this way. Might be worth considering adding some helpers, a :generated
tag to the monomorphic relationships' DSLs, or something, to assist in determining whether a relationship is stand-alone or generated.
This gets us to reading, and for writing manage_relationship
needs to be extended to recognize poly_belongs_to
relationships and do the right thing. Users can still do whatever they want as far as reading from- and writing to the underlying belongs_to
relationships etc.
I'll note that manage_relationship
needs to enforce the at-most-one constraint as well, i.e. it needs to clear an existing value of another type if there is one when writing to the relationship.
Is there actually more that needs to go into this for poly_belongs_to
?
If not, then the calculations for poly_has_{one,many}
are nearly the same as Ash.Calculation.BelongsToUnion
, except for the fact that they are performing a union of lists rather than single related records, and Ash.Calculation.HasOneUnion
would need to re-perform the sort, if specified, outside the datalayer.
We completely avoid the need for manage_relationship
to care whether the relationship is bidirectionally polymorphic or not with the column-per-type strategy for poly_belongs_to
. We do not, however, avoid the need for manage_relationship
to perform the same kind of constraint enforcement when writing to a poly_belongs_to
through any of the other polymorphic relationship types, which is another argument in favour of the helpers/DSL tag mentioned above.
We could even just tell people that managed relationships don't work with polymorphic relationships, and that they need to roll their own inputs and handlers.
I suppose, but surely being able to both read and write using the tools people are used to is a big part of the draw. Do you see some major points of friction for making manage_relationship
understand union input and the poly_*
relationships?
No issues necessarily, just might be a place to start. I.e read only unions first.
Right on. I'll start with that and see how we go.
I will say that I would really like there to be some helpers a la strip_metadata
that would be able to turn a list of poly_*
Ash.Union
s from/to the underlying structs rather than a map with a :type
key, and some options for code_interface
and Ash.Api
to perform these transformations automatically.
Given frontends and some internals already understand Ash.Union
, the value in basing things on that for everything else is pretty clear.
Honestly I wouldn't suggest trying to automatically (or even w/ an option) pull things out of the union struct TBH. I think it is much better to force consumers to handle the fact that this is a union (and could therefore be one of many different types).
On Mon, Jul 31, 2023 at 8:44 PM, bcksl < @.*** > wrote:
Right on. I'll start with that and see how we go.
I will say that I would really like there to be some helpers a la stripmetadata that would be able to turn a list of poly* Ash.Union s from/to the underlying structs rather than a map with a :type key, and some options for code_interface and Ash.Api to perform these transformations automatically.
Given frontends and some internals already understand Ash.Union , the value in basing things on that for everything else is pretty clear.
— Reply to this email directly, view it on GitHub ( https://github.com/ash-project/ash/issues/563#issuecomment-1659399205 ) , or unsubscribe ( https://github.com/notifications/unsubscribe-auth/ABLVBYYSGTZ3KK6FOXPEYDTXTBGPZANCNFSM6AAAAAAW5Y3Q6Y ). You are receiving this because you commented. Message ID: <ash-project/ash/issues/563/1659399205 @ github. com>
Ah, could you could elaborate a bit on that? From what I understood, Ash.Type.Union
was developed as a convenience for embedded types and Ash.Union
is still
@type ‰__MODULE__{
type: term(),
value: term()
}
This is obviously a better structure if types can be things like :string
or :integer
as well, but if this is a relationship they're all guaranteed to be resource instances.
Ash.Type.Union
produces an %Ash.Union{}
. In general, to support this, we'd need to support a general "calculation union unwrapping" of some kind (because we don't know what calculations are producing unions that can only be resources), which in general I think leads down a path of hiding important information. It's the same reason we retain %Ash.CiString{value: "SoMe StRiNG"}
, so that context isn't thrown away.
Regardless, utilities to unwrap these values could be added later, or could be added into someone's app as a tool, like read() |> MyApp.unwrap_unions() |> ...
I've had polymorphic relationships on my wishlist, and looked into the
Ash.Type.Union
+calculation
approach and thought maybe things could go a bit deeper. Realistically, this would combine well with resource behaviors, which is a topic unto itself.In terms of an efficient
AshPostgres
implementation, it would not be dissimilar to existing relationships, but, in addition topayment_method_id
, there would also be an enumpayment_method_type
to describe the possible types.load
operations follow this pattern:The same transformation is applicable to
many_to_many
relationships, including those that are bidirectionally polymorphic.Elixir is quite a good candidate for this kind of thing, because it is type-agnostic enough to allow you to return heterogeneous lists, but makes it easy to work with them by pattern-matching the struct. More than that, first-class support for polymorphism would be a huge boon to describing a lot of domain models.
Nested loads must be handled with care internally (resource behaviors addresses this), but the engine can complain if it doesn't have enough information to determine whether it can perform a further
load
on the result types, or different loads could be specified for different types, but this starts to get messy quickly.A less-efficient (non-
join
ed) approach when the relationship is polymorphic might be a first stage strategy.In any case, I wanted to open the discussion on this and see what immediate challenges and benefits come to mind.
Other data-layers
Alternative
join
approachIf desired, an alternative approach is to use a column for each type:
Which would require a
constraint check
(forallow_nil?: true
, this would be1 >=
):