Closed: bitwise-aiden closed this issue 4 months ago.
Another option may be to have scopes for when the compilers are run, the main three being Tapioca -> Gems -> Project. I have yet to think of an instance where it is beneficial to run a compiler before another, so having this hierarchy could solve the problem just as well. The reasoning being that:
With this setup we can make sure that if a project wanted to override the signatures of methods generated in either of the first two then it would be more than capable of doing so.
For the different signature use case is it enough to run custom compilers after default ones?
> For the different signature use case is it enough to run custom compilers after default ones?
Yes, I think providing the way to run them after would likely hit 99+% of use cases. Separating them out would hopefully fix that.
I like your third idea -- seems like the least amount of work and covers the most use cases!
> I like your third idea -- seems like the least amount of work and covers the most use cases!
Me too, feels like the best approach regardless.
I'd like to question the premise of the issue, though. What is a real life example where 2 compilers want to type the same method in different ways?
I'd be wary of solutionizing before we understand if the problem is really a problem.
The main reason I have is for a compiler local to a project needing to override the result of the compiler in Tapioca because there are some additional special cases inserted by the project that aren't known / should not be handled by Tapioca.
As a concrete example: overriding something like the `#find` method of ActiveRecord to take an extra type that isn't taken by default. What approach would you take for addressing that?
This should allow you to solve that: https://github.com/Shopify/tapioca/pull/1967
I also added a DSL CLI test that shows how to use the extensibility point inside a custom "compiler".
Although it fixes this one example, I do think that there is very likely a need for others to override behaviours. I think the layered tapioca > gems > project ordering is probably a good quality of life feature so fixes don't have to be upstreamed.
> Although it fixes this one example, I do think that there is very likely a need for others to override behaviours. I think the layered tapioca > gems > project ordering is probably a good quality of life feature so fixes don't have to be upstreamed.
None of those should require RBI generation time modification. All modifications should be doable at load time when compilers are being loaded.
Btw, for the `find` signature, I am actually pretty unhappy with the fact that we've limited the set of id types artificially, when it is obvious that that isn't enough. So, I generally think this is a shortcoming of the compiler itself.
I remember that we had a problem with overloads not working as expected when one of them was `T.untyped`, but I think if we are careful about the overload sig ordering, we can still use `T.untyped`:
```ruby
# typed: __STDLIB_INTERNAL

class Foo
  extend T::Sig

  sig { params(a: T::Array[T.untyped]).returns(Integer) }
  sig { params(a: T.untyped).returns(String) }
  def self.foo(a)
    T.unsafe(nil)
  end
end

T.reveal_type Foo.foo(1)           # Revealed type: String
T.reveal_type Foo.foo([1, "a"])    # Revealed type: Integer
T.reveal_type Foo.foo([[1, "a"]])  # Revealed type: Integer
```
seems to work just fine and is a more "correct" signature for the method.
@KaanOzkan do you remember why we went the way of specifying all these id types?
Ah, found the problem: https://github.com/Shopify/tapioca/pull/1799#discussion_r1511362645
```ruby
# typed: __STDLIB_INTERNAL

class Foo
  extend T::Sig

  sig { params(a: T::Array[T.untyped]).returns(T::Enumerable[Foo]) }
  sig { params(a: T.untyped).returns(Foo) }
  def self.foo(a)
    T.unsafe(nil)
  end
end

array_of_untyped = T.let(T.unsafe(nil), T::Array[T.untyped])
untyped = T.let(T.unsafe(nil), T.untyped)

T.reveal_type Foo.foo(untyped)           # Revealed type: T::Enumerable[Foo]
T.reveal_type Foo.foo(array_of_untyped)  # Revealed type: T::Enumerable[Foo]
```
What happens if you change the order of the signatures?
> None of those should require RBI generation time modification. All modifications should be doable at load time when compilers are being loaded.
Exactly and that is the issue that I'm having, we load the compilers, order them alphabetically, then run them. What I'm proposing is that we load them, order them by context AND alphabetically, then run them.
Something like:
```ruby
TAPIOCA = 0
GEMS = 1
LOCAL = 2

Runtime::Reflection.descendants_of(Compiler).sort_by do |compiler|
  context = context_for_compiler(compiler)
  [context, T.must(compiler.name)]
end
```
> What happens if you change the order of the signatures?
The main issue I was having is that two compilers generate two separate methods and can't impact the order of the sigs, so whichever one is read last will be the one whose sigs are applied, e.g.
```ruby
class SomeModel
  module CommonRelationMethods
    sig { params(args: SomeIdType).returns(SomeModel) }
    def find(args = nil, &block); end

    sig { params(args: T.untyped).returns(T.untyped) }
    def find(args = nil, &block); end
  end
end
```
The sig with `SomeIdType` will never be used.
> Exactly and that is the issue that I'm having, we load the compilers, order them alphabetically, then run them.
What I was trying to say, but couldn't word properly, was that the compiler run order should be immaterial for properly constructed DSL compilers. That's why I am slightly opposed to implementing a certain kind of run order for compilers.
The main behaviour of DSL compilers should have been settled at load time, before we get to running. When we run the compilers, their behaviours should not depend on each other or on any runtime ordering (which implies the same thing).
> What happens if you change the order of the signatures?
I am assuming that was a question about the order of `T::Array[T.untyped]` and `T.untyped` in the signatures that I sent, @bitwise-aiden.
That doesn't change much; the call still matches the first signature that matches:
```ruby
# typed: __STDLIB_INTERNAL

class Foo
  extend T::Sig

  sig { params(a: T.untyped).returns(Foo) }
  sig { params(a: T::Array[T.untyped]).returns(T::Enumerable[Foo]) }
  def self.foo(a)
    T.unsafe(nil)
  end
end

array_of_untyped = T.let(T.unsafe(nil), T::Array[T.untyped])
untyped = T.let(T.unsafe(nil), T.untyped)

T.reveal_type Foo.foo(untyped)           # Revealed type: Foo
T.reveal_type Foo.foo(array_of_untyped)  # Revealed type: Foo
```
and I think I actually understand the problem now.

The argument being `T.untyped` will obviously match the first signature that is listed, regardless of what types the signature has. The ordering of signatures in the previous post is the correct one, which works properly for typed arguments, but then untyped cases match the one with the array argument (listed as the first sig), which returns an enumerable of `Foo`, and that is not the correct assumption for the majority of cases.
So, to make the default behaviour with `T.untyped` be the expected common behaviour of returning a `Foo`, we end up listing out all the expected types to remove `T.untyped` from the signature, which allows us to flip the order of the signatures so that the first one can be the one without the array, and thus the common usage of the method.
> The main behaviour of DSL compilers should have been settled at load time, before we get to running. When we run the compilers, their behaviours should not depend on each other or on any runtime ordering (which implies the same thing).
Okay, yeah. I'm with you on that. I think there is a rich set of additional tooling that would help to create signatures across compilers, but with what we have I think there is a bit of a shortcoming.
> I think there is a rich set of additional tooling that would help to create signatures across compilers, but with what we have I think there is a bit of a shortcoming.
That might be true, but I think we should address that as a problem on its own and not as a problem of DSL compiler ordering.
For example, we could create a `create_or_modify_method` helper for DSL compilers, which they could use exclusively and which has some kind of conflict resolution semantics built in. That might be another way for DSL compilers to work collaboratively.
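To make the idea concrete, here is a rough sketch of what such a helper could look like. Everything in it is hypothetical: the `MethodNode` struct, the `prepend:` flag, and the shape of `scope` are invented for illustration and do not match Tapioca's actual RBI tree API.

```ruby
# Hypothetical create_or_modify_method helper: if the method already
# exists in the scope, the new sig is added as an extra overload
# instead of clobbering the method, so two compilers can contribute
# sigs to the same method without caring about run order.
MethodNode = Struct.new(:name, :sigs)

def create_or_modify_method(scope, name, sig, prepend: false)
  node = scope.find { |m| m.name == name }
  if node
    # conflict resolution: the caller chooses whether its overload
    # should be tried before or after the existing ones
    prepend ? node.sigs.unshift(sig) : node.sigs.push(sig)
  else
    scope << MethodNode.new(name, [sig])
  end
end

scope = []
create_or_modify_method(scope, :find, "returns(T.untyped)")
create_or_modify_method(scope, :find, "returns(SomeModel)", prepend: true)
# scope now holds a single :find node with both sigs,
# the prepended one first
```

Since sig order matters to Sorbet's overload matching (as shown above), letting the second compiler choose prepend vs. append is what makes the run order immaterial.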
Basically, I'd like to keep the solution space wide enough to allow those kinds of solutions without committing to any ordering guarantees.
> For example, we could create a `create_or_modify_method` helper for DSL compilers, which they could use exclusively and which has some kind of conflict resolution semantics built in. That might be another way for DSL compilers to work collaboratively.
+1 for this. I was definitely wanting for this over the weekend.
I am closing this issue, since the specific ask is, in my opinion, not something we want and the root cause for needing this was addressed in #1967.
If you would like to open another issue with concrete examples of DSL compilers needing cross-compiler collaboration, we could discuss the possible approaches in there.
Description
At present when compilers for DSL generation are loaded they are alphabetized and run in that order:
https://github.com/Shopify/tapioca/blob/8678e2ab7b853ccbf3f092ccf553e58773eaa38b/lib/tapioca/dsl/pipeline.rb#L115-L123
What this effectively means is that if two compilers create the same method with different signatures, whichever compiler runs last will be the one to define the signature used. If someone wanted to write a different signature than a Tapioca-defined one, their only recourse would be to name their compiler something that comes after `Tapioca::Dsl::Compilers::*` alphabetically.

Having the ability to push the compiler forward or back in the run order would make it possible to have explicit control over the order, rather than relying on discovering the above code and then knowing to name the compiler something that puts it in the right place. Naming is already hard enough; name golf to get your compiler to run in order, although fun, seems a little counterintuitive.
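As a minimal illustration of the constraint (the compiler names below are hypothetical):

```ruby
# The pipeline sorts compiler classes by name, so which sig "wins" a
# collision is decided purely by how the class names happen to sort.
compilers = [
  "Tapioca::Dsl::Compilers::ActiveRecordFind",
  "MyProject::Compilers::ActiveRecordFind",
]

run_order = compilers.sort
# "MyProject::..." sorts before "Tapioca::...", so the project
# compiler runs first and Tapioca's signature overwrites its output
```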
Solutions
Order value
One solution would be to have an `order` value: an integer representing where the compiler is expected to run. The code above could then be updated to order by this value first, then alphabetically, to ensure a stable run order. The order value could be an integer between 0 and 100, defaulting to 50 if not set, to leave room on either side.

This would then allow projects to set the value for their compilers based on where they are needed, and even to override ones from their gems or Tapioca to ensure the order they need.
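A sketch of how the ordering could work, under the assumptions of this proposal (the `order` class method, the 0-100 range, and the compiler names are all hypothetical, not Tapioca API):

```ruby
# Sort compilers by an optional `order` value first, then by name as
# a stable tie-break, so naming no longer dictates the run order.
class Compiler
  def self.order
    50 # default sits mid-range, leaving room on either side
  end
end

class AaaProjectCompiler < Compiler
  def self.order
    90 # run late so its sigs win collisions, despite its name
  end
end

class ZzzGemCompiler < Compiler; end

sorted = [AaaProjectCompiler, ZzzGemCompiler].sort_by do |compiler|
  [compiler.order, compiler.name]
end
# ZzzGemCompiler (order 50) now runs before AaaProjectCompiler (90),
# even though it sorts last alphabetically
```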
Method override flags
Given that the heart of the problem is that a method needs to be defined after an existing one in order to override its signature, maybe optional flags could be passed when generating a method to determine how it interacts with existing or future methods of the same name. It would have to handle future cases too, because of the aforementioned alphabetical ordering.
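One way the flag could behave, as a hedged sketch: the `Registry` class and the `override:` flag below are invented for illustration and are not part of Tapioca.

```ruby
# "Method override flags": each definition carries a flag saying
# whether it should win a name collision, so the outcome no longer
# depends on which compiler happened to run first.
class Registry
  def initialize
    @methods = {}
  end

  def create_method(name, sig:, override: false)
    existing = @methods[name]
    # store if the slot is empty, or if we are an override and the
    # existing definition is not; this also covers the "future
    # methods" case, because a later non-override call cannot
    # displace an already-stored override
    if existing.nil? || (override && !existing[:override])
      @methods[name] = { sig: sig, override: override }
    end
  end

  def sig_for(name)
    @methods[name][:sig]
  end
end

a = Registry.new
a.create_method(:find, sig: "default")
a.create_method(:find, sig: "custom", override: true)

b = Registry.new
b.create_method(:find, sig: "custom", override: true)
b.create_method(:find, sig: "default")
# both registries resolve :find to the "custom" sig,
# regardless of definition order
```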