usethesource / rascal

The implementation of the Rascal meta-programming language (including interpreter, type checker, parser generator, compiler and JVM based run-time system)
http://www.rascal-mpl.org
Other
403 stars 77 forks source link

Do we really need circular imports? #624

Closed mahills closed 4 years ago

mahills commented 10 years ago

Currently we allow circular imports in Rascal. This is useful in some cases, but definitely makes the structure of tools like the checker more complex, since it isn't possible to (for instance) first check imported modules, and then check the importing module. Although I don't believe we do this (I'm going to check), it is possible to make fairly complex circular dependencies between the modules right now, for instance, by defining a type A in module M1 which depends on type B in module M2 which depends on type C in module M1. So, this is an open question: do we want to support these? If our only use case is with mutually recursive data type definitions, we do have abstract definitions, but this is more work for the Rascal developer. Thoughts?

mahills commented 10 years ago

Yes, to some extend I'm asking this to make my life easier, but there is also a question about whether this adds to conceptual complexity or opens the door to unusual error cases. If so, and if we continue to support this, we will want to add checks for specific circular cases that could cause problems, such as aliases defined in terms of one another.

PaulKlint commented 10 years ago

My first inclination is to make an inventory of the important use cases and see whether we can restrict circular imports in such a way that our lives (of Rascal developers and users) becomes simpler. I think the following items should at least be allowed to be spread out across different modules:

The motivation is that we want to be as expressive as possible to make Rascal programs as structured and extensible as possible: we want good tools for solving the expression problem.

However, this list already covers nearly all kinds of declarations except aliases and annotations so this does not help much to reduce complexity. Maybe we should therefore invert our approach and ask the question: are there restrictions that we can we impose on circular imports that:

tvdstorm commented 10 years ago

@PaulKlint Re the expression problem: this requires extend, don't know if these are/can/should be circular.

mahills commented 10 years ago

I'm still thinking this through, but I think the truly problematic cases are ones that involve aliases and type parameter bounds. For the latter, we are restricted here anyway -- Tree is really the only ADT we can properly use as a bound, since we don't have a subtyping relation between datatypes. For aliases, this is tricker -- we could have an alias point to another type, defined in another module, which itself is an alias that points to another type defined in another module, etc. In the worst case, we could have a multi-step circular definition that is meaningless, but even in meaningful cases this could create a circularity at the level of modules.

For extend, I think we don't want circularity -- in my opinion this would be an error, since extend is basically a copy of the extended module into the extending module.

PaulKlint commented 10 years ago

Indeed, extend non-circular and serious restrictions on aliases and type bounds. So let's start with a strong constraint: all elements of an alias declaration or type bound should be defined in the same module or in a non-circularly imported module:

mahills commented 10 years ago

That should help. For now I'm focusing on tweaking the current version; after that, I'll look at caching results of running the checker (or using something like M3 to represent the results) and using those results directly, versus what is done right now, which is to extract signatures from imported modules and use those to populate the starting type and name environments. I'll probably add an additional check to only import names (the names in fcvEnv in the checker) that are used in the current module, which should reduce errors where types are not in scope but are used in signatures.

tvdstorm commented 10 years ago

I think we should avoid optimizing the implementor's work (sorry @mahills!), but optimize manual size (smaller = better). This means as few arbitrary restrictions as possible. Circular extend, as far as I can see, does not make sense (like circular inheritance feels off). But for import, we should aim for generality.

tvdstorm commented 10 years ago

Re extend: it's not a copy, -- that's how it's implemented. It's like inheritance.

mahills commented 10 years ago

Although I'm happy to have my work optimized :) my bigger concern is wanting to make sure the semantics make sense. For import, I think circular is fine as long as we have some clear rules for what is allowed, and we have checks to make sure these aren't violated. From a practical point of view, this does make the implementation harder, but probably of more concern is it may also make it slower. What I'm doing now, using type signatures for the modules, helps with this some, but has its own problems -- types visible in these imported declarations may not be visible in the importing module, so we need to do additional work to either flag these and figure out where they come from (so we can type it) or add additional logic to catch these problems and ignore them if possible.

For extend, I realize that copying is just the implementation, but was just getting at the point you were getting at -- it doesn't make sense to say "module A includes everything in B, which includes everything in C, which includes everything in A, which includes...".

swatbot commented 10 years ago

How about: aliases can only be defined in terms of non-aliases? — Jurgen J. Vinju CWI SWAT INRIA Lille UvA master software engineering http://jurgen.vinju.org

On Wed, Jul 30, 2014 at 10:56 PM, mahills notifications@github.com wrote:

Although I'm happy to have my work optimized :) my bigger concern is wanting to make sure the semantics make sense. For import, I think circular is fine as long as we have some clear rules for what is allowed, and we have checks to make sure these aren't violated. From a practical point of view, this does make the implementation harder, but probably of more concern is it may also make it slower. What I'm doing now, using type signatures for the modules, helps with this some, but has its own problems -- types visible in these imported declarations may not be visible in the importing module, so we need to do additional work to either flag these and figure out where they come from (so we can type it) or add additional logic to catch these problems and ignore them if possible.

For extend, I realize that copying is just the implementation, but was just getting at the point you were getting at -- it doesn't make sense to say "module A includes everything in B, which includes everything in C, which includes everything in A, which includes...".

Reply to this email directly or view it on GitHub: https://github.com/cwi-swat/rascal/issues/624#issuecomment-50678509

mahills commented 10 years ago

That would be fine with me, I have a fix point process right now for computing the type, just in case we have chains. These chains are actually one of the more problematic features -- it makes it possible to "leak" types out across a number of modules.

swatbot commented 10 years ago

it would be slightly annoying to have to copy paste nested aliases into the definition, but in terms of reuse and abstraction we are not losing much. The aliases leak the structure of the aliased type anyway in any code that uses the values. So I would find this an acceptable restriction and one that is easy to predict and check.

On Thu, Jul 31, 2014 at 3:41 PM, mahills notifications@github.com wrote:

That would be fine with me, I have a fix point process right now for computing the type, just in case we have chains. These chains are actually one of the more problematic features -- it makes it possible to "leak" types out across a number of modules.

Reply to this email directly or view it on GitHub https://github.com/cwi-swat/rascal/issues/624#issuecomment-50759669.

Jurgen Vinju

mahills commented 10 years ago

We can leave it in for now (it's already there), but this "linking" is a real problem -- it can easily introduce a dependency on a type that hasn't been imported. Circularity is an issue in theory, but I'm fairly sure nobody has ever accidentally defined a circular chain of aliases. I'll try to run a query over the existing libraries today and see if we have any of these chains, if people are defining aliases in terms of other aliases fairly regularly now it's probably something we want to continue to support.

jurgenvinju commented 10 years ago

well, we could also have a "quick fix" for the error that an aliased is referenced and simply include it there and then in the source text. After the restriction there is no more than one level of indirection at edit time..

for people that currently use long chains of aliases, it will be a matter of applying the same quick fix a number of times until the entire chain is flattened.

On Thu, Jul 31, 2014 at 4:21 PM, mahills notifications@github.com wrote:

We can leave it in for now (it's already there), but this "linking" is a real problem -- it can easily introduce a dependency on a type that hasn't been imported. Circularity is an issue in theory, but I'm fairly sure nobody has ever accidentally defined a circular chain of aliases. I'll try to run a query over the existing libraries today and see if we have any of these chains, if people are defining aliases in terms of other aliases fairly regularly now it's probably something we want to continue to support.

— Reply to this email directly or view it on GitHub https://github.com/cwi-swat/rascal/issues/624#issuecomment-50764968.

Jurgen Vinju