The next feature I intend to work on is making the static type checker more robust when dealing with public functions and classes. I hope to figure out a way to split this into multiple incremental pull requests for ease of review.
Suppose program A is compiled, then program B is compiled, and then A is changed in such a way that B would have failed to compile had the change been made before B was compiled. In that case the state of program B should immediately and automatically change from "compiled" to "syntax error". In the current version, all compiled programs stay compiled, which may cause run-time errors or even crash the game.
If A is changed such that B can compile again, B should immediately and automatically change back to "compiled".
This change should also fix https://github.com/colobot/colobot/issues/1701, because programs with missing dependencies will change state to "syntax error" before the game is saved, allowing loading to correctly recreate the saved state.
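To make the intended behaviour concrete, here is a minimal Python sketch of the state transitions described above. It is only an illustration, not the actual compiler code: the `Program` class, the `recheck_all` helper and the status strings are hypothetical, and public functions are identified by name only (no parameter or return types).

```python
from dataclasses import dataclass, field


@dataclass
class Program:
    # Hypothetical model of a program; none of these names come from the code base.
    name: str
    defines: set = field(default_factory=set)  # names of public functions it defines
    uses: set = field(default_factory=set)     # names of public functions it calls
    status: str = "not compiled"


def recheck_all(programs):
    """Re-derive the status of every program from the current public functions."""
    available = set()
    for p in programs:
        available |= p.defines
    for p in programs:
        p.status = "compiled" if p.uses <= available else "syntax error"


a = Program("A", defines={"Foo"})
b = Program("B", uses={"Foo"})
programs = [a, b]

recheck_all(programs)
status_initial = b.status      # B compiles while A defines Foo

a.defines = set()              # A is edited: Foo is removed
recheck_all(programs)
status_after_edit = b.status   # B no longer compiles

a.defines = {"Foo"}            # A is edited again: Foo is back
recheck_all(programs)
status_after_fix = b.status    # B compiles again

print(status_initial, status_after_edit, status_after_fix)
```

The point of the sketch is that B's status is never cached independently of A: it is recomputed whenever any program changes.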
Rationale behind this behaviour:
---
We have a statically typed language. Because the language has public functions and public classes, type checking of one program may depend on another.
A scenario:
1. Program A defines a public function Foo()
2. Program B uses the function
3. Program A is edited in such a way that if the edit had been made before compiling B, B would have failed to compile
What should happen when we run B?
Options:
1) We can't run B: syntax error
2) B becomes frozen in time when it is compiled - it always uses a snapshot of A before the edit
3) We can run B. It uses the updated version of Foo() if it can. In the above scenario it can't. It falls back to using a saved snapshot of Foo()
4) We can run B, but calling Foo() throws a run-time error
5) We can run B and can call Foo(). It will execute the edited version of the function (if a suitable overload is available). The type of the return value may be wrong. If the type of a value at run-time differs from the type it had at compile-time and we try to use an operation on it that was allowed by its compile-time type, but is not allowed by its run-time type, we throw a run-time error (e.g. dividing `string` by `point` throws an error). The type `void` allows no operations. Any missing functions become no-ops that return `void`. If the number or types of arguments are incorrect, this is a missing overload - same as a missing function.
6) We can run B and can call Foo(). It will execute the edited version of the function (if available). Instead of throwing run-time errors any illegal operations produce a value with the run-time type `void`. We somehow define what should happen when `void` is used in the condition of an if statement or a while loop.
* 6 is the worst for debugging. Some code will keep working. Some will mysteriously not work. There will be no errors to help the players figure out why. There is a risk of incorrectly typed values being written to the shared state (such as static class fields) causing endless debugging fun ;)
* 5 is worse than 4. Firstly, it leaves the player scratching her head if the number or types of arguments are incorrect: "I am calling the function. Why is nothing happening?". Secondly, it delays the run-time error until the return value is used, which may leave players puzzled about where the incorrectly typed value came from.
* 4 will waste the player's time. The player still has to fix the error, but she will only find out about it when B calls `Foo()`. This may be several minutes or even half an hour after the program starts running.
* 3 is worse than 2: it makes players think they can edit A and the changes will affect B, but sometimes it silently breaks this expectation.
* 2 will cause confusion. Players will often forget to recompile B and be confused about why their changes to A do nothing.
* I think option 1, "fail fast", is the most user-friendly and will cause the least confusion.
---
This will require refactoring how the compiler works:
When a program is created, deleted or edited:
1) We collect:
   * For every public function defined in the edited program:
     * name
     * names of parameter types
     * name of the return type
   * For every public class defined in the edited program:
     * name
     * name of parent
     * names of fields and names of their types
     * names of methods, names of return types of methods and names of parameter types of methods
   * This information does not depend on other programs.
   * This information is sufficient to type-check programs that depend on this program.
2) We redo the type-check of all programs
   * Possible optimization: if neither the previous nor the new version of the edited program defines public functions or classes, we don't have to redo the type-check of other programs.
   * During this step we collect information about which programs depend on which programs:
     * A program that uses an overload of a public function depends on the program that defines this overload.
     * A program that uses a class depends on the program that defines it. Places to check: return types, parameter types, variable types, the `new` operator, types of fields in a class and the parent of a class. Did I forget anything?
   * If two programs define a public class with the same name, both are a syntax error.
   * If two programs define identical overloads of the same public function, both are a syntax error.
   * Note: this step may add or remove dependencies from programs that have already been compiled.
3) If a dependency failed to compile, it is still counted as a dependency. All programs that depend on it fail to compile too
4) A running program is stopped if any of its (transitive) dependencies were edited, added or removed. This is needed because there is no general way to continue running a program once it has been modified in the middle of execution. This should fix bugs like colobot/colobot#1628 and colobot/colobot#1494
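
The four steps above could be sketched roughly as follows. This is a simplified Python model rather than the planned implementation: it covers public functions only (no classes), a signature is a hypothetical `(name, parameter type names)` tuple, and `recheck` and `programs_to_stop` are made-up helper names. It also only handles the case where an existing dependency is edited. Handling added or removed dependencies would require comparing the dependency graphs from before and after the edit.

```python
from collections import defaultdict


def recheck(programs):
    """programs maps a program name to {"exports": signatures, "uses": signatures}.

    Returns (status, deps): status maps each program to "compiled" or
    "syntax error", deps maps each program to the programs it depends on.
    """
    # Step 1: collect public interfaces and remember who defines each overload.
    definers = defaultdict(set)
    for name, prog in programs.items():
        for sig in prog["exports"]:
            definers[sig].add(name)

    # Identical overloads defined by two programs: both are a syntax error.
    failed = set()
    for names in definers.values():
        if len(names) > 1:
            failed |= names

    # Step 2: type-check every program and record its dependencies.
    deps = {name: set() for name in programs}
    for name, prog in programs.items():
        for sig in prog["uses"]:
            if sig not in definers:
                failed.add(name)  # missing function or missing overload
            else:
                deps[name] |= definers[sig] - {name}

    # Step 3: a failed dependency still counts, so failure spreads to dependants.
    changed = True
    while changed:
        changed = False
        for name in programs:
            if name not in failed and deps[name] & failed:
                failed.add(name)
                changed = True

    status = {n: "syntax error" if n in failed else "compiled" for n in programs}
    return status, deps


def programs_to_stop(edited, deps, running):
    """Step 4: running programs that (transitively) depend on the edited program."""
    def depends_on_edited(start):
        seen, stack = set(), [start]
        while stack:
            for dep in deps.get(stack.pop(), ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return edited in seen
    return {p for p in running if depends_on_edited(p)}


programs = {
    "A": {"exports": {("Foo", ("int",))}, "uses": set()},
    "B": {"exports": {("Bar", ())}, "uses": {("Foo", ("int",))}},
    "C": {"exports": set(), "uses": {("Bar", ())}},
}
status_before, deps_before = recheck(programs)
stopped = programs_to_stop("A", deps_before, running={"B", "C"})

# A is edited: Foo now takes a float, so the overload used by B is gone.
programs["A"]["exports"] = {("Foo", ("float",))}
status_after, deps_after = recheck(programs)
print(status_before, stopped, status_after)
```

In this example C never calls `Foo()` directly, yet it is stopped and then marked as a syntax error because its dependency B fails. That is the transitive behaviour steps 3 and 4 ask for.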