"language versions" are a strawman; real code depends on the runtime

(Disclaimer: I avoided "Rakudo" in the title because this is not another iteration of "we need other compilers than Rakudo".) This issue is a reiteration over JJ Merelo's previous issues https://github.com/Raku/problem-solving/issues/253 https://github.com/Raku/problem-solving/issues/277 in order to show that these are legitimate issues that keep growing over time. I try to present it from a different angle now; not because I think it has an easily plannable solution but in the hope that at least the point gets through.

To the urge to implement proper Rakudo release strategy, it seemed like an easy way out (for me as well) to just say: "released user code just should not depend on Rakudo; it should depend on the language version and Rakudo should implement that".

This sounds good in theory but it doesn't work in practice. Let me elaborate.

Blin: a blessing and a curse

Blin is not the reason why "language versions are a strawman"; it is a great tool to avoid having to face this problem.

The idea behind Blin and its predecessor Toaster is that you can ensure compiler backwards compatibility regarding publicly available user code. However, given that

Rakudo is always subject to the language specification - Roast (supposedly) ensures this
User code defines the language version required for it to work

Blin should not be necessary and it should never catch regressions. If it did, at least one of the two things happened:

the definition of the used language version changed
the user code in question depended on things that aren't subject to the language specification

Both of these things tend to happen, especially the latter. These are anomalies that should be mitigated but it's too easy to just silently push the compiler back to the way the user code expects it to behave.

Things that are "internal" but should be "the language"

Long story short: Roast is very far (possibly hopelessly far) from being exhaustive. Just think about the "tests needed" label in the Rakudo issues, even abandoned, there are loads of issues like that. (You may say that "tests" refers to Rakudo tests, not spectests - but mind you, users barely have means to develop for Rakudo specifically so the vast majority of those issues would turn into Raku issues immediately.)

If you find odd behavior, it's hard to deduce what the right behavior is supposed to be according to Roast - but you can assume that Roast didn't catch it, and it's quite possible that it contains nothing related. There is an even bigger problem with negative statements, ie. what the language prohibits or just considers "gray zone".

Rakudo is concrete software. Raku is abstract and shallow but also very broad. The problem with Roast is not only that the users cannot know where Raku ends and where Rakudo starts but also that the great variety of special cases fosters a lot of unfortunate assumptions. So assigning to hashes works, right? What about object hashes? What if one of the elements is an "object hash"? If we take the language specification seriously, it doesn't say anything about whether you can do something like that or not. Now it does but ironically enough, that still couldn't prevent a serious regression.

So yes, I wouldn't underestimate the mere amount of test cases to get proper coverage, and that wasn't the only breakage related to my own activity: I lobbied for numeric coercions of Dateish which then broke the tests of a lizmat distribution, and transitively App::Rak::Complete.

By the way, this is a good example that it can have painful consequences if users start writing tests that Roast doesn't cover but probably should: it's possible that the code would keep working just fine but the test that seemed commonsensical at that time will break all of a sudden, scaring end users away.

Things that may be rightfully "internal" but also rightfully used

It's hard to define this category when the language specification is so fragile but I'm mostly thinking of metamodel-related stuff. It makes sense that it can cripple the implementation (Rakudo or else) if dynamic reflection-like features are too much set into stone - but it also makes sense that these features are what some people want, for greater profit. This is a peculiar case because here it may even be legitimate that the code depends on the runtime. I think if we scratched the surface more, we would find a lot of stuff like that. What does NativeCall work with, for example? Maybe only MoarVM?

Anyway, the point is that sometimes you would indeed want to know "this will run on a runtime that provides C types and FFI" or "this will run on a certain state of the 6model". Not sure if that should be all forced upon the whole language, though. Another reason to think about "compiler dependency"...

Blin cannot guarantee compiler backwards compatibility*

*nor anything else we have.

This is a banal one: even when ignoring that has been said so far, one may still want to harden the dependency on the runtime, simply because no amount of testing we may perform can guarantee the same level of certainty that the same code will work the same way. Perhaps this is the angle @JJ was originally coming from. Either way, I think this is fair enough. If you know software projects, especially in the field of programming language compilers and runtimes, that don't do some (usually semver-inspired) versioning, I would be curious about the experiences. Node, Deno, (C)Python, PHP, JRE, .NET, GCC, FPC, SWI-Prolog, Erlang OTP - I think you get the idea, all the things I have ever used had compilers and/or runtimes that had versioning. I suppose Perl is like PHP in this regard; the language and the implementation go hand in hand.

I can even vaguely recall something about "home-baked" Rakudo builds for Debian or something - an older version with patches applied to it, rather than the latest upstream.

Regarding the demand to have a certain version of Rakudo (or any other compiler/runtime) with only certain "patches" applied to it, it's not helpful or realistic to say "we just won't break anything, duh"; especially not when it's not clear what qualifies as "anything". Since a "straw" language version based on Roast doesn't achieve much to define it, "anything" can really be anything so that would mean that Rakudo gets stuck in a certain state and after that, nothing that has any effect on possible code, can be changed.

All in all, this is not an easy (neither a simple) issue and it's quite possible there is no straightforward solution to it - but I don't think it will go away just because we put our heads into the sand. In fact, it's likely to only get worse over time, as more code gets written. Besides a solution, even reassuring words about why the situation isn't as bad as it seems, are very welcome.

This is hardly a proposed solution, only a comment about the implications I see.

Extending the language specification to fulfill its purpose seems unavoidable. It seems like a not too "engineer-y" task, a huge task nevertheless, the tests could easily grow by a magnitude or more. The principle is simple: a lot of Rakudo behavior would need to be turned into explicit tests. Having said that, this framework may not even be enough to define rules like "the sum of two Date instances must always be an Int" (just an example), and anyway, how do you test that with a black box? Maybe the whole "we will perform some tests and whatever passes is good to go" approach is helpless. Again, any good references?

Now, in an ideal world, only a very limited subset of user code would still depend on the runtime afterwards. However, that could still include modules that others depend on, thereby "poisoning" the dependency tree.
Also, in a not-so-ideal world, there can still be bugs and accidental breakages, meaning that some of the reasons why one would want to version Rakudo anyway would persist.
For this, I can only say, it's probably not as bad as it seems at first. Currently, CORE-c-d-e stuff is mostly a clever trick to achieve just that: possibly breaking changes go there, even if they never contradicted Roast. Now imagine that a versioned Rakudo release is only created once a year, or even less often; the frequency being comparable to the language versions that were abused for this effect. Right now, if you want to be "safe in production", you have to use v6.d which came out in 2018. Ever since, your code has to be bug compatible, even if you are writing your code in 2023. If there was a versioned Rakudo release every year, or just every two years, you could have a better-functioning tool to lock to. In this situation, to impose bug compatibility on the notion of the language rather than Rakudo is essentially like only releasing Rakudo in 2018 and then, say, 2025, offering people in 2023 the choice between the good old Rakudo Perl 6 and the bleeding edge.

Anyway, in a world where people would only opt in to depend on a Rakudo version, it would be much easier to say "if you are not doing anything shady, you will have a much easier time depending on a Raku version".

I've always had problem with this, roast is supposed to be what defines Raku, but any change in Rakudo may break your code(which may even be correct according to roast), it can be caused by new features, experimental features and bugs and well meta programming is its own thing.

There are features added to Rakudo which were not part of Raku, some of them may not have tests and others' test have been written after it was added to Rakudo. Bug fixes shouldn't be taken lightly either.

So what this means is that you can write use v6.c(which was released 8 years ago) and also write some code which uses new things added to Rakudo(and maybe added to roast as well!) and your code works! Why would a code which was added recently work with a Raku version which was released years ago? So If someone sees a feature in roast and writes it using use v6.c and uses a Rakudo which implements Raku v6.c, then runs their code, it won't work, and the answer to the user will be "update your Rakudo".

I haven't spent much time thinking about it, I'm just making a quick comment, but first thought that has come to my mind is that changes(new/experimental features, bug fixes) should be defined in roast first, then Rakudo/implementations can ... well implement it. So I think this means all new roast tests need to be placed in new versions and I think this would also mean more frequent releases of Raku will be needed.

This may seem hard to do or impossible, but I think it would be better if we become more strict and require 3 things:

Write tests in roast and always put them under new versions
Write documentations for it, mentioning the version
Then language implementations are allowed to implement it within the specified version(of course, we can't enforce it!)

I fixed a link and a thinko on the way.

@CIAvash it seems like you are focusing on the first point of the anomalies spotted by Blin: the point of the definition of the used language version changed. This is also a pretty important point and I wonder what would happen if different releases of Rakudo weren't only benchmarked for performance every now and then but also version compliance... it could easily happen that something that implemented v6.c according to 2015, doesn't implement it according to 2023.

I think this is actually the easier part. I suppose it's safe to assume that Roast didn't have any breaking changes for a given version, that is, what passes today, would have passed in 2015 as well. Declaring "one should never change Roast for a published version" is a simple matter of discipline, so to speak.

The problem is that virtually all code released in production so far has already been damned by the other point: the user code in question depended on things that aren't subject to the language specification. Say, you write code in perfect accordance with "language version v6.c, as it was released in 2015". What does it say about your code? How will pairs compare with smartmatch? What about the truth value of empty ranges? What can you expect from type objects when you call a certain method on them? What does a smartmatch do with a regex quoting structure? What utilities may/may not have the WhateverStar act as the identity function?

Basically, since Rakudo is not allowed to regress on Roast (even if it can ignore some parts), every time there is a bug in the core, we actually have no legitimacy to say it was a bug, because a compiler that behaves that way can pass the specification of the language! We simply have a commonsensical understanding of what we would wish Raku to be, and push towards that - except when Blin shows that somebody already depends on that behavior.

So I think the situation is actually much worse than the language shapeshifting.

PS: it's also interesting that the default used version is "latest supported", not "earliest supported". The default itself is "offensive", and then we can resort to user-blaming... that could be another issue.

Good point. But I don't think we can do anything about the past, so we should focus on fixing the problem from now on. I still think the solution I mentioned is good, with a few short and clear rules and discipline, it can be done. But since I have not been involved in writing/contributing to an implementation, I can only make suggestions and ask questions. Can an implementation only implement the parts that are clearly defined in Roast and give an error in cases Roast has not defined them? And then when it's decided what should be done about the undefined behaviors, they should be added to the Roast under a new version, then implementations implement it. In other words do not write generic code, but be very specific about the behavior.

So I see three cases, unless I missed or misunderstood something:

If a behavior was defined in Roast, but an implementation did something else: This must not happen because the behavior was defined and the implementation should not have passed the tests, so it's not implementing Raku
If a behavior is not defined, implementations should refrain from implementing it
If a behavior is not fully defined, implementations should only implement what is defined and do not guess how it should behave

Again these are just thoughts of someone who has not been involved in writing an implementation.

I also agree that the default version shouldn't be the latest, this forces users to specify what actual version they want.

A not completely related topic, if Raku is released (more often) with the changelog, then implementations' changelogs should only mention implementation specific changes and if they implemented a new Raku version, they should just mention that in their release, e.g. "Implements Raku version X", instead of listing Raku things it implemented, in other words it shouldn't partially implement a Raku version.

I'm going off the rail here, but another thing that I think was mentioned somewhere(don't remember where), if we are in a project(there is a META6.json), Raku should enforce the versions specified there(version of Raku and dependencies), unless a specific version is specified in the code(which should be compatible with META6.json). If I remember correctly, that is not the case now. I think if a version wasn't specified for a dependency, it should be an error and the project and the dependencies shouldn't be installed.

As things stand, Roast which is the sole authoritative "document" of the Raku language, is really just a best-effort collection of tests that either run or don't, and if they run, they either pass or don't.

I think it's a decent mental framework if you think of Rakudo and Roast as any sort of distribution you want to publish and its corresponding tests. (In reality, the situation is a bit worse because Rakudo is not all implemented in the terms of Raku itself but let's use this simplification.)

On one hand, this should answer your question about the feasibility of tying Rakudo internals to Roast tests - it's all black-box testing, there are no practical means (that is, other than some tremendous fudging effort based on someone's own knowledge of Rakudo, and then trying to keep that in sync with Roast) to reflect on underlying behavior. User code goes in and something happens, we can decide whether that something that happened is nice or not.

On the other hand, this in itself should raise some questions about the feasibility of the basic idea that Roast can be the sole authoritative document, itself.

If you released your code as a distribution, would you go ahead and claim that the tests you attached define the interface of your distribution? Do you think there is a sufficient amount of test cases that could define an interface, especially if all your tests are subject to the same limitations as the implementation itself, because it's all written in the same language?

Actually, the last point is a big one. Raku might be capable of implementing Raku but is it capable of defining it? You would want to mandate banal stuff like "the result of certain operation should be of type X, or else it should only ever throw exception Y". How do you describe that in terms of Raku? And mind you, this is much more trivial than defining smartmatching or list assignment for all input ever imaginable.

In fact, what I proposed earlier for extending Roast is not much different from some tremendous fudging effort based on someone's own knowledge of Rakudo, and then trying to keep that in sync with Roast. We would have to go through as much Rakudo code as possible, decide whether what it does is logically right, and manually create tests from the logical branches, thereby basically modeling the language after the only working implementation. This is still based on the premise that Roast would stay a bunch of "rakutest" files, therefore it wouldn't have to mean that it ties the language to Rakudo. Only that all user space behavior of Raku would be based on the only user space behavior they have experienced, thanks to Rakudo.

By now, I have talked so much about the unfeasibility of current Roast - which is the most universal issue in the whole topic anyway, in my opinion - that I feel obliged to provide another way out: let's just acknowledge that there is no real language standard and all code people have been writing depends on Rakudo releases. This is true either way. If we have a strategy to map "language versions" of previous code to an actual Rakudo "reference implementation", then at least we don't fool ourselves and users that you can continuously upgrade Rakudo and none of your code will ever break.

One more point, also related to what @CIAvash said:

https://github.com/finanalyst/raku-pod-render/issues/25

There are a lot of possible interpretations of the situation. One could say that the distribution used something that isn't spec - while this is true, it's not particularly helpful when the spec really doesn't contain a lot of things. Now we either just push users to update, or the author can either back off, or somehow promote that certain versions of the dist work with certain versions of Rakudo. If dependency on Rakudo could be expressed at least, the dependency manager could save users the trouble to retrieve and fail to install a version that doesn't work with their Rakudo.

Raku / problem-solving