Open Ovid opened 4 years ago
I generally agree with @tommybutler
We can create type aliases, so the long-hand and short-hand type names will be the same.
RPerl type names are purposefully expressive AKA long-hand:
my boolean $foo = 0;
my unsigned_integer $bar = 23;
my integer $bat = -23;
my number $bax = -23.4;
my character $baz = 'a';
my string $buz = 'howdy';
We can create aliases for those who feel compelled to save a few characters of typing, short-hand type names:
my bool $foo = 0;
my uint $bar = 23;
my int $bat = -23;
my num $bax = -23.4;
my char $baz = 'a';
my str $buz = 'howdy';
Just to be clear, while I like the my $foo :isa(Int); syntax, I'm not wedded to it. If we create a sound type system and it involves a different syntax that the community is happy with, I'm OK with that. The main reason I prefer that syntax is because it gives us tremendous consistency with other "adjectives" which modify the noun variable. For example:
has $cutoff :isa(PositiveInt) :reader :builder;
In the above, everything which modifies $cutoff goes after the variable and has a more or less consistent syntax. Further, it seems (to me) to have the potential to be more extensible as we figure out new kinds of types we can drop into :isa(...) (perhaps via something like use My::Types qw(...);).
Again, that's a preference, not an insistence.
Aside from the "overly punctuated" syntax, I also tend to prefer postfix type declarations the way Go does them. This is for the same reason that I find French to sometimes be more understandable due to usually putting less important information (adjectives) after the more important information (nouns). Linguistically, these are called post-positive adjectives.
For example, if we're talking about the "fast, red, old, wet car", we have to keep the ideas of "fast, red, old, and wet" in our head so that when we hear the noun, "car", we can map those adjectives correctly. If we were talking about a "fast, red, old, wet cat", the meaning of those adjectives would all subtly change once we hit the noun. But if the noun's first, we can easily put the adjectives in context: "the cat fast, red, old, wet". It's not something English speakers are as used to, but in Romance languages, putting the noun first provides great clarity. This is because we instantly understand the context and then can map the variations of the context easily. If you're less familiar with English, trying to understand all of the adjectives before the noun can be a struggle. But I may be overthinking this.
Again, I'm not wedded to this idea. That's a preference, not a mandate (remember, I have zero authority here).
I vote for the cleaner syntax:
my int $foo;
It is more intuitive and concise than the "overly punctuated" syntax:
my $foo :isa(Int);
While having the type on the left of the variable is common practice in many languages, and may work best for Perl, I generally prefer having it on the right for reasons of consistency. The consistency relates to pairing: with name+type pairs and name+value pairs, the name is consistently on the left, and the extra stuff with it is on the right. A common and important example of this is that routine parameter declarations and named routine arguments mirror each other structurally. While I wouldn't call it a well-designed language, this consistency is something I like about SQL.
I vote for the cleaner syntax:
my int $foo;
It is more intuitive and concise than the "overly punctuated" syntax:
my $foo :isa(Int);
That may be more intuitive for simple types, but I think it scales less well for complex types. I'm reminded (perhaps unfairly) of the delights of function pointer types in C. A sufficiently complex type may well want to span multiple lines (e.g. if some aspect of it must be a reference to a sub with a complex signature), and there would be distinct value in having the variable name first for such situations.
Of course a sane developer would declare such a type up front to give it a shorter name, but I think the underlying point remains - put the arbitrarily complex thing last, even if you expect it to be simple in almost all cases.
English is inconsistent, for example: The wet cat ran fast.
It's a lot of fun. Anyway.
Semantically, where the type lies also implies what it's doing.
my $foo = 5:u32
Seems to imply that $foo is generic and 5 is being cast as a u32. Which would hardly be different from the following...
my $foo = "5";
my $foo = 5;
Which most of the time you won't notice the difference with, unless you are using placeholders in DBD::Oracle, which looks at what perl thinks the variable is internally as the hint for what type to tell Oracle. In the above case you end up having to do something like 0+$foo to force it to an integer so queries don't randomly break.
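To make the "what perl thinks the variable is internally" point concrete, here is a minimal sketch using the core B module to peek at a scalar's flags. The helper name internal_kind is made up for illustration, and this only shows perl's own flags; it is an assumption that a driver's placeholder binding keys off the same information.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report which public "OK" flag perl has set on a scalar.
# $_[0] is an alias to the caller's variable, so we inspect the original.
sub internal_kind {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    return 'string'  if $flags & B::SVf_POK();
    return 'integer' if $flags & B::SVf_IOK();
    return 'float'   if $flags & B::SVf_NOK();
    return 'other';
}

my $as_string = "5";    # starts life as a string
my $as_number = 5;      # starts life as an integer

# Record the kinds before any numeric use of $as_string, because using
# a string as a number can cache a numeric value on that same scalar.
my $kind_string = internal_kind($as_string);
my $kind_number = internal_kind($as_number);

my $forced      = 0 + $as_string;   # the 0+$foo trick from above
my $kind_forced = internal_kind($forced);

print "as_string: $kind_string\n";
print "as_number: $kind_number\n";
print "forced:    $kind_forced\n";
```

Both values print as 5, which is why the difference usually goes unnoticed until something inspects the flags.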
To that point, when the variable is declared as a typed container it's implied that perl has to coerce or reject the value:
int $foo = 5;
dbl $foo = 5; # should perl die or quietly make this a double?
dbl $foo = 5:dbl; # ok now we are 100% certain this is what you want.
Anyway, so that's worth touching on, and Hungarian notation might be the newest policy recommendation in PBP...
Returning to the fondness for my(), to remain perlish the syntax needs to have versions with and without sigils.
my int $foo = 5;
and
my(int($foo)) = 5; # ?
See also Lexical::Types.
@apparluk You seem to have some spam in one of your replies. Can you edit that, please?
Are there any updates?
Are there any updates?
Not so far. I believe a reasonable summary of the situation could be "Sure; something like this would be good. Please provide details and implement it". :)
@leonerd At this point, I've been juggling too much and rather than drop some balls, I've put some down. As a result, Oshun seemed like one to put down because I believe you mentioned you were going to take a stab at something. If I'm wrong, my apologies.
If I remember correctly, then you'll take that stab. Or, if you like, I can write up a subset of what was proposed for Oshun and present it as a PPC so we have a place to start having structure. I'd try to write it with the syntax you seem to favor. I'm not bothered by which so long as something happens. Should I spend some time writing up the PPC, or will you go ahead and work on this, or is another approach warranted?
@Ovid My current plan is
:Checked
variant for method/func signatures. Module name as yet undecided
call_sv()
This will be a somewhat contentious issue and I'll be a bit pedantic at times for those reading this ticket who don't understand all of the issues involved (that includes myself). In particular, I'm going to give a long, rambling justification which I'm sure P5P doesn't need, but it's here to give background to everyone else reading this.
TL;DR
We need to standardize our type syntax and semantics.
(And yes, TL;DRs need to be at the top of documents, not the bottom)
Typed Signatures
Dave Mitchell has been doing awesome work on subroutine signatures, something that is long overdue in the language (I honestly expected them as part of the Perl 6 project back in 2000).
Part of his proposal deals with types in signatures. The proposal is impressive and, from the synopsis, we have this:
Interestingly, the very first response starts with this:
There are a number of interesting comments about the proposal, but I want to focus on my primary concern: "type checking purposes outside signatures".
Long, Rambling Justification
When I work on large systems, one of the most frequent bugs I encounter is when data X is passed from foo() to bar() to baz() to quux() and while quux() was expecting an integer, it received something that was not an integer.
If I'm very lucky, the code dies a horrible death and I have to walk back through the call chain to figure out exactly where the bad data originated and frankly, I'd rather rip out my intestines with a fork than have to do that again.
If I'm really unlucky, however, the code doesn't die. Instead, it just silently gives terribly bad, wrong, no good rubbish. And no warning at all that something has gone wrong.
Oh, that's not good. So let me validate my argument with a regex!
Ah, so I need to be more careful.
OK, that's better. Finally I'm safe.
I don't even know what ꤆ is (Google tells me it's part of the Paris metro line, but I'm a wee bit skeptical on that), but I know I forgot the /a switch on my regex. And I'll bet most casual Perl developers don't know about the /a switch and I know for a fact that most large systems don't try to validate their types because it's a pain, it's more grunt work, and it's fraught with error. Or as I like to say "It'̸s ̴a pai̶n, ͝it͏'͠s͢ ̛m͞o҉ré grun͝t̕ ̴w̛ork, a̸nd ̧it's͘ f́ra̛ught wi̧th ̴er̕r̴o̷r̵."
So I applaud David's work, but then there's Cor.
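For readers who haven't met the /a switch mentioned above, a minimal sketch of the pitfall follows. The two helper names are hypothetical, and U+A906 is assumed to be the odd digit quoted above; any non-ASCII Unicode digit should behave the same way.

```perl
use 5.014;    # the /a regex modifier needs perl 5.14 or later
use strict;
use warnings;

# Assumption: U+A906 stands in for the non-ASCII digit quoted in the text.
my $odd_digit = "\x{A906}";

# Without /a, \d matches any Unicode decimal digit.
sub looks_like_int_unicode { return $_[0] =~ /\A-?\d+\z/  ? 1 : 0 }

# With /a, \d is restricted to ASCII 0-9.
sub looks_like_int_ascii   { return $_[0] =~ /\A-?\d+\z/a ? 1 : 0 }

printf "without /a: odd digit accepted? %d\n", looks_like_int_unicode($odd_digit);
printf "with /a:    odd digit accepted? %d\n", looks_like_int_ascii($odd_digit);
printf "with /a:    '-23' accepted?     %d\n", looks_like_int_ascii('-23');
```

The only difference between the two checks is a single trailing letter, which is rather the point: it is easy to forget.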
Cor Types
As many of you know, Cor is intended to be the new object system for the Perl core. If you don't want to wade through the wiki, you can watch this talk I gave on Cor.
One thing I briefly touched on and didn't get into is typing. So, here's a pointless Python Point class to illuminate this point:
As you can see, at the end, I set x to the string "foo". How can I prevent that when working on a million-line code base? Well, according to Pythonistas, it's "unpythonic" to validate your arguments. Even Perl developers, largely via the Moo/se family of OO, seem to have grudgingly admitted that yeah, asserting your types isn't such a bad thing.
So while Dave Mitchell's been thinking about types in signatures, I've been thinking about them in Cor. Here's the above point class, with almost identical behavior:
Except in the Cor world, calling $point->x("foo") would generate a runtime error, just as it would with Moo/se code. In the above, we have "slots" (instance data) declared with has. The attributes merely provide sugar for common things we need in OO systems. Thus, :isa(Num) provides my run time type checking.
And that brings me to the next problem.
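As an aside on the runtime check described above, here is a rough plain-Perl approximation of what an :isa(Num) slot buys you. This is not Cor syntax and not how Cor would implement it; the _check_num helper and the generated accessors are assumptions made purely for illustration.

```perl
use 5.014;    # package BLOCK syntax
use strict;
use warnings;

package Point {
    use Carp qw(croak);
    use Scalar::Util qw(looks_like_number);

    # Hypothetical stand-in for the check that :isa(Num) would provide.
    sub _check_num {
        my ($name, $value) = @_;
        croak "$name must be a number"
            unless defined $value && looks_like_number($value);
        return $value;
    }

    # Generate a checked reader/writer for each slot.
    for my $slot (qw(x y)) {
        no strict 'refs';
        *{"Point::$slot"} = sub {
            my $self = shift;
            $self->{$slot} = _check_num($slot, shift) if @_;
            return $self->{$slot};
        };
    }

    sub new {
        my ($class, %args) = @_;
        my $self = bless {}, $class;
        $self->$_($args{$_} // 0) for qw(x y);
        return $self;
    }
}

my $point = Point->new('x' => 3, 'y' => 4);
$point->x(10);                                   # fine
my $ok = eval { $point->x("foo"); 1 };           # dies inside the eval
print "rejected: $@" unless $ok;
print "x is still ", $point->x, "\n";
```

Every class that wants this safety has to carry boilerplate like the above (or pull in a module), which is the grunt work a built-in type syntax would remove.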
SYNTAX
Traditionally, we tend to see types defined in front of the variables, such as declaring an integer in C: int c. In Dave's proposal, it's after the variable: $c is Int. In Cor, we have has $c :isa(Int). Of course, there's also this loveliness:
But that syntax has been with us for years and is largely ignored and if we want to attach additional semantics to the type (e.g., coercions), having it after the variable instead of before is probably a good idea.
In short, optional typing in Perl is long overdue, it's planned for signatures, it's planned for Cor, and eventually someone will want to write:
But if we have types, we desperately need to ensure that "there's more than one way to do it" doesn't apply (sorry, Perlers!). Because if signatures use one type syntax, Cor uses another, and regular variables possibly use one or the other (heaven forbid we get a third!), then it's going to be a confusing mess that frankly, I don't want to try to deal with.
And then there's this bit from Dave's proposal:
I quite like that, but I'm unsure how I would fit the constraints into Cor's syntax. That being said, Cor provides additional behavior to class slots via attributes and it might be a touch disappointing to have an exception here (but I'd live with it).
SEMANTICS
Syntax is nice, but the meaning of the syntax is important. For example, I think we can agree that for a type system, an integer shouldn't match ꤆, even if /\d/ does. But what does my $c :isa(Int) = -7/3; produce?
The above C code compiles without a warning and prints -2. Perl, historically, doesn't do the "integer math" stuff and tries to avoid throwing away information:
This is in sharp contrast to other dynamic languages which often get this spectacularly wrong:
So, what does Perl do with my $c :isa(Int) = -7/3;? Should it just throw away the extra data? Should it be an error? Should it be a warning? Should the type be ignored?
And I'm not even going to try to figure out a type hierarchy right now, but an Int is a Num while the reverse isn't true. However, we'll need one well-defined and standardized, along with an extension mechanism.
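To make the -7/3 question above concrete, here is a small sketch of three of the candidate behaviours side by side. The assign_int function and its policy names are invented for illustration; nothing here is proposed syntax or decided semantics.

```perl
use strict;
use warnings;
use Carp qw(croak carp);

# Hypothetical: what an Int-typed assignment might do with a fractional value.
# $policy is one of 'error', 'warn', or 'truncate' (names are assumptions).
sub assign_int {
    my ($value, $policy) = @_;
    return $value if $value == int($value);      # already integral
    croak "$value is not an integer" if $policy eq 'error';
    carp  "$value truncated to an integer" if $policy eq 'warn';
    return int($value);                          # int() truncates toward zero
}

my $raw = -7 / 3;    # plain Perl keeps the fractional part
print "raw:       $raw\n";
print "truncated: ", assign_int($raw, 'truncate'), "\n";

my $ok = eval { assign_int($raw, 'error'); 1 };
print "error policy: ", ($ok ? "accepted" : "rejected"), "\n";
```

The fourth option from the list above, ignoring the type entirely, is simply what Perl does today.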