Support for variables with multiple types

google / blockly

The web-based visual programming editor.

https://developers.google.com/blockly/

Apache License 2.0

12.43k stars 3.71k forks source link

Support for variables with multiple types #1997

Open jollytoad opened 6 years ago

jollytoad commented 6 years ago

Problem statement

Blocks can define multiple types on an output connection, this can be used to indicate the returned type is the union or intersection of two types, which is certainly possible in many of the target languages, it can also be used to indicate that a return type could be coerced into one of the types or to simulate inheritance.

Unfortunately though variables don't currently share this ability, so a variable block can only return a single type.

Expected Behavior

Variable blocks should have the same ability as other blocks to return multiple types.

Proposal

Allow an array of strings to be passed as the type of the variable, and adapt the VariableMap to cope with this in a backwards compatible manner.

This is a feature I need atm, so I'll be hacking the variable model & map as a proof of concept. If this is something that would be generally desirable in Blockly, I'd be happy to produce a full PR.

jollytoad commented 6 years ago

I've rolled an initial implementation of this, as overrides (rather than a fork atm), which can be found in this gist... https://gist.github.com/jollytoad/7a7df5272aa7d767463fcb391a215f86

NOTE: That gist contains an additional extension to allow changing of a variable type - which I also need - but for the sake of this specific enhancement I'd be happy to not included this in the PR if you'd prefer not to support that feature.

RoboErikG commented 6 years ago

I don't think I understand why this needs to go into the variable model. The output type of a block is not connected to the type of the variable field except for the dynamic variable block, and you could get this feature on the dynamic variable block with a map of compatible types and modifying the mutator to set all the compatible types instead of just the variable type on its output.

What's your use case for this?

jollytoad commented 6 years ago

We are generating Java code using our own code generation (implemented in Java), so we are very strictly typed, and we're attempting to enforce this type safety in our block connections.

In the generated code we also allow conversions between certain types, eg. converting a String to a 'Thing' by looking up the Thing by name. So often data can be passed around as multiple types, eg. String or Thing - and the block connection checks support this fantastically.

But sometimes we need to put these into variables too, and we still want the variable to be of String or Thing type - so it make sense to me to record this fact as the most fundamental level, the VariableModel.

I've also have to implement our own blocks and extensions to handle 'lists' of things, so that the type of the iteratee variable represents the type of the list items on it's input - and therefore also need to change the type of variables too, for which i've added an extension method in the gist above.

The changes I've made for this are actually fairly simple, and totally backwards compatible - anyone not interest in this feature could completely ignore it.

Admittedly though, my solution above isn't totally complete yet - still seeing duplicate variables listed in the dropdown - i should hopefully get round to fixing that soon.

It's quite hard to explain what we actually doing, but a couple of my team colleagues will be attending the Blockly summit in September, and would love to demo our work to the Blockly team.

RoboErikG commented 6 years ago

First, I think this will take some more thought on our side before we'd accept changes to Blockly. I'm going to leave this issue open for discussion and to see if there's additional interest.

I'm happy to discuss your specific implementation at the summit, but if you don't mind going into a little detail here it might be nice for others to have an idea how you're using it.

My biggest question is how do you make that work with a generator? It seems you'd have to pick one type at generation, but I'm not sure how you'd reliably pick the correct type without a hierarchy of types.

And while Blockly doesn't use the type at generation in any of the supported languages, other developers do so this won't be backwards compatible as it currently is. Two other options:

1) Extend the setType API to take a type + an optional list of all types and add a new method to return the set of all types. In VariableMap, instead of storing variables multiple times have a map of compatible types for each type and add a method for returning all compatible variables that takes the initial type and returns the list of variables of that type + all variable lists of types that are directly compatible.

2) Create a VariableTypeMap that stores a mapping of type transformations and add a call to it everywhere you need the full list of types.

I think the reason you have double variables is because when the variable dropdown includes multiple types it gets the list of variables for each type and adds them. There are a few places in the code that assume variables only have a single type and appear once in the map.

Mutators might still work for your case, though I can see how it'd get complicated. You'd have to have the mutator for the list iterator recreate the variable as the most specific type that applies to all items in the list. The dynamic variable mutator could also get complicated depending on how much type safety you want. I could see something like:

1) Change the output to be the type of the selected variable and all types it can cast to.

2) When disconnected allow all variable types (or a set of types if using multiple variable blocks).

3) When connected, allow all variable types that match the connection or can cast to that connection type.

jollytoad commented 6 years ago

Our code generation first of all has a mapping from a Blockly type string to a real Java class, and we use a convention that the first type in a list of types is the 'primary' type, and starting there we attempt to resolve each type in the list to a real Java type, the one found wins - which is usually the primary type. This is the type used in the generated code, for a variable for example. When we see a mismatch between types on a connection or cannot determine a type we generate a call to a converter service with which we have registered converters.

On a side note: The xml is parsed using Jackson 2 to Java objects, and code is generated using JavaPoet - we don't do any code gen in JS.

duzc2 commented 6 years ago

it is not a general requirement. blockly are not designed to typed language only. also Java is not in official support.

@jollytoad you may replace the type check function of blockly.

jonnybot0 commented 6 years ago

@duzc2 - I would argue that this is a general requirement. Support for this would in no way require users to target a typed language, but it would improve support for typed languages. More deeply, it provides an effective extension point for developers (like us) who are using block types to validate connections and provide guidance to users.

For our use case, rather than a hierarchy of types, blocks have a flat network of convertible types. So, if a given input requires an object of Type X and the variable is of Type Y, but Type Y can be cast to X (even though it's not hierarchically "under" X), we can allow a variable of Type Y to be used in a Type X input. Otherwise, our generator can throw a type error.

The deeper need is for variable types to match the more general Blockly type model that allows multiple types on a given block. That contributes to the Lego-like aspect of the project.

Even dynamic, loosely typed languages like JavaScript (not to mention its typed kindred like TypeScript or Dart) could benefit from this on some level. For instance, you might be willing to accept JavaScript's behaviour of turning certain types into strings (numbers & arrays are pretty sensible), but not its conversion for arbitrary Objects, or primitives like null or undefined. By expressing all of the types a variable could be, you can use Blockly to generate code that's less likely to fail at runtime.

As for Java not being supported, our use of Java is really a footnote here.

@RoboErikG - I'll be at Blockly Summit tomorrow; it may help just to see what we're doing. I'd hazard that the developers who are using the type information currently would be more pleased than annoyed to let variables have multiple types, though I could be wrong. Most likely it depends on whether they're working around the issue (as we are) or relying on it. That said, it seems to me that @jollytoad's implementation wouldn't affect those relying on it much, as they could go on happily with their single-typed variables without any breakage. If they wanted to start using multi-typed variables, they would have to adjust their generators, sure, but in that case they need this feature (or one very like it) anyway.

duzc2 commented 6 years ago

@jonnybot0 Execuse my poor English. I use Blockly for my C# project. I just hack the blocky by replace

Blockly.Connection.prototype.checkConnection_ = function(target){ ... };

to make the Blockly support C# type . I send the information to C# , and let the C# checks if the types match.

And I added some other features via this way.

if we implement this feature in Blockly , we should think about Class/Interface/Single inheritance/Multiple inheritance , and even more than OOP .then ,list all compatible types in the check_ list.Even much more language feature should be think about. But it's not a responsibility of blockly. Blockly is a block-display front end , with a simple text-based code generator . I made my own generator for my projects . And some of my projects do not use any code generator , I create information by the blocks directly. I think the block front end is the most important part of Blockly , and there are many way to use it ,

if you have good checkConnection implemention for Java , you may open the source .

Sorry for my poor English again ...

jollytoad commented 6 years ago

There could be some confusion caused here with the direction this conversion is heading, this issue really has nothing to do with code generation per se, or how languages model their types (eg. inheritance), it's all about the variable support in the UI, and the inconsistency between the limited single type checking on variable blocks compared to multiple type checking on any other custom block.

I've provided a gist, linked above to our implementation of this - we are avoiding forking Blockly as far as we can and just monkey patching over functions we want to change.

If there is genuine interest in supporting this within Blockly itself, I'd be more than happy to submit a full PR - but my time is constrained, and I don't want to have to do this unless there is a good chance of it being accepted.

rachel-fenichel commented 6 years ago

Quick followup--this hasn't fallen off my radar, and I'm working through the gist and making sure I understand what's going on. Sorry for the delay--things got hectic around the conference, and it took a bit of time to get everything cleaned up again.

rachel-fenichel commented 6 years ago

For our use case, rather than a hierarchy of types, blocks have a flat network of convertible types. So, if a given input requires an object of Type X and the variable is of Type Y, but Type Y can be cast to X (even though it's not hierarchically "under" X), we can allow a variable of Type Y to be used in a Type X input. Otherwise, our generator can throw a type error. The deeper need is for variable types to match the more general Blockly type model that allows multiple types on a given block. That contributes to the Lego-like aspect of the project. Even dynamic, loosely typed languages like JavaScript (not to mention its typed kindred like TypeScript or Dart) could benefit from this on some level. For instance, you might be willing to accept JavaScript's behaviour of turning certain types into strings (numbers & arrays are pretty sensible), but not its conversion for arbitrary Objects, or primitives like null or undefined. By expressing all of the types a variable could be, you can use Blockly to generate code that's less likely to fail at runtime.

Is the primary issue that you need the variables themselves to hold multiple types, or just that the connections on the variable blocks need to be able to have multiple types?

If it's the latter, what you need is to be able to talk about the ~capabilities (I'm being vague with this term) of each variable type (e.g. this one is iterable and stringable, this one can be either a String or a Thing and we'll deal with it later, etc.)

I would do that by changing the variable blocks to do a lookup when setting the types on the output, where there's a map of types to lists of ~capabilities. Then the connection checks are updated to match the list of ~capabilities whenever the variable changes. Would that solve your problems?

I prefer that solution because it leaves the variable model alone, while being easy to implement as a library option or just in your own blocks.

If it's the former, I'm still not understanding something about your use case.

jollytoad commented 6 years ago

Tbh, I'm not sure I see a distinction there.

I'm not sure why type mapping/lookups keeps cropping up - that surely just overcomplicates things?

I'm looking at this purely from an inconsistency in the type checking of the Blockly UI. Input and output connections can have a set of types, and any output can be connected to any input if the sets of types intersect. And the code generator can use this information as it sees fit.

As it stands, if i connect an output to a variable, and then the variable to another input I lose this richness of information and the intersecting types. I'm forced to declare up front that the variable is capable of receiving and producing a single type.

I should be able to connect an output to a variable input and retain all of the types from the output so that it can later be connected to an input and have the same type intersection check as if i'd connected the two blocks together directly without the variable. Although I realise this is type inference, which i've raised separately - but this multi-type support is a fundamental requirement of that.

We also have blocks that have variable fields, that do something and place it into a variable - at the same time declaring the type of that variable - and we want that variable to have the possibility of this type intersection behaviour when later used - as would be possible if we just had an output connection instead and forced users to connect the block directly to another input.

I realise this whole thing may be quite specialised for our case, and appreciate you may not want this in the core - but in general the variable model works for us, and this is actually a fairly minor tweak - and could easily be implemented as a external enhanced given the necessary hook in points to the variable model API. I'd be more than happy to approach it from this direction and suggest additional API improvements to the model if you'd be up for that?

RoboErikG commented 6 years ago

Type mappings/lookups keep cropping up because we can't find a use case that requires it be added to the core APIs for typed variables. Since your use case seems to be unique and solvable with existing APIs we're trying to understand why it would need an API change.

Regarding the difference between connection type checks and variable types, there is nothing explicitly linking a variable field on a block to the output or any input on its block. There are lots of cases of blocks that use a variable dropdown and it has no effect on any of that blocks' connections. It is only the case of the variables_set_dynamic and variables_get_dynamic blocks that we've added custom handling to link the variable type to a connection on the block, and extending that to map to a list of types is not very complicated and requires no API changes. They are also used in very different ways. The type check is identifying all connections that may be made. The type on a variable is identifying the type for the variable.

In addition, in most programming languages multiple types on a variable doesn't make sense either. Variables are generally of one type with inheritance or casting to treat it as another type. Complicating the model representation makes it more difficult for the majority of developers who aren't doing this to work with the API.

Why does creating custom versions of the variables_get_dynamic and variables_set_dynamic blocks not work for your use case?

rachel-fenichel commented 6 years ago

Allowing multiple types on a single variable vs mapping to connections gives the same result in a world where variable types cannot change after creation.

It is not the same if the variable type can change as the user edits the code, because then you don't know what types might show up on that connection at a later point.

@jollytoad You mentioned (here and on the forum) doing type inference. I think that's where we're getting a mismatch between our suggestion and yours. Can you confirm that?

+1 to what Erik said about not having an explicit connection between a variable field and any connections on a block.

jollytoad commented 6 years ago

Thanks for all the feedback on this, I think i'm starting to grasp your points - although I'm working in a totally different area atm, not Blockly related, so my brain isn't fully in gear on this. I'll need to go and revisit this and have a think, when I get chance. Cheers.

RoboErikG commented 6 years ago

No worries. Take your time on getting back into the right space. When you do we'd be interested in a bit more details on the API changes: basically a list of each change and how backwards compatibility is maintained.

Another option we've been discussing based on Mark's answers on the forum would be an API for changing a variable's type.

The change would include:

Add an overridable method for checking if a variable may have its type changed.
- This method should take in the variable model and a list of current uses of the variable.
- This should return true by default.
- If a developer has UI for changing the type this function would create it.
Add a changeVariableType method for changing the type of a variable.
- This should be written similarly to renameVariable
- This should call the above check function instead of defining the logic inside the function.
- After a successful rename it should fire an event for the variable type changing.
Update variables_get_dynamic and variables_set_dynamic to listen for these events and change their output/input type checks appropriately.

And as a related change we've been discussing making the connection check itself overridable so a developer could perform additional logic on the two connections beyond just checking if they have an overlap in their type check list.

Would that fit the flow you're aiming for in your app? Are there additional cases we're not considering?