google / blockly

The web-based visual programming editor.
https://developers.google.com/blockly/
Apache License 2.0
12.41k stars 3.71k forks source link

Invalid/Unavailable options when deserializing from XML for blocks with dropdowns dependent on children #5787

Open jschanker opened 2 years ago

jschanker commented 2 years ago

Describe the bug

When a block has a dropdown field with options dependent on its child block(s), saving to XML and then deserializing may cause (A) the originally selected option to no longer be selected with an incorrect warning about the option being unavailable. (B) Alternatively, it might allow modification of the XML so that an invalid option is selected without warning.

To Reproduce

Steps to reproduce the behavior:

  1. Go to the Blockly Playground at http://google.github.io/blockly/tests/playground.html.

2A. (For unavailable option) Open the Developer console and enter the following block definition:

Blockly.Blocks['test_dyn_field_root_block_type'] = {
  init: function() {
    this.appendDummyInput()
        .appendField(new Blockly.FieldDropdown(
            () => this.getDescendants().map((_, i) => 
            [`Select Descendant # ${i}`, i.toString()]),
            i => this.getDescendants(true)[i].select()),
            "SELECT");
    this.appendStatementInput("DESCENDANTS");
    this.setColour(230);
    this.setTooltip("Attach/remove blocks to update the dropdown.");
  }
};

3A. (For unavailable option) Load the following XML:

<xml xmlns="https://developers.google.com/blockly/xml">
  <block type="test_dyn_field_root_block_type" x="113" y="113">
    <field name="SELECT">1</field>
    <statement name="DESCENDANTS">
      <block type="text_print">
        <value name="TEXT">
          <block type="text">
            <field name="TEXT">A</field>
          </block>
        </value>
        <next>
          <block type="text_print">
            <value name="TEXT">
              <block type="text">
                <field name="TEXT">B</field>
              </block>
            </value>
          </block>
        </next>
      </block>
    </statement>
  </block>
</xml>

4A. (For unavailable option) Note that Select Descendant 1 has not been selected and the warning: Cannot set the dropdown's value to an unavailable option. Block type: test_dyn_field_root_block_type, Field name: SELECT, Value: 1.

2B. (For invalid selected option) Open the Developer console and enter the following block definition:

Blockly.Blocks['test_block_with_at_most_10_descendants'] = {
  init: function() {
    this.appendDummyInput()
        .appendField(new Blockly.FieldDropdown(
            [...Array(10+1).keys()].map(i => [`Add ${i} more descendants`, i.toString()]),
            (i) => (parseInt(i) <= 10 - this.getDescendants().length + 1) ? i : '0'),
            "ADD");
    this.appendStatementInput("DESCENDANTS")
        .setCheck('child');
    this.setColour(230);
    this.setTooltip("Attach/remove blocks to update the dropdown.");
    this.onchange = e => {
      if (e instanceof Blockly.Events.BlockMove) {
        // validate new value
        this.setFieldValue(this.getFieldValue('ADD'), 'ADD');
      }
    }
  }
};

Blockly.Blocks['child'] = {
  init: function() {
    this.appendDummyInput()
        .appendField('Child');
    this.setPreviousStatement(true, 'child');
    this.setNextStatement(true, 'child');
    this.setColour(230);
    this.setTooltip("Attach/remove blocks to update the dropdown.");
  }
};

3B. (For invalid selected option) Load the following XML:

<xml xmlns="https://developers.google.com/blockly/xml">
  <block type="test_block_with_at_most_10_descendants" x="0" y="0">
    <field name="ADD">5</field>
    <statement name="DESCENDANTS">
      <block type="child">
        <next>
          <block type="child">
            <next>
              <block type="child">
                <next>
                  <block type="child">
                    <next>
                      <block type="child">
                        <next>
                          <block type="child">
                            <next>
                              <block type="child"></block>
                            </next>
                          </block>
                        </next>
                      </block>
                    </next>
                  </block>
                </next>
              </block>
            </next>
          </block>
        </next>
      </block>
    </statement>
  </block>
</xml>

4B. (For invalid selected option): Open the Developer console and enter the following (or just move the block):

workspace.getAllBlocks().find(block=>block.type === 'test_block_with_at_most_10_descendants')
    .setFieldValue('5', 'ADD');

Note the validator disallows this and resets the value to 0.

Expected behavior

The selected field value saved during serialization should be selected upon deserialization, and the XML should not be able to be modified to produce invalid field values upon deserialization.

Screenshots

Unavailable option: screenshot (5)

Invalid option: screenshot (19)

Stack Traces (Unavailable option)

field_dropdown.js:534 Cannot set the dropdown's value to an unavailable option. Block type: test_dyn_field_root_block_type, Field name: SELECT, Value: 1
FieldDropdown.doClassValidation_    @   field_dropdown.js:534
Field.setValue  @   field.js:932
FieldDropdown.fromXml   @   field_dropdown.js:175
domToField  @   xml.js:1015
applyFieldTagNodes  @   xml.js:809
domToBlockHeadless  @   xml.js:951
domToBlock  @   xml.js:600
domToWorkspace  @   xml.js:459
load    @   playground.html:182
onclick @   playground.html:471

Additional context

The reason for the bug is documented by Beka here: https://github.com/google/blockly/issues/3638#issuecomment-986043572:

This change made it so that blocks are connected to their parents (but not children) before fields are deserialized. So now dynamic dropdowns can properly access the parent during deserialization.

Once I read about this fix, I of course had to try to break it again by flipping the dependencies. :P

BeksOmega commented 2 years ago

Hiya Jason! Thanks for the issue, but I'm not sure that this one needs a fix. The recommendation documented with the new JSON serializer, and the docs that will be published on the devsite after the release is to store info about connected child blocks in a mutator. Because the mutator is always deserialized before fields, this elliminates the problem of depending on child blocks.

Not an ideal solution because it does require storing duplicate data. But wrt deserialization order, it's hard to have an ideal solution.

Thank you again for the issue! It's always great to have people filing stuff, especially when it's a problem you've managed to suss out through cleverness hehe.

jschanker commented 2 years ago

Hi Beka, Happy to look for these issues. Always fun to try to break things!

I remember having used mutators when I answered my first question on the Blockly forum about dependent dropdowns on the same block. Sufficed to say, I certainly thought it was unintuitive that you could fix the problem of unavailable options by merely calling getOptions on the dropdown to regenerate the options after setting the field value of the dropdown that it depended on. But at least with appropriate documentation, a developer could refer to it when something is not working.

However, what if everything works fine under normal circumstances so that the developer doesn't realize that the XML can be modified so that invalid values can be selected upon deserialization? This method of deserialization seems to allow for this. The example above is a little contrived in that the listener for validating field values only runs on move (and field update) events, but it illustrates a case where the field value will be valid under normal use cases. This example could probably be modified to more natural use cases with some more thought. Given your comment and @rachel-fenichel's comment, I think this (modified) issue is still a problem?

BeksOmega commented 2 years ago

I'm sorry it's been a few months since I've been working on this stuff hehe. It might indeed still be a problem, I'm not sure.

However, what if everything works fine under normal circumstances so that the developer doesn't realize that the XML can be modified so that invalid values can be selected upon deserialization?

I don't think that invalid values can be selected upon deserialization (if we take valid to mean values which are returned by getOptions). Even if the XML is modified directly, it should still conform to this constraint right? I think this is where I'm getting a bit lost :/

jschanker commented 2 years ago

@BeksOmega Sorry, my example of invalidity was unnecessarily confusing. In the invalid example, I essentially used a static list of 11 options (Add x more descendants for x from 0 to 10) so getOptions would always return this list. However, if you tried to select an option where the sum of the number of child blocks and x were greater than 10, it would just reset it to 0. So, in the case where there were 7 attached child blocks, adding 5 would still be an option, but adding 0, 1, and 2 would be the only valid ones.

The point is to use a value that's simultaneously valid for no attached statement blocks so it passes validation when the dropdown is set but is an available but invalid one for when blocks are attached.

BeksOmega commented 2 years ago

The point is to use a value that's simultaneously valid for no attached statement blocks so it passes validation when the dropdown is set but is an available but invalid one for when blocks are attached.

Ah ok I think I'm seeing what you're saying. Let me try to pass the ball back to you: So when child blocks are added to the block, it can make the field value invalid. This can happen when deserializing from XML, because the field value is valid for the no-child-blocks state, but becomes invalid when the child blocks are later attached during deserialization.

So basically, this represents a case where XML deserializes validly, but does not represent a valid state.

Is that what you're saying? Or am I still missing the mark?

I'm not sure how this problem could be fixed in practice... Besides suggesting that the developer add some extra validation to XML if they allow users to write it explicitly. But on the other hand, I'm not sure how many projects actually allow users to directly manipulate the XML.

Do you have any thoughts on fixes?

jschanker commented 2 years ago

Yes, exactly, I think a decent analogy would be a worker who gets an ID card that grants them access to a company office, but then they leave the company and somehow bypass the checks that would normally invalidate their ID after they do. If it's just an ordinary company office with lots of people who already have access, this probably won't matter too much, but it still represents a theoretical security vulnerability. I doubt anyone is simultaneously using Blockly to do something with security considerations and has a setup like this where the validator can be bypassed in a meaningful way, so this might not be too much of an issue in practice.

Despite the title of the related issue I filed, what I'm trying to point out is not so much about the modification of the XML (that just makes it quicker) as it is about the ability to load block-based code from XML that is invalid in its final state when all blocks have been connected and field values set. I think allowing the user to load a block-based program from XML is a somewhat common use case? Or at least, they could open the Developer console and do this? As for relying on developers to take the proper precautions, I think about all the databases still vulnerable to SQL injection despite the extensive documentation about how to protect against these types of attacks. Again, these types of concerns are probably not applicable for most Blockly programs, but I know you had mentioned in your comments about certain approaches to validation being dangerous, so I wasn't exactly sure what types of cases you had in mind?

As for solutions, I unfortunately don't have great ones at the moment. It's definitely easier to break something than it is to fix it :). I have some thoughts, but they're mostly impractical/incomplete at this point. One possibility would be to try to simulate the assembly of blocks firing the appropriate events in an attempt to see if the desired state could be reached through normal use. But simulating the assembly of blocks/field values one by one would probably be too slow and based on dependencies, order of assembly might also matter making this approach even more impractical. Taking a cue from React, another would be to require developers to use validators we could easily reason about or use something like dangerouslySetValidator so at least they would be actively circumventing a security measure instead of passively doing it. Of course, requiring all developers to change the way they set validators when updating their Blockly version to correct for a problem that probably doesn't pertain to them is probably another bad idea.

I'm going to think about this more when I get a chunk of free time, and then I'll write back here if I have some additional suggestions.

BeksOmega commented 2 years ago

I doubt anyone is simultaneously using Blockly to do something with security considerations and has a setup like this where the validator can be bypassed in a meaningful way, so this might not be too much of an issue in practice.

Yeah I'm not sure that actual security concerns are much of an issue, but there's still the principle of the thing. Like, Blockly should not allow malformed code to be loaded. Full stop. (imo) Which is why I really appreciate you bringing up this issue =)

I think allowing the user to load a block-based program from XML is a somewhat common use case?

I think most projects keep the save format hiden from end-users, they just know the project gets saved, not how. However, there is still th epossibility of loading from the dev console (unless devs explicitly take precautions to prevent this).

One possibility would be to try to simulate the assembly of blocks firing the appropriate events in an attempt to see if the desired state could be reached through normal use.

This makes me wonder if maybe the problem is actually with using events to determine which connects between blocks are valid. For example, if a developer is using a connection checker to look for the field to see if the number of children is correct, then this is a non issue. It seems like because (in this hypothetical example) the dev is using events, which are not fired during deserialization, that we get into problems.

Maybe one solution would be making custom connection checking easier to implement? Eg some validator-like system, rather than having to implement a whole custom class.

jschanker commented 2 years ago

@BeksOmega I think your proposed solution about connection checkers will work well in almost all situations, but we can still find some contrived possibilities to break validation again. Modifying my example from https://github.com/google/blockly/issues/5793, suppose now that the validator for the field will only enforce limits on the number of blocks if there's another "strict" block that's moved into the workspace. Then in the case this block is added to the workspace after all the connections to child blocks are made (which will occur when the strict block's XML appears at the end), we can still have the problem of an invalid setup. (See below example with the XML and corresponding code with a hacked together connection checker to add to the console.)

Ultimately, I think the underlying problem that causes issues such as these is that there are no restrictions on the types of functions that can be used for validators, making them impossible to reason about. To me the ideal situation would be requiring field validators to be pure functions, but this is impossible in situations such as these where the field validator needs more than just the new value to determine whether that value is valid.

I think a plugin that facilitates some sort of unified hierarchical validation system with pure validators could be useful. So, as a high level overveiw of a possibility, maybe when an event that requires validation occurs such as changing a field value, connecting a block, etc., the workspace state is frozen and the workspace state is then validated by calling the appropriate block validators, passing the state it needs to complete validation. The block validators would in turn call their field/input/connection validators, etc. again passing the complete set of state required for validation. If all validation is successful, the change in state of the workspace is permitted. Otherwise, the workspace is returned to its previous state. Assuming the workspace is put into an initially valid state as defined by all validators returning non-null values, the workspace is provably in a valid state after a validation pass because validators only depend on workspace state that cannot be changed unless they all agree that this new workspace state is valid.

So, of course this works nicely in theory, but whether this is practical in terms of the amount of time required for validation is another story. And there would obviously be a number of implementation details to consider such as how to handle situations where a value is changed to be valid (e.g., 7a becoming 7 for a number input). What do you think?

And sorry for the delayed response. It was difficult to think of another breaking case and an outline for a potential solution in the midst of finals and grading.

XML:

<xml xmlns="https://developers.google.com/blockly/xml">
  <block type="test_block_with_at_most_num_child_blocks" x="50" y="50">
    <field name="ADD">5</field>
    <statement name="DESCENDANTS">
      <block type="child">
        <next>
          <block type="child">
            <next>
              <block type="child">
                <next>
                  <block type="child">
                    <next>
                      <block type="child">
                        <next>
                          <block type="child">
                            <next>
                              <block type="child"></block>
                            </next>
                          </block>
                        </next>
                      </block>
                    </next>
                  </block>
                </next>
              </block>
            </next>
          </block>
        </next>
      </block>
    </statement>
  </block>
  <block type = "strict" x = "0" y = "0"></block>
</xml>

Console code:

Blockly.Blocks['test_block_with_at_most_num_child_blocks'] = {
  init: function() {
    this.appendDummyInput()
        .appendField('Add up to ')
        .appendField(new Blockly.FieldNumber(0, 0, undefined, undefined, 
            (num) => {
              if (workspace.getBlocksByType('strict').length === 0) return;
              const childBlocks = this.getDescendants().slice(1);
              const deleteNum = childBlocks.length - num;
              for (let i = 0; i < deleteNum; i++) {
                childBlocks[i].dispose(true, true);
              }
            }), 'ADD')
        .appendField(' child blocks.')
    this.appendStatementInput('DESCENDANTS')
        .setCheck('child');
    this.setColour(230);
    this.setTooltip("Attach/remove blocks to update the dropdown.");
    this.onchange = e => {
      const block = e.blockId && workspace.getBlockById(e.blockId);
      if (block && (block.type === 'strict' && e instanceof Blockly.Events.BlockMove)) {
        // validate new value if a strict block was moved into the workspace
        this.setFieldValue(this.getFieldValue('ADD'), 'ADD');
      }
    }
  }
};

Blockly.Blocks['child'] = {
  init: function() {
    this.appendDummyInput()
        .appendField('Child');
    this.setPreviousStatement(true, 'child');
    this.setNextStatement(true, 'child');
    this.setColour(230);
    this.setTooltip("Attach/remove blocks to update the dropdown.");
  }
};

Blockly.Blocks['strict'] = {
  init: function() {
    this.appendDummyInput()
        .appendField('Enforce Descendant Limits');
    this.setColour(230);
    this.setTooltip("Add to enforce descendant limits.");
  }
};

Blockly.ConnectionChecker.prototype.canConnect = function(
    a, b, isDragging, opt_distance) {
  if (workspace.getBlocksByType('strict').length > 0) {
    const previousConnection =
        [a, b].find(x => x.type === Blockly.ConnectionType.PREVIOUS_STATEMENT);
    const nextConnection =
        [a, b].find(x => x.type === Blockly.ConnectionType.NEXT_STATEMENT);

    if (previousConnection && nextConnection) {
      rootBlock = nextConnection.getSourceBlock().getRootBlock();
      childBlock = previousConnection.getSourceBlock();
      if (rootBlock.type === 'test_block_with_at_most_num_child_blocks') {
        const maxDescendants = rootBlock.getFieldValue('ADD') + 1;
        if (rootBlock.getDescendants().length +
            childBlock.getDescendants().length > maxDescendants) {
          return false;
        }
      }
    }
  }

  return this.canConnectWithReason(a, b, isDragging, opt_distance) ===
      Blockly.Connection.CAN_CONNECT;
};
BeksOmega commented 2 years ago

Heya thanks for the response :D Absolutely don't feel bad about finals, students always come first! I hope they did well, and I hope you are having a good winter break! :christmas_tree: :menorah: :star2:

Your example about connections checks depending on other disconnected blocks is really clever. We should get you to write tests for us! You're so good at edge cases dude lol Also, thank you for the sample code =)

I also really like your thoughts about pure validators, I think you're right that that is the general problem. If we have validators which depend on arbitrary state, we can't reason about them/deserialization order.

I think it would be fun to try to create a pure version of Blockly, but like you said I think that would be a lot of work. Not only would you have to create all of the "piping", but you'd also have to decide what counts as "workspace state" and what doesn't. Plus you'd need to decide how you want to handle validators depending on state which is external to Blockly.

Obviously I'm not in charge, but given that it doesn't seem like something the core team is likely to pursue :/ Personally though, I'd still love to see a version of this working. Pure code just makes me happy hehe.

Best wishes and happy holidays! :fireworks:

jschanker commented 2 years ago

Thanks, Beka and happy holidays! I filed a feature request in Blockly Samples where I refined this idea. If it seems viable, maybe I can have some of my students work on it next semester. Thanks again; being able to get feedback from you and the entire awesome Blockly team is always very helpful to me!