Automattic / vip-block-data-api

WordPress plugin that provides an API to retrieve Gutenberg content as structured JSON.
http://wpvip.com
GNU General Public License v3.0
97 stars 7 forks source link

Add recursive graphql format in blocksDataV2 #72

Closed alecgeatches closed 1 month ago

alecgeatches commented 1 month ago

Description

Recently, data type representations were changed in https://github.com/Automattic/vip-block-data-api/pull/66 which modified some odd parsing results when consuming GraphQL:

  • strval(false) results in ""
  • strval(true) results in "1"
  • strval(5) results in "5"

This behavior has been fixed in the development branch (planned-release/1.3.0), but it causes a breaking change to GraphQL attribute data. If other customers rely on the old behavior (e.g. false == ""), we don't want to unexpectedly change data reprentation. Because of this, we decided to split these changes into a new blocksDataV2 GraphQL variable to avoid breaking any dependencies on the previous behavior.

Recursive block structure

This PR also changes the structure of block data, preferring a flattened, recursive structure by default. To show the difference, here was a prior query to get all blocks (and their inner blocks) with full attributes:

query OldQuery {
  post(id: 1, idType: DATABASE_ID) {
    blocksData {
      blocks {
        attributes {
          name
          value
          isValueJsonEncoded
        }
        id
        name
        innerBlocks {
          attributes {
            name
            value
            isValueJsonEncoded
          }
          id
          name
          parentId
        }
      }
    }
  }
}

There's some repetition - innerBlocks is basically the same query as blocks, but it must be specified separately with a parentId. attributes' properties must also be repeated in both sections. Here is the new query:

query NewQuery {
  post(id: 1, idType: DATABASE_ID) {
    blocksDataV2 {
      blocks {
        name
        id
        parentId
        attributes {
          name
          value
          isValueJsonEncoded
        }
      }
    }
  }
}

This query will return every block, including inner blocks, as a flattened structure. Each block has a queryable parentId and both root and inner blocks are returned in the same structure. Root blocks will have a null parentId, and non-root blocks will have a specified parentId. The flatten option (default true) was also removed, and every blocksDataV2 query will return flattened data.

For an example of the output in the new format, see Example: Simple nested blocks: core/list and core/quote in the updated README or the block reconstruction example in this branch.

Block reconstruction

Another benefit of the new recursive structure is that block reconstruction code has been simplified. Old block reconstruction code was more complex as it needed to handle outer and inner block processing separately due to their different data structures. The new reconstruction code example is shorter and simpler.

Other changes

The changes that were made in https://github.com/Automattic/vip-block-data-api/pull/66 to blocksData and related tests have been moved to blocksDataV2 and new test files instead. The blocksData (v1) property has been modified to include a deprecated notice in the description, but otherwise will remain unchanged for any plugin users that rely on the prior structure or data representation.

Steps to Test

Automated tests:

  1. Check out PR.
  2. Run wp-env start
  3. Run composer install
  4. Run composer run test

Manual tests:

  1. Install this PR's branch (add/recursive-graphql-format) on a WordPress site.

  2. Also ensure the WPGraphQL plugin is installed on the same site.

  3. Create a post with nested data, e.g.

    <!-- wp:columns -->
    <div class="wp-block-columns">
        <!-- wp:column -->
        <div class="wp-block-column">
            <!-- wp:paragraph -->
            <p>Left column</p>
            <!-- /wp:paragraph -->
        </div>
        <!-- /wp:column -->
    
        <!-- wp:column -->
        <div class="wp-block-column">
            <!-- wp:paragraph -->
            <p>Right column</p>
            <!-- /wp:paragraph -->
        </div>
        <!-- /wp:column -->
    </div>
    <!-- /wp:columns -->
  4. In the WordPress admin, go to GraphQL -> GraphQL IDE.

  5. Run this query, replacing <post-id> with the relevant post ID from step 4:

    query PostQuery {
      post(id: <post-id>, idType: DATABASE_ID) {
        blocksDataV2 {
          blocks {
            name
            id
            parentId
            attributes {
              name
              value
            }
          }
        }
      }
    }
  6. View the result. The data should contain each block in a flattened list, with parentIds corresponding the nested structure of the post:

    {
      "data": {
        "post": {
          "blocksDataV2": {
            "blocks": [
              {
                "name": "core/columns",
                "id": "QmxvY2tEYXRhVjI6NDY6MQ==",
                "parentId": null,
                "attributes": [
                  {
                    "name": "isStackedOnMobile",
                    "value": "true"
                  }
                ]
              },
              {
                "name": "core/column",
                "id": "QmxvY2tEYXRhVjI6NDY6Mg==",
                "parentId": "QmxvY2tEYXRhVjI6NDY6MQ==",
                "attributes": null
              },
              {
                "name": "core/paragraph",
                "id": "QmxvY2tEYXRhVjI6NDY6Mw==",
                "parentId": "QmxvY2tEYXRhVjI6NDY6Mg==",
                "attributes": [
                  {
                    "name": "content",
                    "value": "Left column"
                  },
                  {
                    "name": "dropCap",
                    "value": "false"
                  }
                ]
              },
              {
                "name": "core/column",
                "id": "QmxvY2tEYXRhVjI6NDY6NA==",
                "parentId": "QmxvY2tEYXRhVjI6NDY6MQ==",
                "attributes": null
              },
              {
                "name": "core/paragraph",
                "id": "QmxvY2tEYXRhVjI6NDY6NQ==",
                "parentId": "QmxvY2tEYXRhVjI6NDY6NA==",
                "attributes": [
                  {
                    "name": "content",
                    "value": "Right column"
                  },
                  {
                    "name": "dropCap",
                    "value": "false"
                  }
                ]
              }
            ]
          }
        }
      }
    }
alecgeatches commented 1 month ago

@Zamfi99 Tagging you here as this will affect the changes you made in #66 and likely want to use. The new variable behavior has been moved to blockDataV2, and the block structure has been modified.

See the issue description for the "why" on the modification, but in general it should make the block format shorter, easier to query, and reconstruct. Hopefully this will be easy to integrate into your existing pipeline for consuming post data in GraphQL, but if not the blocksData variable will remain in place. Thank you!

Zamfi99 commented 1 month ago

@alecgeatches thank you. One proposal: If we are considering introducing a new mechanism for querying the blocks, would it be reasonable to eliminate the attribute selection? It seems unlikely that there would be a scenario where retrieving only the names of the attributes is necessary. Consequently, we could also remove the isValueJsonEncoded property.

alecgeatches commented 1 month ago

If we are considering introducing a new mechanism for querying the blocks, would it be reasonable to eliminate the attribute selection? It seems unlikely that there would be a scenario where retrieving only the names of the attributes is necessary. Consequently, we could also remove the isValueJsonEncoded property.

While I agree that specifically requesting attribute data (name, value, isValueJsonEncoded) is pretty much always going to be required for attributes to be useful, I think this generally follows GraphQL patterns for querying properties. The only alternative I can think of is JSON-encoding the name/attribute values into one string, but that isn't an obvious improvement to me.

What are you suggesting as an alternative? Thank you!

Zamfi99 commented 1 month ago

Although it adheres to GraphQL patterns for querying properties, I believe implementing it too granularly is unnecessary. As you mentioned, all properties are required for the attributes to be useful. Therefore, I propose returning the attributes as an object, where the keys represent the attribute names and the values correspond to the attribute values.

Request

query PostQuery {
  post(id: <post-id>, idType: DATABASE_ID) {
    blocksDataV2 {
      blocks {
        name
        id
        parentId
        attributes
      }
    }
  }
}

Response

{
  "data": {
    "post": {
      "blocksDataV2": {
        "blocks": [
          {
            "name": "core/columns",
            "id": "QmxvY2tEYXRhVjI6NDY6MQ==",
            "parentId": null,
            "attributes": {
               "isStackedOnMobile": true
            }
          },
          {
            "name": "core/column",
            "id": "QmxvY2tEYXRhVjI6NDY6Mg==",
            "parentId": "QmxvY2tEYXRhVjI6NDY6MQ==",
            "attributes": {}
          },
          {
            "name": "core/paragraph",
            "id": "QmxvY2tEYXRhVjI6NDY6Mw==",
            "parentId": "QmxvY2tEYXRhVjI6NDY6Mg==",
            "attributes": {
                "content": "Left column",
                "dropCap": false
            }
          },
          {
            "name": "core/column",
            "id": "QmxvY2tEYXRhVjI6NDY6NA==",
            "parentId": "QmxvY2tEYXRhVjI6NDY6MQ==",
            "attributes": {}
          },
          {
            "name": "core/paragraph",
            "id": "QmxvY2tEYXRhVjI6NDY6NQ==",
            "parentId": "QmxvY2tEYXRhVjI6NDY6NA==",
            "attributes": {
                "content": "Right column",
                "dropCap": false
            }
          }
        ]
      }
    }
  }
}
alecgeatches commented 1 month ago

Thank you for the detailed example! We considered this originally, but we had the issue that to display a know set of values like

"attributes": {
  "isStackedOnMobile": true
}

we also need to generate a type for that attribute object, e.g. type ColumnsBlockAttributes { isStackedOnMobile: Boolean! } and also represent this in the GraphQL schema. Similarly, we'd need to do the same for all other block attribute types.

This is how WPGraphQL-Gutenberg does things. It's very powerful, but querying has to be pretty specific per-block:

blocks {
  # fields from the interface
  clientId
  isDynamic
  ... on CoreParagraphBlock {
    attributes {
      ... on CoreParagraphBlockAttributes {
        content
      }
    }
  }
}

CoreParagraphBlock and CoreParagraphBlockAttributes are required types for GraphQL to match the content property from the schema. As far as I know GraphQL doesn't allow you to specify a flexible "object" type in a schema without predefining those attributes, and those attributes vary by block.

This is how we made the original decision to use name/value pairs. It can be a clumsy-looking syntax, but we have a single query that works for all block attributes. It takes an extra step to reconstitute those attributes into a key-value object, but removes a lot of overhead during querying.

alecgeatches commented 1 month ago

Merging this in! Per our discussion above, if we do find a better way to expose attributes we can also add a new property later like attributesObject. For now, I'm going to merge and proceed with the 1.3.0 release. Thank you for the discussion!