gatsbyjs / gatsby

The best React-based framework with performance, scalability and security built in.
https://www.gatsbyjs.com
MIT License

[Discussion] Feature: GraphQL schema snapshots for all data sources to solve undefined/empty data issues #3344

Closed jsanchez034 closed 5 years ago

jsanchez034 commented 6 years ago

Description

Currently there are issues with GraphQL schemas produced from data sources when, at the moment gatsby develop or gatsby build is executed, the data shape is incomplete: parts are empty, or have a different type than they would have if the source data were filled out. Below are a few example issues.

It would be great if, as the data shape evolves on the data source side, you could create test pages or pieces of content that are fully filled out, meaning no empty fields. Then on the Gatsby side you could run a CLI command called something like gatsby snapshot-schemas, which would fetch the current data sources, run the source data through their regular plugin data normalization paths, run the data through the existing GraphQL schema inference code, and finally save the generated schemas to a folder in /src called schemas.

On subsequent builds Gatsby could skip inferring the GraphQL schemas when it sees schemas defined in /src/schemas. These schema snapshots could then be committed into a site's repo, so that data shape changes require new snapshots instead of just data source changes. Schema snapshots open up many possibilities, such as validating incoming data shape changes. Schema snapshot diffs could also be shown in the Gatsby CLI when gatsby snapshot-schemas is run again after the initial schemas have been saved.

I would love some feedback on the idea itself from the contributors of the various source plugins. If the idea passes the smell test, I would like some feedback on what format the snapshots should be saved to. Maybe the GraphQL schema language using something like gestalt-graphql would be nice.
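
The diffing step of the proposal could look roughly like the sketch below. This is not Gatsby code: diffSchemaSnapshots is a hypothetical helper, and it assumes snapshots are saved as GraphQL schema language text with one field per line.

```javascript
// Hypothetical helper, not part of Gatsby: compares two schema snapshots
// saved as GraphQL schema language text and reports added/removed lines.
function diffSchemaSnapshots(previousSdl, currentSdl) {
  const toLines = (sdl) =>
    sdl
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);

  const previous = new Set(toLines(previousSdl));
  const current = new Set(toLines(currentSdl));

  return {
    added: [...current].filter((line) => !previous.has(line)),
    removed: [...previous].filter((line) => !current.has(line)),
  };
}

// Example snapshots (made up for illustration)
const oldSnapshot = `
type ContentfulPage {
  title: String
}`;
const newSnapshot = `
type ContentfulPage {
  title: String
  slug: String
}`;

console.log(diffSchemaSnapshots(oldSnapshot, newSnapshot));
```

A line-based comparison like this ignores which type a field belongs to, so a real implementation would more likely parse both snapshots and compare them type by type.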

shevchenkonik commented 5 years ago

@moreguppy I have a similar solution. You create one fully filled-out page titled "Test" and hide it by filtering it out of your queries. For example:

allContentfulPageExample( filter: { title: { ne: "Test" } } )
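
For context, a filter like this would typically sit in a page query or in gatsby-node.js. A minimal sketch, assuming a ContentfulPageExample content type with title and slug fields; the template path is hypothetical:

```javascript
// gatsby-node.js (sketch)
const path = require("path");

// Title of the fully filled-out dummy entry that exists only to help inference
const DUMMY_TITLE = "Test";

const createPages = async ({ graphql, actions }) => {
  const { createPage } = actions;

  // Query every entry except the dummy one
  const result = await graphql(`
    {
      allContentfulPageExample(filter: { title: { ne: "${DUMMY_TITLE}" } }) {
        edges {
          node {
            slug
          }
        }
      }
    }
  `);
  if (result.errors) {
    throw result.errors;
  }

  result.data.allContentfulPageExample.edges.forEach(({ node }) => {
    createPage({
      path: `/${node.slug}/`,
      // Hypothetical template location
      component: path.resolve("./src/templates/page.js"),
      context: { slug: node.slug },
    });
  });
};

exports.createPages = createPages;
```

The dummy entry still takes part in schema inference because it is sourced like any other entry; it is only excluded from the pages that get built.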

stoltzrobin commented 5 years ago

@moreguppy @loeildes Yea, that's exactly how we also solved it in the meantime. But it makes it harder in the long run, every time you add a new content model/new field you need to create/update all "dummy" entries created.

sami616 commented 5 years ago

Creating dummy entries has worked for me as a temporary workaround. It's not ideal for the reasons mentioned above. Is this likely to be looked at any time soon?

bgnz968 commented 5 years ago

still no fix in sight? we have a launch in 21 days

pieh commented 5 years ago

@bgnz968 It's very unlikely that a proper fix (or rather feature) will land in 21 days. I don't have any dates here. There are a lot of problems that need to be solved here - some related to Gatsby's built-in inferred schema, some to 3rd party stitched schemas ( https://github.com/gatsbyjs/rfcs/pull/11 ).

metamas commented 5 years ago

Figure I'll throw a log on the fire too... I also have to do the "dummy entry" work-around to avoid build breakage on my handful of projects that use gatsby-source-prismic. Like @stoltzrobin voiced: it's a good enough temporary fix, but it's hacky.

Schema snapshots feel like too many steps to get something that seems so basic and essential (i.e. optional data fields). But it sounds like that, or explicitly writing the expected schema, is the only way to allow optional data fields that are more complex than simple strings or numbers.

jademh commented 5 years ago

I'm also using dummy entries workaround to handle this but agree that it's v hacky. Would love to see a fix!

Jerbach commented 5 years ago

Also using @loeildes solution. Just adding my voice here hoping there will be a more optimal solution soon.

Khaledgarbaya commented 5 years ago

Hey folks, I haven't tested this, but it might be a potential clean workaround that does not require creating dummy content.

https://medium.com/@Zepro/contentful-reference-fields-with-gatsby-js-graphql-9f14ed90bdf9

sami616 commented 5 years ago

@Khaledgarbaya I've not had much time to test your solution. However, at a glance I can only seem to get this to work by spreading Node like this?

... on Node {
    ... on FieldType {
     # ...
   }
}

lkol commented 5 years ago

Yeah, the workaround only works for "polymorphic" references, but not for normal optional fields

hanoii commented 5 years ago

Might be late to the party, but I also stumbled upon this with the Drupal source plugin, and not only due to missing fields but also due to missing entities altogether, which makes fragments fail to work.

I believe @pieh's workaround is going to be enough, but I want to mention something on my mind.

I see the approach of https://www.drupal.org/project/schemata, which uses http://json-schema.org/, as useful here. Maybe we can leverage this and have an optional way of providing a JSON Schema instead of inferring it. How this JSON Schema is exposed would then be partially dependent on the plugin.
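
As a rough illustration of the JSON Schema idea, a source plugin could translate a schema into a GraphQL type definition instead of inferring it from data. The sketch below is hypothetical: jsonSchemaToTypeDef is not an existing Gatsby or Drupal API, the node__article type name and the example schema are assumptions, and only flat objects with scalar properties are handled.

```javascript
// Hypothetical helper: turns a flat JSON Schema object into a GraphQL type
// definition string. Only scalar property types are handled in this sketch.
const SCALARS = {
  string: "String",
  integer: "Int",
  number: "Float",
  boolean: "Boolean",
};

function jsonSchemaToTypeDef(typeName, schema) {
  const required = new Set(schema.required || []);
  const fields = Object.entries(schema.properties || {}).map(([name, prop]) => {
    const scalar = SCALARS[prop.type];
    if (!scalar) {
      throw new Error(`Unsupported JSON Schema type for "${name}": ${prop.type}`);
    }
    // Required properties become non-null fields; everything else is optional
    return `  ${name}: ${scalar}${required.has(name) ? "!" : ""}`;
  });
  return `type ${typeName} implements Node {\n${fields.join("\n")}\n}`;
}

// Example schema (made up for illustration)
const articleSchema = {
  type: "object",
  required: ["title"],
  properties: {
    title: { type: "string" },
    summary: { type: "string" },
    weight: { type: "integer" },
  },
};

console.log(jsonSchemaToTypeDef("node__article", articleSchema));
```

Because optional properties map to nullable fields, queries for summary or weight would stay valid even when no entity has those fields filled in.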

sami616 commented 5 years ago

It would be awesome to have a fix for this. Using placeholder content does work, but it feels hacky and makes me nervous, especially when you have nested references.

rexxars commented 5 years ago

I have created a similar but slightly differently scoped issue in #10856 - how a source plugin can fix the issue of missing fields and node types given that a schema is available to inform Gatsby of what is missing.

pieh commented 5 years ago

@rexxars There is another ticket related to this - https://github.com/gatsbyjs/gatsby/issues/4261 . @stefanprobst did work on a prototype that allows providing types - please check this comment in particular: https://github.com/gatsbyjs/gatsby/issues/4261#issuecomment-442549881 - it has an example of defining Node types and their shape (even if no data is available to infer from).

sami616 commented 5 years ago

@pieh thanks! This looks promising!

diegotrigo commented 5 years ago

The original post is from Dec 2017. Starting to lose hope we'll ever find a proper solution.

mattclough1 commented 5 years ago

I know this isn't a solution to the problem in this version of Gatsby, but Gatsby 2.0's gatsby-source-graphql has pretty much eliminated schema inference problems in my experience.

stefanprobst commented 5 years ago

Everyone: this is actively being worked on in #11480.

stefanprobst commented 5 years ago

We have an alpha version of our new Schema Customization API ready, and we'd love your feedback! For more info check out this blogpost.

sami616 commented 5 years ago

this is very exciting, thanks for all your hard work @stefanprobst !

vermario commented 5 years ago

I have seen the new API in preview: however it's not clear to me how that could be used to "snapshot" the existing graphql schema to solve the issue that was discussed in this ticket. I saw there is for example https://www.gatsbyjs.org/packages/gatsby-plugin-extract-schema/ with some simple code to extract the schema to a json file. Would the idea be that then something like that could be "fed again" into the gatsby schema?

(Crossing my fingers, because this is at the moment the biggest headache we are having in our project(s) using Drupal and the gatsby-source-drupal plugin, and it would be so great to be able to fix it cleanly.) :)

pieh commented 5 years ago

@vermario It doesn't support creating schema snapshots just yet, but it should solve the issue better than creating a snapshot. Once this is merged we will start adding type definitions to plugins (including the Drupal one). If Drupal has an introspection endpoint (I'm not an expert on Drupal), then we will be able to generate the correct schema even without any data, so a snapshot wouldn't be needed at all.

vermario commented 5 years ago

@pieh sounds great: we will keep watching this space. In the meantime we are adding "test" content ( :/ ) and excluding the content from graphql queries at build time. Feels "dirty" :) Thanks!

sami616 commented 5 years ago

I'm running into an issue when using the new schema customisation with gatsby-source-contentful.

When I define a type where one of the fields points to a reference array, that field always returns null.

I.e.:

type ContentfulPage implements Node {
   editorTitle: String
   title: String
   slug: String
   sections: [ContentfulSection] # always null
}

gatsbot[bot] commented 5 years ago

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here.

If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!

Thanks for being a part of the Gatsby community! 💪💜

gatsbot[bot] commented 5 years ago

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

Thanks again for being part of the Gatsby community!

moreguppy commented 5 years ago

I'm running into an issue when using the new schema customisation with gatsby-source-contentful.

When I define a type where one of the fields points to a reference array, that field always returns null.

I.e.:

type ContentfulPage implements Node {
   editorTitle: String
   title: String
   slug: String
   sections: [ContentfulSection] # always null
}

@sami616 I found this approach worked:

// In gatsby-node.js
exports.sourceNodes = ({ actions }) => {
  const { createTypes } = actions
  const typeDefs = `
    type ContentfulPage implements Node {
      sections: ContentfulSection
    }
    type ContentfulSection implements Node {
      title: String
    }
  `
  createTypes(typeDefs)
}

It seems that for reference fields you also have to define the shape of the model you are referencing.

imshuffling commented 5 years ago

So I'm trying to work this out. I have a field of "Video" (that is not required in contentful) on a content type of "ContentfulHomepage" with a reference field called "Blocks" calling "ContentfulBlockHomepageBanner".

Works fine if video field has content... but falls over if null, so here's my stab at the empty field issue.

Query: [screenshot: 2019-05-02 at 11 35 41]

My attempt: [screenshot: 2019-05-02 at 11 34 59]

Error output: [screenshot: 2019-05-02 at 11 34 41]

Can anyone point out where I'm going wrong here? Thanks.

gatsbot[bot] commented 5 years ago

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks again for being part of the Gatsby community!

silviopaganini commented 5 years ago

am I the only one still having this problem?

KyleAMathews commented 5 years ago

Sorry we're working towards this. I'll set this as not stale so the bot stops closing it.

moreguppy commented 5 years ago

I'm running into an issue when using the new schema customisation with gatsby-source-contentful.

When I define a type where one of the fields points to a reference array, that field always returns null.

I.e.:

type ContentfulPage implements Node {
   editorTitle: String
   title: String
   slug: String
   sections: [ContentfulSection] # always null
}

@sami616 I found this approach worked:

// In gatsby-node.js
exports.sourceNodes = ({ actions }) => {
  const { createTypes } = actions
  const typeDefs = `
    type ContentfulPage implements Node {
      sections: ContentfulSection
    }
    type ContentfulSection implements Node {
      title: String
    }
  `
  createTypes(typeDefs)
}

It seems that for reference fields you also have to define the shape of the model you are referencing.

@silviopaganini I found this is a pretty good solution in the interim

silviopaganini commented 5 years ago

Yeah! Trying this now. It seems to work, but it's not scalable... I'm working on a huge website, and having to create those for all non-required fields is not great, but it works for now.

miraclemaker commented 5 years ago

@moreguppy I'm trying this to fix a similar issue with gatsby-source-wordpress, but I get the following message when I add this code to gatsby-node.js:

Error: Schema must contain uniquely named types but contains multiple types named "wordpress__PAGEAcf".

My code:

exports.sourceNodes = ({ actions }) => {
  const { createTypes } = actions
  const typeDefs = `
    type wordpress__PAGEAcf implements Node {
      enter_content: String
    }
  `
  createTypes(typeDefs)
}

miraclemaker commented 5 years ago

Never mind. For anyone experiencing a similar issue, this is the correct format to fix it:

exports.sourceNodes = ({ actions }) => {
  const { createTypes } = actions
  const typeDefs = `
    type wordpress__PAGE implements Node {
      acf: wordpress__PAGEAcf
    }
    type wordpress__PAGEAcf implements Node {
      enter_content: String
    }
  `
  createTypes(typeDefs)
}

gatsbot[bot] commented 5 years ago

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks again for being part of the Gatsby community!

kalinchernev commented 5 years ago

Re-opening and adding a not stale tag as of https://github.com/gatsbyjs/gatsby/issues/3344#issuecomment-505001979

stefanprobst commented 5 years ago

I'd be super grateful if someone could give the stuff in #16291 some real-world testing. Thanks!

stefanprobst commented 5 years ago

We have a open PR for schema lock-down in #16291. Would be cool if those still interested in seeing this land could share some feedback. Thanks!

m4rrc0 commented 4 years ago

Hey @stefanprobst . Awesome work! Sorry for the late reply. This has been stalling for so long that I got used to the workarounds and it was not a major problem anymore to me. It is so neat though that this is moving forward! :) I gave it a quick try and it worked seamlessly! That is rare enough to salute. ;D I am working on a big update on a project and I will have some more 'real world' testing in the coming weeks. I'll comment further in the PR. Thanks a lot for the work!

m4rrc0 commented 4 years ago

@stefanprobst any idea about #19210

wu-lee commented 2 years ago

Hello - I'm also having trouble with this still.

I've attempted to fix it as documented here:

https://www.gatsbyjs.com/docs/reference/config-files/gatsby-node/#createSchemaCustomization

And:

https://www.gatsbyjs.org/docs/schema-customization/#creating-type-definitions

However, first I wanted to ask: why is sourceNodes being used in the post above instead of createSchemaCustomization as in the docs? Does it make any difference?

Never mind. For anyone experiencing a similar issue, this is the correct format to fix it:

exports.sourceNodes = ({ actions }) => {
  const { createTypes } = actions
  const typeDefs = `
    type wordpress__PAGE implements Node {
      acf: wordpress__PAGEAcf
    }
    type wordpress__PAGEAcf implements Node {
      enter_content: String
    }
  `
  createTypes(typeDefs)
}

In my case I seem to have extra problems. Although I explicitly define fields which can be optional, in order to avoid errors if Gatsby does not find any examples and therefore can't infer their existence, File fields seem to be a problem, as I've not discovered how to define these correctly.

Simply specifying the type as (for example) hero_image: File does not work reliably, as Gatsby doesn't seem to do the obvious thing and define file fields like this when the field refers to an existing file. There's a childImageSharp field the templates need, which seems to function OK when the hero_image field is inferred by Gatsby, but not when I add an explicit type File. In that case I get warnings that "You can't use childImageSharp together with undefined.undefined — use publicURL instead", and the images disappear.

I'm using a headless CMS to allow my users to enter content, so I can't control exactly what they do. Failures crop up unpredictably depending on what they've created, and they are not in the least friendly to users or developers.

I find myself caught between these cases. I am encountering the following:

What happens when inference is used seems very dependent on the first content file Gatsby sees, and the order seems non-deterministic, which isn't something I know how to fix. An apparent fix can later turn out to be working only by lucky ordering.

I am currently using Gatsby 2.23.12, not the most recent version as this project is over a year old and still intermittently encountering problems. Upgrading seems to be another can of breaking-change worms I don't yet want to open.

fgroenendijk commented 1 year ago

My team had a problem inferring complex types like dynamic zones and some fields that could be empty. Then I found out about the printTypeDefinitions action. First populate all the fields once, then let printTypeDefinitions create a typeDefs file. After this you can copy the problem types (for example wordpress__PAGE) from that file and paste them into the typeDefs string.

Example code:

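const fs = require('fs');

exports.sourceNodes = ({ actions }) => {
    // Remove the previous output so a fresh file can be written on this run
    if (fs.existsSync('./typeDefs.txt')) {
        fs.rmSync('./typeDefs.txt');
    }
    // Write the current type definitions to typeDefs.txt
    actions.printTypeDefinitions({ path: './typeDefs.txt' });

    const { createTypes } = actions;
    // Paste the problem types copied from typeDefs.txt into this string
    const typeDefs = '';
    if (typeDefs) {
        createTypes(typeDefs);
    }
}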

Used with gatsby 5.7.0.

tractorcow commented 5 months ago

@fgroenendijk this is the solution we used. It could be better, but it's set up to refresh the schema during development and to rely on this cached schema in deployed environments. It still relies on our non-master branch having all content fields filled, but it won't break production.


import * as fs from 'fs'
import * as path from 'path'
import type { GatsbyNode } from 'gatsby'

// Only rebuild definitions on non-master branches, and not on codebuild
const rebuildDefinitions = process.env.GATSBY_CONTENTFUL_ENVIRONMENT !== 'master' && !process.env.CODEBUILD_BUILD_ARN
const definitionPath = 'src/type-definitions.gql'
const schemaPath = path.resolve(__dirname, definitionPath)

export const onPreInit: GatsbyNode['onPreInit'] = async () => {
  // On non-production environments clear outdated type definitions to force a resync
  if (rebuildDefinitions && fs.existsSync(schemaPath)) {
    console.log(`Deleting types from ${definitionPath}`)
    fs.unlinkSync(schemaPath)
  }
}

export const createSchemaCustomization: GatsbyNode['createSchemaCustomization'] = async ({ actions }) => {
  const { createTypes, printTypeDefinitions } = actions

  // ON UAT this will save type definitions to a file
  if (rebuildDefinitions) {
    // On non-master branch we persist types
    console.log(`Saving types to ${definitionPath}`)
    printTypeDefinitions({ path: definitionPath, withFieldTypes: true })
  } else {
    // Otherwise we restore types
    console.log(`Restoring types from ${definitionPath}`)
    const typeDefs = fs.readFileSync(schemaPath, 'utf8')
    createTypes(typeDefs)
  }
}