gatsbyjs / gatsby

The best React-based framework with performance, scalability and security built in.
https://www.gatsbyjs.com
MIT License
55.29k stars 10.31k forks source link

Proposal: use `allSitePage` paths to make `Link` automatic #13887

Closed coreyward closed 5 years ago

coreyward commented 5 years ago

Summary

I'm proposing an update to the Link component that will automatically determine whether a path is capable of being handled internally or not during a build (or when running develop) and writing out the appropriate component.

Motivation

Currently Gatsby's Link component is used to handle internal links and it's up to developers to ensure that only internal links are handed to it. For websites that don't have too many pages, the following approach using a StaticQuery works well:

const AutoAnchor = props => (
  <StaticQuery
    query={graphql`
      {
        pages: allSitePage {
          nodes {
            path
          }
        }
      }
    `}
    render={({ pages }) => {
      let { href, ...anchorProps } = props
      const internalPagePaths = pages.nodes.map(({ path }) => path)

      if ((href.match(/^\/[^/]/) && internalPagePaths.includes(href)) || href === "/") {
        return <Link to={href} {...anchorProps} />
      } else {
        return <a href={href} {...anchorProps} />
      }
    }}
  />
)

This requires sending all of the data from the allSitePage query to the client, though, which hinders performance for all sites, but especially for sites with large numbers of pages. It seems possible to add this behavior automatically in the Link component but only to reference the nodes when the site is building (or in dev).

In this scenario, during a build Link would conditionally render either an internal or external link, but the client-side component would have the conditional code stripped out—if it was built as an internal link, it works the same as today's Link client-side, and if it was built as an external link it would render a normal a tag.


I'd be happy to work on this if this is something that is likely to be merged, though I could probably use a little guidance on how to approach running queries from the Link component and how to setup the code splitting.

LekoArts commented 5 years ago

Why do you think that using regex like shown here is not enough to make a "universal" link component?

it's up to developers to ensure that only internal links are handed to it. For websites that don't have too many pages, the following approach using a StaticQuery works well

Because we don't encourage this approach or at least I don't remember us writing that approach somewhere down.

coreyward commented 5 years ago

@LekoArts Using a regular expression is fine for small sites that only ever link to HTML pages generated by Gatsby, but it's inadequate for less simple websites.

If you want to link to a local image or some other asset, or if you're using a router (e.g. AWS Route 53) or reverse proxy (e.g. nginx) to route various paths to different apps (more common than you think), you can't use a regular expression (particularly true if your Gatsby app is not the source of truth for what paths get generated or routed to it).

Regexs also don't work if you're trying to replace an existing website that may have had other URLs that you now need to rewrite: you can setup your rewrites in Netlify or nginx or wherever, but Gatsby doesn't know about them and will show a 404 for them.

I'm sure there are other hypothetical scenarios, but these three are real situations I found myself in that led to the development of this approach.

we don't encourage this approach or at least I don't remember us writing that approach somewhere down.

I developed this pattern for a client's website to support the needs of the project (see above), and have since found it to be invaluable on multiple other projects. Now that you're aware of it, you could start encouraging its use, or better yet, we can just bake it into the app in a way that doesn't result in an array of paths getting sent down to the client unnecessarily.

gatsbot[bot] commented 5 years ago

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here.

If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contributefor more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

wardpeet commented 5 years ago

I'm not 100% sure how you would do this? Putting in all strings inside the link component doesn't seem to scale super well. Something with babel macros might be a possible solution on gatsby build but on gatsby develop we probably just do a runtime check.

It still sounds like a lot of effort. I'm not against this! Just not 100% sure what a solution is :)

coreyward commented 5 years ago

@wardpeet Gatsby already knows about allSitePages during the develop or build phase. What I'm suggesting is for the Link component to utilize that information server-side during the build (or ongoing in develop) to determine whether or not the path can be served by Gatsby.

KyleAMathews commented 5 years ago

I think this would be a fantastic plugin. But as a core feature, it'd be hard to justify as this is somewhat of a niche need and also, it doesn't scale very far past a 1000 pages or so as the metadata size grows quickly.

coreyward commented 5 years ago

@KyleAMathews Which part are you referring to not scaling well?

To be clear, the code sample I shared was to demonstrate the current strategy that I am currently using. This technically works, but yeah, it's inefficient on sites with more than a couple dozen pages. There is no alternative to this approach currently (as I outline above), though, which is why I want to only do this on the server and then rehydrate without the conditional checking (i.e. strip away the conditional logic with webpack). Perhaps that's too ambitious and we'll need to disable the SPA aspect of Gatsby entirely, which would suck.

coreyward commented 5 years ago

For context / as an example, say we have a marketing website and blog with 3,000 pages that deploys to www.example.com, and we also have a production app running on that domain. To make this work, we've setup Apache to serve static files if they exist, and route the request to the production app if not. For marketing and technical reasons, we cannot migrate either website to another subdomain or use a path prefix.

When an HTTP request is made to load any resource path (i.e. any initial pageload), our server can first look for a static file matching, and if it misses, the request can be handled by our production application. When a page is internally routed using Gatsby, though, there's no way for Gatsby to know whether the page is static or not based on the path alone without having a full rolodex of Gatsby pages to check against.

KyleAMathews commented 5 years ago

Ah ok interesting.

One approach too would be to wrap our Link component with one that sends a HEAD request to your server. Perhaps there's some way you could distinguish between Gatsby generated pages and ones from your app?

Another approach would be similar to the above but you run your Gatsby site as a GraphQL server and query it to see if a certain page path exists. This should be reasonably scalable (especially with a bit of caching) though it might be safer to move querying to a more purpose-built API. There's an issue to create a gatsby graphql command for this exact sort of thing https://github.com/gatsbyjs/gatsby/issues/10230

gatsbot[bot] commented 5 years ago

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks again for being part of the Gatsby community!