easyops-cn / docusaurus-search-local

Offline/local search for Docusaurus v2/v3
https://easyops-cn.github.io/docusaurus-search-local/
MIT License

Bypass route protection on search-index build (server)? #210

Open JohnBra opened 2 years ago

JohnBra commented 2 years ago

Hi!

Love the plugin/theme you have built here!

My Setup

Docusaurus tutorial page modified with a client side auth wrapper around the root element.

Problem

After running the build, my search-index.json file includes all routes/pages but is missing their content. As a result, every search returns "no results":

[{"documents":[{"i":1,"t":"","u":"/blog/first-blog-post","b":[]},{"i":2,"t":"","u":"/blog/long-blog-post","b":[]},{"i":3,"t":"","u":"/blog/archive","b":[]},{"i":4,"t":"","u":"/blog/welcome","b":[]},{"i":5,"t":"","u":"/docs/hello","b":[]},{"i":6,"t":"","u":"/docs/intro","b":[]},{"i":7,"t":"","u":"/blog/mdx-blog-post","b":[]},{"i":8,"t":"","u":"/docs/tutorial-basics/congratulations","b":[]},{"i":9,"t":"","u":"/docs/tutorial-basics/create-a-document","b":[]},{"i":10,"t":"","u":"/docs/tutorial-basics/create-a-blog-post","b":[]},{"i":11,"t":"","u":"/docs/tutorial-basics/markdown-features","b":[]},{"i":12,"t":"","u":"/docs/tutorial-extras/translate-your-site","b":[]},{"i":13,"t":"","u":"/docs/tutorial-extras/manage-docs-versions","b":[]},{"i":14,"t":"","u":"/docs/tutorial-basics/create-a-page","b":[]},{"i":15,"t":"","u":"/docs/tutorial-basics/deploy-your-site","b":[]}],"index":{"version":"2.3.9","fields":["t"],"fieldVectors":[["t/1",[]],["t/2",[]],["t/3",[]],["t/4",[]],["t/5",[]],["t/6",[]],["t/7",[]],["t/8",[]],["t/9",[]],["t/10",[]],["t/11",[]],["t/12",[]],["t/13",[]],["t/14",[]],["t/15",[]]],"invertedIndex":[],"pipeline":["stemmer"]}},{"documents":[],"index":{"version":"2.3.9","fields":["t"],"fieldVectors":[],"invertedIndex":[],"pipeline":["stemmer"]}},{"documents":[],"index":{"version":"2.3.9","fields":["t"],"fieldVectors":[],"invertedIndex":[],"pipeline":["stemmer"]}}]
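
A quick way to confirm this symptom after a build is to check whether any section of the index has terms in its invertedIndex. The following is a minimal sketch, assuming the file keeps the shape shown in the dump above (an array of sections, each with an index.invertedIndex array); hasSearchableTerms is a hypothetical helper name, not part of the plugin.

```javascript
// Hypothetical helper, not part of docusaurus-search-local.
// Assumes the structure seen in the dump above: an array of sections,
// each with an `index.invertedIndex` array.
function hasSearchableTerms(sections) {
  // An index whose invertedIndex is empty contains no terms,
  // so no query can ever match it.
  return sections.some(
    (section) =>
      Array.isArray(section.index.invertedIndex) &&
      section.index.invertedIndex.length > 0
  );
}

// Example usage (assumes the default build/ output folder):
//   const fs = require("node:fs");
//   const sections = JSON.parse(fs.readFileSync("build/search-index.json", "utf8"));
//   if (!hasSearchableTerms(sections)) process.exitCode = 1;
```

Run against the index above, this check returns false, because every section has an empty invertedIndex.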

Description

I have somewhat of an edge-case problem. I need to protect the whole site via client-side authentication. I'm aware that the content is only "hidden", which is sufficient for my use case (no sensitive content).

What I have done to enable the authentication is:

  1. Add a theme/Root.js to the standard docusaurus starter project
  2. Wrap {children} in the auth context provider
  3. Protect routes with a component that checks for authentication and returns children, or redirects to an authentication page if not authenticated
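
import React from 'react';
import { Auth0Provider, useAuth0 } from '@auth0/auth0-react';
import { useHistory, useLocation } from '@docusaurus/router';
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';

function WithAuth({ children }) {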
  const { isLoading, error, isAuthenticated, loginWithRedirect } = useAuth0();
  const location = useLocation();
  // catch loading or error states
  if (isLoading) return <div>Loading...</div>;
  if (error) return <div>Oops... {error.message}</div>;

  if (!isAuthenticated) {
    loginWithRedirect({ appState: { returnTo: location.pathname }});
    return <div>Redirecting ...</div>;
  } else {
    return <>{children}</>;
  }
}

export default function Root({children}) {
  const { siteConfig } = useDocusaurusContext();
  const history = useHistory();
  const location = useLocation();

  const onRedirectCallback = (appState) => {
    history.push(appState?.returnTo || location.pathname);
  };

  return (
    <Auth0Provider
      domain={siteConfig.customFields.AUTH0_DOMAIN}
      clientId={siteConfig.customFields.AUTH0_CLIENT_ID}
      redirectUri={siteConfig.customFields.BASE_URL}
      onRedirectCallback={onRedirectCallback}
    >
      <WithAuth>
        {children}
      </WithAuth>
    </Auth0Provider>
  );
}

This works as it should.

Now, when I build the project, it generates the search-index.json, but the index is missing the content of every page/md file (see above). In other words, auth works, but search doesn't.

=> It seems like the local search crawler finds all of the pages, but can't access the content data because of the auth wrapper.

When I remove the auth wrapper the search-index.json is correctly populated with the content data and search works fine, but auth doesn't.

Question

Is there any way to bypass the auth during the server build? Alternatively, maybe even a setting to turn off the auto generation of search-index.json on every build? This would mean I have to manually build the search-index.json and put it into the static folder before every push, but I guess that's better than no search at all.

I obviously prefer some way to bypass the auth during the build process for automation purposes.

Cheers

JohnBra commented 2 years ago

Small update: I have gone through the code of this lib and found out exactly what the problem is.

When the auth wrapper is present, every page in the SSR build contains only "Loading...", hence the empty search index. This is intended, since we want to authenticate the user before showing anything.

If I implement a check for "isBrowser" and disable the auth for the SSR, search works again, because the search-index.json is built from the actual content instead of "Loading...". The downside is a short flash of the page content when someone visits a page (SSR) before the client side (CSR) switches to the loading state. This is just how SSR works in conjunction with CSR, and normally it makes for a good UX.

The only possible solution here would be to make building the search-index.json optional, but I understand if that is out of scope for this lib.

weareoutman commented 2 years ago

Interesting case. I didn't know an auth guard could be implemented with Docusaurus, since it's a static site generator.

This plugin builds the search index right after Docusaurus has built your pages. I don't know what the output pages look like when you wrap the root with an auth guard; if they contain only "Loading...", I don't think there is anything we can do in one build.

Maybe you can build twice: once with auth enabled and once with auth disabled. Then move the search-index.json from the latter output into the former. But yes, it's out of scope for this lib.

JohnBra commented 2 years ago

@weareoutman haha yes. Bit of a hacky solution to be honest. Works better than I expected because of the way the SSR pages are built.

Basically all pages, both SSR and CSR, are initially rendered as <div>Loading...</div> (see the WithAuth wrapper in my initial post above). Because of this, the search plugin can't read any content, even though all the pages are present in the search index.

I agree it won't be possible in one build. I think the easiest way to at least enable search for this use case is to make this plugin's search-index.json build optional.

That way I could, for example, write a small hook that builds the index locally, puts the search-index.json in the static folder, and lets the client access it from there. Currently my pipeline automatically generates the search-index.json in a container and overwrites a statically built one, even if I provide it first.

EDIT: you can probably close this issue, unless you think an optional build step is in the picture. Thanks anyway!

tisonkun commented 1 year ago

I wrote the following snippet, and it builds the search index while the site is protected with Auth0:

import useIsBrowser from "@docusaurus/useIsBrowser";

export default function Root({children}): JSX.Element {
    const isBrowser = useIsBrowser();
    return isBrowser ? (<Auth0Provider ...> ... </Auth0Provider>) : <>{children}</>;
}

UPDATE: Not exactly. useIsBrowser is false until React has loaded, so the page content is exposed before the auth check runs.

tisonkun commented 1 year ago

Replace isBrowser with an environment variable BUILD and wrap the build in a script:

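#!/usr/bin/env bash
# Stop on the first failing command, including a failed cd or build.
set -euo pipefail

ROOT_DIR=$(git rev-parse --show-toplevel)
cd "$ROOT_DIR"

# First build with auth disabled, so the search index contains the page content.
BUILD=1 yarn docusaurus build
mv build/search-index.json search-index.json

# Second build with auth enabled, then restore the populated search index.
BUILD=0 yarn docusaurus build
mv search-index.json build/search-index.json

import useDocusaurusContext from "@docusaurus/useDocusaurusContext";

export default function Root({children}): JSX.Element {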
    const {siteConfig} = useDocusaurusContext();
    const {BUILD} = siteConfig.customFields;
    return (BUILD > 0) ? <>{children}</> : (<Auth0Provider ...> ... </Auth0Provider>);
}
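
The Root component above reads BUILD from siteConfig.customFields, but the thread does not show how the value gets there. One possible way to wire it up in docusaurus.config.js is sketched below. customFields is the Docusaurus config field for user-defined values; parseBuildFlag is a hypothetical helper introduced here for illustration, and the rest of the site config is assumed to exist and is omitted.

```javascript
// Sketch for docusaurus.config.js. Assumption: the BUILD environment
// variable is set by the wrapper script (BUILD=1 or BUILD=0).

// Hypothetical helper: convert the BUILD variable into a number,
// falling back to 0 (auth enabled) when it is unset or not numeric.
function parseBuildFlag(env) {
  const value = Number.parseInt(env.BUILD ?? "0", 10);
  return Number.isNaN(value) ? 0 : value;
}

// Merge this into the exported site config so that Root can read
// siteConfig.customFields.BUILD as a number.
const customFields = {
  BUILD: parseBuildFlag(process.env),
};
```

Storing the flag as a number keeps the BUILD > 0 comparison in Root explicit, instead of relying on JavaScript coercing the string "1" or "0".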

BenjaminYde commented 8 months ago

Hi,

I have the same problem when running in production. See my Root.js:

import React, { useEffect, useState} from 'react';
import { MsalProvider, AuthenticatedTemplate, UnauthenticatedTemplate, useMsal, useMsalAuthentication} from "@azure/msal-react";
import { PublicClientApplication, InteractionStatus, InteractionType} from "@azure/msal-browser";
import useDocusaurusContext from '@docusaurus/useDocusaurusContext';

// Default implementation, that you can customize
export default function Root({children}) {

  const {siteConfig} = useDocusaurusContext();

  const isProduction = process.env.NODE_ENV === 'production';
  const redirectUri = isProduction ? `https://${siteConfig.organizationName}.github.io${siteConfig.baseUrl}index.html` : 'http://localhost:3000/';

  const msalConfig = {
    auth: {
      clientId: "xxx",
      authority: 'xxx',
      redirectUri: redirectUri
    },
    cache: {
      cacheLocation: 'localStorage',
      storeAuthStateInCookie: false,
    }
  };

  const pca = new PublicClientApplication(msalConfig);
  const isLocalDevelopment = process.env.NODE_ENV === 'development';

  const authProvider = isLocalDevelopment ? (
      <main>
        {children}
      </main>
    ) : (
      <MsalProvider instance={pca}>

        <UnauthenticatedTemplate>
          <Login/>
        </UnauthenticatedTemplate>

        <AuthenticatedTemplate>
          <main>{children}</main>
        </AuthenticatedTemplate>

      </MsalProvider> 
    );

  return <>{authProvider}</>;
}

function Login() {
  const request = {
    scopes: ["User.Read"],
    prompt: 'select_account'
  }

  const { login, error } = useMsalAuthentication(InteractionType.Silent, request);
  const msalInstance = useMsal();

  useEffect(() => {
    if (error && msalInstance.inProgress === InteractionStatus.None) {
      login(InteractionType.Redirect, request);
    }
  }, [error, login, request, msalInstance]);

  return null;
};

When I remove the authentication templates from the MsalProvider, it works fine, but then I no longer have my authentication wrapper ;)

<MsalProvider instance={pca}>
    <main>{children}</main>
</MsalProvider> 

Is there a way to manually add the search bar component or force it to render? Setting forceIgnoreNoIndex: true in docusaurus.config.js does not work for me. It looks as if the search bar disappears when the site is re-rendered.