Feature: No way to add Grammarly-like contextual spell-check

SikandAlex commented 1 year ago

My use case is as follows:

1) Whenever the user updates the text of the editor in any way, a request is made with the new text content of the editor to our custom spellcheck API. This is debounced so that it doesn't fire on every character input, only once the user has stopped typing, to prevent unnecessary network requests.

2) The API response returns a list of ranges. Example: [[0, 5], [5, 9], [9, 15]]. I also know if the range is "good text" or "bad text". Let's assume the [5, 9] range is a spelling error and the rest are normal text ranges.

3) I want to split the editor content into three elements: two normal TextNodes and a custom BadSpellingNode (extending TextNode) that shows a contextual menu when hovered.
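The three steps above boil down to turning a list of flagged ranges into contiguous "good" and "bad" segments. Here is a minimal sketch of that in plain TypeScript, with no editor involved. `OffsetRange`, `TextSegment` and `segmentText` are hypothetical names made up for illustration, and the sketch assumes the flagged ranges use exclusive end offsets and do not overlap.

```typescript
// Hypothetical types for illustration; not part of Lexical or any spellcheck API.
type OffsetRange = [number, number]; // [start, end), end is exclusive

interface TextSegment {
  start: number;
  end: number;
  text: string;
  bad: boolean;
}

// Split `text` into contiguous segments, marking the parts covered by a
// flagged range as bad. Assumes `badRanges` do not overlap each other.
function segmentText(text: string, badRanges: OffsetRange[]): TextSegment[] {
  const sorted = badRanges.slice().sort((a, b) => a[0] - b[0]);
  const segments: TextSegment[] = [];
  let cursor = 0;
  for (const [rawStart, rawEnd] of sorted) {
    const start = Math.max(rawStart, cursor);
    const end = Math.min(rawEnd, text.length);
    if (end <= start) continue; // empty or out-of-bounds range
    if (start > cursor) {
      // Unflagged text between the previous range and this one.
      segments.push({ start: cursor, end: start, text: text.slice(cursor, start), bad: false });
    }
    segments.push({ start, end, text: text.slice(start, end), bad: true });
    cursor = end;
  }
  if (cursor < text.length) {
    // Trailing unflagged text.
    segments.push({ start: cursor, end: text.length, text: text.slice(cursor), bad: false });
  }
  return segments;
}
```

For the 15-character text `'I am hapy today'` with `[5, 9]` flagged, this yields the three segments `'I am '`, `'hapy'` (bad) and `' today'`, matching the `[[0, 5], [5, 9], [9, 15]]` layout described above.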

I have tried many solutions to accomplish this, including every variation of the listening approaches described in the documentation. I feel this should be much simpler: I was able to accomplish it easily with Slate, but I abandoned that library once I found that it wasn't typed properly and had a lot of churn resulting from the Plate extension project.

Every approach I try results in an infinite loop: my listener triggers a side effect that updates the text, which then triggers the same side effect again. In addition, computing where the caret/selection needs to be on each re-render of the editor (since it is lost and there are now multiple nodes) becomes extremely convoluted.
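On the caret part specifically: once the text is split into several nodes, restoring the caret amounts to translating one absolute offset into "which piece, and how far into it". A rough sketch of that arithmetic follows, in plain TypeScript and independent of any editor. `CaretLocation` and `locateCaret` are hypothetical names, and the choice to attach a caret sitting exactly on a boundary to the end of the earlier piece is an assumption, not something Lexical prescribes.

```typescript
// Hypothetical helper for illustration; not a Lexical API.
interface CaretLocation {
  index: number;  // which piece the caret lands in
  offset: number; // offset within that piece
}

// Translate an absolute caret offset into a (piece index, local offset) pair,
// given the lengths of the pieces in document order.
function locateCaret(pieceLengths: number[], absoluteOffset: number): CaretLocation | null {
  if (absoluteOffset < 0) return null;
  let remaining = absoluteOffset;
  let index = 0;
  for (const length of pieceLengths) {
    // `<=` keeps a caret on a boundary at the end of the earlier piece.
    if (remaining <= length) {
      return { index, offset: remaining };
    }
    remaining -= length;
    index += 1;
  }
  return null; // offset is past the end of the text
}
```

With pieces of lengths `[5, 4, 6]`, an absolute offset of 7 maps to piece 1 at local offset 2.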

Am I doing something wrong?

This was previously quite easy in Slate. The code isn't super clean, but this was my general approach.

const Leaf = ({ attributes, children, leaf }: RenderLeafProps) => {

  if (leaf.bold) {
    children = <strong>{children}</strong>;
  }

  if (leaf.italic) {
    children = <em>{children}</em>;
  }

  if (leaf.underline) {
    children = <u>{children}</u>;
  }

  const ref = useRef(null);

  const [open, setOpen] = useState(false);
  const [anchorEl, setAnchorEl] = useState<HTMLSpanElement | null>(null);

  const handleMouseEnter = (event: React.MouseEvent<HTMLSpanElement>) => {
    setAnchorEl(event.currentTarget);
    setOpen(true);
  };

  const handleMouseLeave = (event: React.MouseEvent<HTMLSpanElement>) => {
    setAnchorEl(null);
    setOpen(false);
  };

  const strikeColor = 'red';

  const editor = useSlate();

  // Called when a user clicks to accept a spelling suggestion
  const handleDelete = useCallback(
    (event: React.MouseEvent<HTMLButtonElement>, from: number, to: number, replacementWord: string) => {
      event.preventDefault();
      // Select the data to replace
      Transforms.select(editor, {
        anchor: { path: [0, 0], offset: from },
        focus: { path: [0, 0], offset: to }
      });
      // Create a DataTransfer object
      const dataTransfer = new DataTransfer();
      dataTransfer.setData('text/plain', replacementWord);
      // Replace the data
      ReactEditor.insertTextData(editor, dataTransfer);
    },
    [editor]
  );

  const card = (
    <SpellingCorrectionPopperContainer>
      <CardContent>
        <Stack direction={'row'} spacing={2} alignItems={'center'}>
          <s>
            <SpellingCorrectionOriginalTypography>
              {originalWord}
            </SpellingCorrectionOriginalTypography>
          </s>
          <Box>
            <SpellingCorrectionForwardIcon />
          </Box>
          <SpellingCorrectionButton
            onClick={(event) =>
              handleDelete(
                event,
                leaf.wordRange[0],
                leaf.wordRange[1],
                leaf.replacementWord
              )
            }
          >
            {replacementWord}
          </SpellingCorrectionButton>
        </Stack>

        <Typography marginTop={2}>
          {message}
        </Typography>
      </CardContent>
      <CardActions>
        {/* <Button size="small">Learn More</Button> */}
      </CardActions>
    </SpellingCorrectionPopperContainer>
  );

  return (
    <span
      {...attributes}
      onMouseEnter={
        leaf.badSpelling ? (event) => handleMouseEnter(event) : undefined
      }
      onMouseLeave={
        leaf.badSpelling ? (event) => handleMouseLeave(event) : undefined
      }
      style={{
        textDecoration: leaf.badSpelling ? 'red wavy underline' : undefined
      }}
    >
      {children}

      <Popper open={open} anchorEl={anchorEl} transition>
        {({ TransitionProps }) => (
          <Fade {...TransitionProps} timeout={350}>
            {card}
          </Fade>
        )}
      </Popper>
    </span>
  );
};

SikandAlex commented 1 year ago

Here is my approach in lexical:

export function SpellcheckPlugin() {
    const [editor] = useLexicalComposerContext()

    const [languageToolOutput, setLanguageToolOutput] = useState(null)
    const [editorText, setEditorText] = useState('')

    useEffect(() => {
      // editor.setEditable(false)
      editor.update(() => {
        console.log('\n\n\n\n\n\n\n\n')
        const root = $getRoot()
        const allChildren = root.getChildren()
        const allChildrenKeys = root.getChildrenKeys()
        const rootParagraphNode = root.getFirstChild()
        const allTextContent = rootParagraphNode?.getTextContent()

        // Get the caret position and original anchor
        const selection = $getSelection() as RangeSelection
        console.log(selection)

        const result = []

        // @ts-ignore
        if (allTextContent?.length && languageToolOutput?.matches.length && selection) {
          const originalAnchorListIndex = allChildrenKeys.indexOf(selection.anchor.key)
          const originalAnchorRangeIndex = selection.anchor.offset

          // @ts-ignore
          const matchData = languageToolOutput.matches.map(m => [m.offset, m.offset + m.length])
          const res = getConnectedRanges(allTextContent.length, matchData)
          for (const range of res) {
            const rangeText = allTextContent.substring(range[0], range[1])
            if (isRangeInArray(range, matchData)) {
              const textNode = $createTextNode(rangeText)
              result.push(textNode)
            } else {
              const textNode = $createTextNode(rangeText)
              result.push(textNode)
            }
          }

          const newParagraphNode = $createParagraphNode()
          for (const node of result) {
            newParagraphNode.append(node)
          }
          rootParagraphNode?.replace(newParagraphNode)
          const rangeSelection = $createRangeSelection()
          rangeSelection.anchor.key = root.getFirstChildOrThrow().getKey()
          rangeSelection.anchor.offset = 0
          rangeSelection.focus.key = root.getFirstChildOrThrow().getKey()
          rangeSelection.focus.offset = 0
          // const originalIndex = findOriginalIndex(originalAnchorListIndex, originalAnchorRangeIndex, res)
          $setSelection(rangeSelection)

        }
      })
    }, [languageToolOutput])

    const getLanguageToolOutput = (text: string) => {
      languageToolApiClient.check.checkCreate({
        text: text,
        language: 'en-US'
      }).then(res => {
        // @ts-ignore
        // setLanguageToolOutput(res.data)
      })
    }

    useEffect(() => {

      // Listen for changes to overall text content in order to refetch LanguageTool output 
      // const removeTextContentListener = editor.registerTextContentListener(
      //   (textContent: string) => {
      //     console.log('Text content listener ran')
      //     getLanguageToolOutput(textContent)
      //   }
      // )

      // Two possibilities: they edit a TextNode or my custom node (which extends TextNode)
      const removeMutationListener = editor.registerMutationListener(
        ParagraphNode,
        (mutatedNodes) => {
          console.log('ParagraphNode listener')
          const editorState = editor.getEditorState()
          editorState.read(() => {
            const root = $getRoot()
            const rootParagraph = root.getFirstChildOrThrow()
            // console.log(rootParagraph.getTextContent())
          })
        })

      // Two possibilities: they edit a TextNode or my custom node (which extends TextNode)
      const removeMutationListenerTwo = editor.registerMutationListener(
        TextNode,
        (mutatedNodes) => {

          console.log(mutatedNodes)
          console.log(Array.from(mutatedNodes.keys())[0])

          console.log('TextNode listener')
          const editorState = editor.getEditorState()
          editorState.read(() => {
            const root = $getRoot()
            const rootParagraph = root.getFirstChildOrThrow()
            const allText = rootParagraph.getTextContent()
            getLanguageToolOutput(allText)
          })
        })

      return () => {
        // removeTextContentListener();
        removeMutationListener();
        removeMutationListenerTwo();
      }

    }, [])

    return null

}

SikandAlex commented 1 year ago

If anyone can think of an appropriate NodeTransform or MutationListener approach that avoids the difficulty of recomputing the caret location, I would greatly appreciate it and would happily buy you a coffee. Thanks for the open source project.

Sorry the code is messy; I'll try to clean it up over the weekend.

SikandAlex commented 1 year ago

I'm beginning to wonder whether this will only work if I use the registerTextContentListener (because that is technically exactly what I want to listen for) and ensure that the new text content in the container is the exact same as previously. Then, I can register a NodeTransform on the ParagraphNode that contains both my custom node types and have it update without triggering an infinite loop.
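To make that idea concrete, here is a small sketch of the "only react when the text really changed" guard, written in plain TypeScript with no Lexical calls. `SpellcheckGate`, `begin` and `isCurrent` are hypothetical names invented for this example. The assumption is that re-decorating nodes leaves the overall text content unchanged, so keying the network request on the text alone keeps the decoration step from re-triggering it.

```typescript
// Hypothetical helper for illustration; not a Lexical API.
class SpellcheckGate {
  private lastText: string | null = null;
  private latestRequestId = 0;

  // Returns a fresh request id when `text` differs from the last checked
  // text, or null when nothing changed and no request is needed.
  begin(text: string): number | null {
    if (text === this.lastText) return null;
    this.lastText = text;
    this.latestRequestId += 1;
    return this.latestRequestId;
  }

  // True only for the most recent request, so that a slow response for
  // older text can be discarded instead of applied to newer content.
  isCurrent(requestId: number): boolean {
    return requestId === this.latestRequestId;
  }
}
```

In a text content listener, one would call `begin(textContent)` and skip the request when it returns null; when the response arrives, `isCurrent(id)` decides whether its ranges still apply to what is in the editor.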

SikandAlex commented 1 year ago

Cleaned up example attempt at Transforms:


import { useLexicalComposerContext } from '@lexical/react/LexicalComposerContext';
import { LexicalEditor, LexicalNode, ParagraphNode, TextNode, $getRoot, $createParagraphNode, $createTextNode } from 'lexical';
import { useEffect } from 'react';
import { $createCustomNode } from './CustomNode';
import languageToolApiClient from '../../../Managers/LanguageToolApiClient';
import { getConnectedRanges } from './Utils';

export default function CustomNodePlugin() {
  const getLanguageToolOutput = (text: string) => {
    return languageToolApiClient.check.checkCreate({
      text: text,
      language: 'en-US'
    })
  }

  function customNodeTransform(node: LexicalNode) {
    console.log('ParagraphNode transform executed')

    // Node will be ParagraphNode
    const textContent = node.getTextContent();

    // Update the entire paragraph node
    editor.update(() => {
      const newParagraphNode = $createParagraphNode()
      const newTextNode = $createTextNode(textContent)
      newParagraphNode.append(newTextNode)
      node.replace(newParagraphNode)
    })
  }

  function useCustomNodes(editor: LexicalEditor) {
    useEffect(() => {
      const removeTransform = editor.registerNodeTransform(
        ParagraphNode,
        customNodeTransform,
      );
      return () => {
        removeTransform();
      };
    }, [editor]);
  }

  const [editor] = useLexicalComposerContext();
  useCustomNodes(editor)

  useEffect(() => {
    const removeTextContentListener = editor.registerTextContentListener(
      (textContent) => {
        console.log('Overall text content changed... making LT request')
        getLanguageToolOutput(textContent).then(x => {
          if (editor) {
            editor.update(() => {
              $getRoot()?.getFirstChild()?.markDirty()
            })
          }
        })
      });
    return () => {
      removeTextContentListener();
    }
  }, [editor])

  return null;
}

SikandAlex commented 1 year ago

Only thing I can think of now is to manually set the EditorState to avoid triggering an update listener if that's even possible or otherwise thwart the default dirty marking.

Or possibly I don't understand how to use the registerLexicalTextEntity function.

SikandAlex commented 1 year ago

I'm closer to a solution with this. Sorry for all the comments; I'll clean up this thread later. As I work towards getting this done, I'd recommend adding an example of something like this to the docs, or of an EditorState that relies on some kind of network request, like the one I'm making to a local Docker container running LanguageTool.

import { useLexicalComposerContext } from '@lexical/react/LexicalComposerContext';
import { useLexicalTextEntity } from '@lexical/react/useLexicalTextEntity';
import { CustomNode, $createCustomNode } from './CustomNode';
import { useEffect, useCallback } from 'react';
import { TextNode } from 'lexical';

import languageToolApiClient from '../../../Managers/LanguageToolApiClient';

const getLanguageToolOutput = (text: string) => {
  return languageToolApiClient.check.checkCreate({
    text: text,
    language: 'en-US',
  });
};

export function FinalPlugin(): JSX.Element | null {
  const [editor] = useLexicalComposerContext();

  useEffect(() => {
    const removeTextContentListener = editor.registerTextContentListener(
      (textContent) => {
        // The latest text content of the editor!
        console.log(textContent);
      }
    );
    return () => {
      // Do not forget to unregister the listener when no longer needed!
      removeTextContentListener();
    };
  }, []);

  useEffect(() => {
    if (!editor.hasNodes([CustomNode])) {
      throw new Error('FinalPlugin: CustomNode not registered on editor');
    }
  }, [editor]);

  const createCustomNode = useCallback((textNode: TextNode): CustomNode => {
    return $createCustomNode('testme', textNode.getTextContent());
  }, []);

  const getMatch = useCallback((text: string) => {
    return {
      end: 1,
      start: 0,
    };
  }, []);

  useLexicalTextEntity<CustomNode>(getMatch, CustomNode, createCustomNode);

  return null;
}

SikandAlex commented 1 year ago

I think I have everything I need, but I don't know how to return multiple matches from the getMatch function passed to registerLexicalTextEntity. I will have to look at the code here: https://github.com/facebook/lexical/blob/beb75cfff522ebddf95193b28aac74e23d807c12/packages/lexical-text/src/index.ts#L150

It might be possible to split the text into enough individual TextNodes and then apply the getMatch repeatedly through those 3.
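The "apply getMatch repeatedly" idea can be sketched outside the editor: call a single-match function on whatever text remains after the previous match and shift the results back to absolute offsets. This is only an illustration of that loop in plain TypeScript. `EntityMatch` mirrors the `{ start, end }` shape returned by `getMatch` in the plugin above, while `collectMatches` is a hypothetical name, and nothing here claims to reproduce what `registerLexicalTextEntity` does internally.

```typescript
// Mirrors the { start, end } shape returned by getMatch in the plugin above.
interface EntityMatch {
  start: number;
  end: number;
}

// Hypothetical helper: gather every match in `text` by re-running a
// single-match function on the text that follows the previous match.
function collectMatches(
  text: string,
  getMatch: (text: string) => EntityMatch | null
): EntityMatch[] {
  const matches: EntityMatch[] = [];
  let base = 0; // absolute offset where `rest` begins
  let rest = text;
  while (rest.length > 0) {
    const match = getMatch(rest);
    // Stop on no match, or on a degenerate match that would not advance.
    if (match === null || match.start < 0 || match.end <= match.start) break;
    matches.push({ start: base + match.start, end: base + match.end });
    base += match.end;
    rest = rest.slice(match.end);
  }
  return matches;
}
```

For example, with a toy `getMatch` that finds the first `'teh'` via `indexOf`, the text `'teh cat and teh dog'` produces matches at `[0, 3)` and `[12, 15)`.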

SikandAlex commented 1 year ago

https://github.com/facebook/lexical/blob/main/packages/lexical-react/src/LexicalAutoLinkPlugin.ts Looks like this plugin passes multiple matchers.

milaabl commented 1 year ago

Hi @SikandAlex ! Can you share a minimal reproducible example?

SikandAlex commented 1 year ago

@milaabl I hope this doesn't sound rude but the point of my issue is that I can't create a minimal reproducible example. As I've discussed, my earlier approaches cause the browser to go into an infinite loop (you don't want to try to run this). I really do appreciate the help though.

My approach right now is to modify registerLexicalTextEntity to accept a list of ranges instead of a getMatch function since I already know the ranges into the total text content of the editor.

Unfortunately, I am still figuring out how to stop the infinite loop in the node transform with the necessary pre-conditions. As soon as I have something semi-working, I'll share it.

thegreatercurve commented 1 year ago

If you want to support inline highlighting of incorrect spellings, possibly look at MarkNodes in the playground and how the commenting plugin uses them.

acywatson commented 1 year ago

@zurfyx built this internally and may be able to add to the discussion here.

SikandAlex commented 1 year ago

I think my confusion came from the fact that Lexical isn't a flat text editor but rather a hierarchy of nodes, unlike another text editor I used in the past. I'll have to understand more about traversing the hierarchy. I've temporarily switched to TipTap because a user had already written a plugin that I was able to leverage, but I'm interested in returning to Lexical when I have time to migrate over.

thedjpetersen commented 7 months ago

@zurfyx could you post the example?

taismassaro commented 6 months ago

also interested in the example that was built internally as we will need something like this soon!

robbie-hunt commented 3 months ago

Could also do with an example of correctly implementing this. I am hoping to use Sapling AI with Lexical which can be used as a drop-in replacement for Grammarly

matisszemturis commented 3 months ago

Bump, would be interested to see example of this!

KalanaPerera commented 1 month ago

bump

etrepum commented 1 month ago

I don't think you will have much luck using node transforms or registerLexicalTextEntity for this; those are synchronous and localized, and what you're doing is not. Something like registerTextContentListener would be a reasonable approach; the rest of the work is mapping those ranges back into the document tree and then making the appropriate transforms to/from your BadSpellingNode (whether that's an element that wraps text or a text subclass).

A naïve approach would be to do a breadth first search from the root to find the node that maps to a given range (using getTextContentSize probably) then you use that to do your node splitting/wrapping. You'd also need to make sure not to re-wrap nodes that are already marked bad, and unwrap nodes that should no longer be marked bad. It might make sense to first build a whole tree of normal and bad leaf nodes with their associated ranges from the current version of the document, but you will need to iteratively update that as you do your mutations since you will be splitting (marking a new node as bad will result in up to 3 nodes from the original 1) or potentially joining text nodes (removing a bad node could collapse up to 3 nodes into 1) as each range is processed.
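As a rough illustration of mapping an absolute offset back to a leaf, here is a toy version in plain TypeScript. It uses a made-up tree type (`ToyNode`) rather than real Lexical nodes, and it walks down the tree guided by subtree text sizes instead of doing a literal breadth-first search. `textSize` stands in for something like getTextContentSize; real editor text content may also include separators between blocks, which this toy model ignores.

```typescript
// Toy document model for illustration only; these are not Lexical classes.
interface TextLeaf {
  kind: 'text';
  text: string;
}
interface ElementLike {
  kind: 'element';
  children: ToyNode[];
}
type ToyNode = TextLeaf | ElementLike;

interface LeafHit {
  leaf: TextLeaf;
  offset: number; // offset within the leaf's own text
}

// Total number of characters under a node (recomputed on every call here;
// a real implementation would likely want to cache this).
function textSize(node: ToyNode): number {
  if (node.kind === 'text') return node.text.length;
  let total = 0;
  for (const child of node.children) total += textSize(child);
  return total;
}

// Find the text leaf containing the character at absolute `offset`.
function findLeafAt(node: ToyNode, offset: number): LeafHit | null {
  if (node.kind === 'text') {
    return offset >= 0 && offset < node.text.length ? { leaf: node, offset } : null;
  }
  let remaining = offset;
  for (const child of node.children) {
    const size = textSize(child);
    if (remaining < size) return findLeafAt(child, remaining);
    remaining -= size; // skip this subtree entirely
  }
  return null; // offset lies past the end of this subtree
}
```

Once the leaf and local offset for a range's start and end are known, the split into up to three nodes described above can be done on that leaf alone.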

Lexical, like HTML, is like a DOM tree and not a flat text document so what you're doing is not really natively supported. Algorithmically, without a separate data structure to cache (and properly invalidate) measurements, working with text ranges is not very efficient for that data model. It makes sense that it would not easily support what you're trying to do in the way you're trying to do it. Updating the size of one node must cascade to every node after it in the document. You can sort of work around this by going backwards (starting by updating the range that comes last in the document, so you don't need updated measurements for nodes that occur later in the doc).
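The "go backwards" trick is easiest to see on a flat string, so here is a deliberately simplified analogue in plain TypeScript: applying accepted suggestions from the last range to the first, so that a replacement of a different length never shifts the offsets of ranges still waiting to be processed. `TextEdit` and `applyReplacements` are hypothetical names, the edits are assumed not to overlap, and a real Lexical implementation would operate on nodes rather than on a string.

```typescript
// Hypothetical shape for illustration: replace [start, end) with `replacement`.
interface TextEdit {
  start: number;
  end: number; // exclusive
  replacement: string;
}

// Apply all edits to `text`, last range first. Because each edit only changes
// text at or after its own start, the offsets of earlier edits stay valid.
// Assumes the edits do not overlap.
function applyReplacements(text: string, edits: TextEdit[]): string {
  const lastFirst = edits.slice().sort((a, b) => b.start - a.start);
  let result = text;
  for (const { start, end, replacement } of lastFirst) {
    result = result.slice(0, start) + replacement + result.slice(end);
  }
  return result;
}
```

For instance, fixing `'hapy'` at `[5, 9)` and `'todya'` at `[10, 15)` in `'I am hapy todya'` gives `'I am happy today'`, even though the first fix lengthens the text by one character.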

busdav commented 6 days ago

+1 - @zurfyx could you post the example you mentioned above? This just came up for a customer of ours, they find this a highly important feature. We'd appreciate any help with this.

jpintoic commented 6 days ago

Also interested in that example. Would be great starting point to integrate tools like Grammarly.

facebook / lexical

Feature: No way to add Grammarly-like contextual spell-check #4103