ueberdosis / tiptap

The headless rich text editor framework for web artisans.
https://tiptap.dev
MIT License

Ordered/unordered list pasted from Word loses list format #1526

Open Katttori opened 3 years ago

Katttori commented 3 years ago

Description
When I try to paste an ordered or unordered list from MS Word, it is pasted as plain paragraphs, not as a list (no ul/ol). Do I need to add something to the editor, or is this a bug?

Steps to reproduce the bug:

  1. Open MS Word
  2. Create a list
  3. Copy the list and paste it into the editor at https://www.tiptap.dev/
  4. See lines with spaces instead of a list

Screenshot, video, or GIF
(image: Screenshot 2021-06-30 163147)

Benny739 commented 2 years ago

Any news or known workarounds here?

les-stockton commented 2 years ago

This is an issue for me too.

thismax commented 2 years ago

Bumping this, it's an issue I'm experiencing as well.

gethari commented 1 year ago

I'm trying to mimic what has been done in czi-prosemirror, but I'm having a hard time matching the implementation. Any known workarounds or pointers would be helpful.

miladd3 commented 1 year ago

This is still a problem, and a big one: it makes people switch to other, more traditional editors.

bdbch commented 1 year ago

I think the last time we looked into Word / Google Docs formatting, we ran into the problem that the formatting from those editors doesn't translate well to Tiptap / ProseMirror.

We will look into Word and Docs formatting in the future.

dcourv commented 9 months ago

I found that the editor handled pasting lists from Google Docs just fine, but Word lists were creating problems. So I created the following extension specifically for pasting lists from Word. I tested it and it should work for nested lists, multiple lists broken up by non-list paragraphs in between, unordered and ordered lists. I only tried it on Windows so I would be curious if it works for people on Mac.

@bdbch is this a feature Tiptap wants to be built into the editor itself? In the list extension? (which one?) If so, I can create a PR for this.

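import { Extension } from '@tiptap/core'
import { Plugin } from '@tiptap/pm/state'

const MsWordPasteHandler = Extension.create({
  name: 'msWordPasteHandler',
  addProseMirrorPlugins() {
    return [
      new Plugin({
        props: {
          // Rewrites Word's list paragraphs into <ul>/<ol> markup before the paste is parsed
          transformPastedHTML(html) {
            const parser = new DOMParser()
            const doc = parser.parseFromString(html, 'text/html')

            const msoListParagraphs = doc.querySelectorAll('.MsoListParagraph, .MsoListParagraphCxSpFirst, .MsoListParagraphCxSpMiddle, .MsoListParagraphCxSpLast')

            if (msoListParagraphs.length) {
              // listStack[i] is the open list element at nesting level i + 1
              let listStack = []
              let listType = 'ul'

              msoListParagraphs.forEach(p => {
                // The first paragraph of a run starts a new list; its marker decides ol vs ul
                if (p.classList.contains('MsoListParagraphCxSpFirst')) {
                  listStack = []

                  const marker = p.querySelector('span > span')
                  const listMarkerText = marker ? marker.textContent : ''

                  listType = (/^\d+\.?/.test(listMarkerText)) ? 'ol' : 'ul'
                }

                // The nesting level is read from the inline style ("mso-list: ... levelN ...")
                const styleString = p.getAttribute('style') || ''
                const matches = styleString.match(/mso-list:[^;]*level(\d+)/)
                const level = matches ? parseInt(matches[1], 10) : 0

                // Leave paragraphs without a usable level untouched
                if (level < 1) {
                  return
                }

                // Open nested lists until the stack is as deep as this paragraph's level
                while (level > listStack.length) {
                  const newList = doc.createElement(listType)

                  if (listStack.length > 0) {
                    listStack[listStack.length - 1].appendChild(newList)
                  }
                  listStack.push(newList)
                }

                // Close lists that are deeper than this paragraph's level
                while (level < listStack.length) {
                  listStack.pop()
                }

                // Only the paragraph's direct text nodes are used: the marker sits in nested
                // spans and is skipped, and text inside inline elements is dropped as well
                const text = Array.from(p.childNodes)
                  .filter(node => node.nodeType === Node.TEXT_NODE)
                  .map(node => node.textContent)
                  .join('')
                  .trim()

                if (text !== '') {
                  const li = doc.createElement('li')

                  // textContent keeps characters such as "<" literal instead of parsing them as HTML
                  li.textContent = text
                  listStack[listStack.length - 1].appendChild(li)
                  // Swap the paragraph for the top-level list; for later items this moves the same list into place
                  p.parentNode.replaceChild(listStack[0], p)
                }
              })
            }

            return doc.body.innerHTML
          },
        },
      }),
    ]
  },
})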
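
The nesting idea used above (a stack whose depth follows the `levelN` value) can be looked at in isolation, without the DOM. The following is only a sketch on plain data, not the extension's exact logic: `nestByLevel` and the `{ level, text }` item shape are hypothetical names introduced for illustration, and children are nested inside their parent item rather than appended to the parent list.

```javascript
// Hypothetical helper: turns a flat run of list paragraphs into a nested tree.
// Each input item is { level, text }, where level is the 1-based "levelN" value.
function nestByLevel(items) {
  const root = { children: [] }
  // stack[0] is a synthetic root; stack[d] is the most recent item at depth d
  const stack = [root]

  for (const { level, text } of items) {
    // Clamp the depth so a jump from level 1 to level 3 nests only one step deeper
    const depth = Math.min(Math.max(level, 1), stack.length)
    // Drop every deeper item that this paragraph closes
    stack.length = depth

    const node = { text, children: [] }
    stack[depth - 1].children.push(node)
    stack.push(node)
  }

  return root.children
}

// Usage: two top-level items, the first with one nested child
console.log(JSON.stringify(nestByLevel([
  { level: 1, text: 'Fruit' },
  { level: 2, text: 'Apple' },
  { level: 1, text: 'Vegetables' },
]), null, 2))
```

Nesting children inside the parent item mirrors the `li > ul` structure that HTML lists expect. Whether that suits a given editor schema better than the sibling-list structure built by the extension is something to verify against the schema in use.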

bastianjoel commented 7 months ago

We found that it is not sufficient to just rely on the MsoList... classes to identify a list. In some cases lists can only be identified via the mso-list style attribute.
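
As a rough illustration of detecting a list from the style attribute alone, the sketch below parses an inline style string. It assumes the declaration has the shape `mso-list:l0 level1 lfo1`; the function name `parseMsoListStyle` and the returned field names are hypothetical, and this is not taken from the linked implementation.

```javascript
// Hypothetical helper: extracts list information from a paragraph's inline style.
// Assumes a declaration shaped like "mso-list:l0 level1 lfo1".
function parseMsoListStyle(style) {
  const match = /mso-list:\s*l(\d+)\s+level(\d+)\s+lfo(\d+)/i.exec(style || '')

  // No match covers both a missing attribute and values such as "mso-list:Ignore"
  if (!match) {
    return null
  }

  return {
    listId: Number(match[1]), // the number after "l"
    level: Number(match[2]),  // 1-based nesting level
    lfo: Number(match[3]),    // the number after "lfo"
  }
}

// Usage: a list paragraph, then a marker span that should not count as a list
console.log(parseMsoListStyle('text-indent:-.25in;mso-list:l0 level2 lfo1'))
console.log(parseMsoListStyle('mso-list:Ignore'))
```

With a helper like this, candidate paragraphs could be collected with an attribute selector such as `p[style*="mso-list"]` instead of the `MsoListParagraph...` classes, and any paragraph for which the helper returns `null` would be left alone.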

A more complete solution which also does parsing of list start and ordered list type can be found here: https://github.com/OpenSlides/openslides-client/blob/1c3ad1e91976dbd00e7476f907b0e9d8ca8fe283/client/src/app/ui/modules/editor/components/editor/extensions/office.ts

Note that you need to extend the ordered list extension to support the list start and numbering type, like this: https://github.com/OpenSlides/openslides-client/blob/1c3ad1e91976dbd00e7476f907b0e9d8ca8fe283/client/src/app/ui/modules/editor/components/editor/extensions/ordered-list.ts
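
To make "list start" and "list type" concrete, here is one possible way to derive them, assuming the visible marker text (for example `3.` or `b)`) is available from the pasted HTML. This is a sketch only: the linked files may work quite differently, `parseOrderedMarker` is a hypothetical name, and the output is meant to match the `start` and `type` attributes of an HTML `<ol>`.

```javascript
// Hypothetical helper: maps a visible list marker to <ol> attributes.
// Returns null for anything that does not look like a number or a single letter.
function parseOrderedMarker(marker) {
  const text = (marker || '').trim()

  // Decimal markers: "3", "3." or "3)"
  const decimal = /^(\d+)[.)]?$/.exec(text)
  if (decimal) {
    return { type: '1', start: Number(decimal[1]) }
  }

  // Single-letter markers need a trailing "." or ")" so a bare "o" bullet is not misread
  const alpha = /^([a-zA-Z])[.)]$/.exec(text)
  if (alpha) {
    const letter = alpha[1]
    const isLower = letter === letter.toLowerCase()
    // "a" has char code 97, so subtracting 96 gives its 1-based position
    const start = letter.toLowerCase().charCodeAt(0) - 96
    return { type: isLower ? 'a' : 'A', start }
  }

  return null
}

// Usage: a decimal marker, a lowercase letter marker and a bullet
console.log(parseOrderedMarker('3.'))
console.log(parseOrderedMarker('b)'))
console.log(parseOrderedMarker('·'))
```

The sketch treats markers such as `i.` or `v.` as letters, so it does not detect roman numerals. Telling the two apart reliably needs more context than a single marker provides.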