Open arwagner opened 7 years ago
I'd love to get some guidance on a proper approach to fixing this issue.
I've been playing around with creating a custom matcher for clipboard to do this. The matcher essentially ignores any "p.MsoListParagraphCxSpMiddle" or "p.MsoListParagraphCxSpLast" tags (returns a Delta that doesn't do anything), and, for any "p.MsoListParagraphCxSpFirst" iterates through that tag, and its siblings, until it finds the "...SpLast" tag.
But, from there, I'm not sure exactly what the right thing to do is. The deltas that you get from creating a bullet list manually in quill are kind of strange, and I'm not sure that the matcher should be creating them from scratch? Should it be creating deltas from a List blot? I'm a bit confused as to whether or not I'm even on the right track.
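For reference, here is a minimal sketch of the shape Quill 1.x uses for a list line: the text is a plain insert, and the list format is carried as an attribute on the newline that ends the line. The helper name `listLineOps` is hypothetical and only builds plain objects; in a real matcher the array would be wrapped with `new Delta(ops)`.

```javascript
// Hypothetical helper: builds the ops for one list line as plain objects.
function listLineOps(text, listType, indent) {
  const attributes = { list: listType };
  // Only record an indent for nested items.
  if (indent > 0) attributes.indent = indent;
  return [
    { insert: text },
    // The list format lives on the newline that terminates the line.
    { insert: '\n', attributes },
  ];
}

console.log(JSON.stringify(listLineOps('One', 'bullet', 0)));
```

This is why hand-built list deltas can look odd at first: the formatting is attached to the line break, not to the text itself.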
Have you looked at how the officially supported matchers in the clipboard work? The clipboard uses its own API the same way a third party would. If so, what specific things have you tried that did not work?
Yes, I've looked at the built-in matchers. http://codepen.io/anon/pen/ENqRdP is what I have so far, but I'm not sure what should go in the "addNodeToDelta" function. None of the built-in matchers quite seem to do what I'm trying to do here, unless I'm misunderstanding them.
The purpose of a matcher is to return a Delta representing a given node. If you fulfill this contract, the clipboard can build a Delta for the entire pasted tree. By traversing siblings and attempting to return Deltas for them instead, you are not fulfilling this contract. I would also suggest taking a look at Delta documentation. One of the more important takeaways from the Delta docs is not to create them by hand.
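One way to picture this contract, as a rough sketch that uses plain objects instead of real DOM nodes and Deltas: each matcher call describes only the node it was given, and the caller concatenates the results. `matchParagraph` and `convertAll` are hypothetical names, and `convertAll` merely stands in for the clipboard's traversal, which is more involved.

```javascript
// Hypothetical stand-in for a matcher: describes ONLY the node it receives.
function matchParagraph(node) {
  return [
    { insert: node.text },
    { insert: '\n', attributes: { list: 'bullet' } },
  ];
}

// Stand-in for the clipboard's traversal: it visits every node itself,
// so a matcher never needs to walk to its siblings.
function convertAll(nodes) {
  return nodes.reduce((ops, node) => ops.concat(matchParagraph(node)), []);
}

const pasted = [{ text: 'One' }, { text: 'Two' }, { text: 'Three' }];
console.log(convertAll(pasted).length); // 6 ops: text + newline per paragraph
```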
Yes, I did read the Delta documentation. But I think, in this case, what I want to do is to construct a single delta with a List blot embedded, which corresponds to all the paragraphs that correspond to bullets. Is that not correct? You say that the contract is a one-to-one correspondence between deltas and nodes, yet there are a number of times in https://github.com/quilljs/quill/blob/develop/modules/clipboard.js where previousSibling, nextSibling, etc. are called. What I can't find an example of is a matcher which results in a particular blot.
return { ops: [] }
is constructing a Delta by hand. When Quill's clipboard is using sibling, it does so for context about the current Delta.
@arwagner did you have any luck with this issue?
So after a few hours of work, I have a solution. IMHO it is probably not the most elegant, but it works for unordered lists pasted from MS Word. Unfortunately, it does not work for ordered lists (any hints why?), even though the implementation seems to be the same as for unordered lists.
const Delta = Quill.import('delta');

const MSwordMatcher = function (node, delta) {
  const _build = [];
  // Walk from the first list paragraph through its siblings; stop when they run out.
  while (node) {
    if (node.tagName === 'P') {
      // [0] contains the bullet or number, [1] contains spaces, [2] contains the item content
      const content = node.querySelectorAll('span');
      if (content[2]) {
        const _nodeText = content[2].innerText.trim();
        // const _listType = content[0].innerText.match(/[0-9]/g) ? 'ordered' : 'bullet'; // @TODO: implement ordered lists
        _build.push({ insert: `${_nodeText}\n`, attributes: { 'bullet': true } });
      }
      if (node.className === 'MsoListParagraphCxSpLast') {
        break;
      }
    }
    node = node.nextSibling;
  }
  return new Delta(_build);
};
const matcherNoop = (node, delta) => ({ ops: [] });
When initializing Quill:
modules: {
  clipboard: {
    matchers: [
      ['p.MsoListParagraphCxSpFirst', MSwordMatcher],
      ['p.MsoListParagraphCxSpMiddle', matcherNoop],
      ['p.MsoListParagraphCxSpLast', matcherNoop],
    ]
  },
}
ping @arwagner (if you are still interested)
I tried to take the example from @DavidReinberger and apply the feedback from @jhchen on this issue. I wanted to preserve bullet vs ordered, indentation as well as allow HTML within each list item. Any feedback / suggestions are welcome.
Note: I am using underscore in the below code, but that could be removed.
const MSWORD_MATCHERS = [
  ['p.MsoListParagraphCxSpFirst', matchMsWordList],
  ['p.MsoListParagraphCxSpMiddle', matchMsWordList],
  ['p.MsoListParagraphCxSpLast', matchMsWordList],
];
function matchMsWordList(node, delta) {
  // Clone the operations
  let ops = _.map(delta.ops, _.clone);
  // Trim the front of the first op to remove the bullet/number
  let first = _.first(ops);
  first.insert = first.insert.trimLeft();
  let firstMatch = first.insert.match(/^(\S+)\s+/);
  if (!firstMatch) return delta;
  first.insert = first.insert.substring(firstMatch[0].length, first.insert.length);
  // Trim the newline off the last op
  let last = _.last(ops);
  last.insert = last.insert.substring(0, last.insert.length - 1);
  // Determine the list type
  let prefix = firstMatch[1];
  let listType = prefix.match(/\S+\./) ? 'ordered' : 'bullet';
  // Determine the list indent
  let style = node.getAttribute('style').replace(/\n+/g, '');
  let levelMatch = style.match(/level(\d+)/);
  let indent = levelMatch ? levelMatch[1] - 1 : 0;
  // Add the list attribute
  ops.push({insert: '\n', attributes: {list: listType, indent}});
  return new Delta(ops);
}
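To make the prefix heuristic above easier to follow, here is a small standalone sketch of just that step, applied to plain strings. `splitListPrefix` is a hypothetical helper, not part of Quill; it mirrors the two regular expressions used in the matcher.

```javascript
// Hypothetical helper mirroring the prefix handling in matchMsWordList.
function splitListPrefix(text) {
  const trimmed = text.trimStart();
  // The first run of non-space characters, followed by whitespace, is the prefix.
  const match = trimmed.match(/^(\S+)\s+/);
  if (!match) return null; // no "prefix + whitespace" at the start
  const prefix = match[1];
  return {
    // A prefix containing "<something>." (such as "1.") is treated as ordered.
    listType: /\S+\./.test(prefix) ? 'ordered' : 'bullet',
    rest: trimmed.slice(match[0].length),
  };
}

console.log(splitListPrefix('·   First item'));
console.log(splitListPrefix('1.  First item'));
```

Note that an ordinary paragraph such as "Hello world" would also be split (into a "bullet" with the text "world"), which is presumably why the matcher is registered only for Word's list paragraph classes.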
Thanks @SamDuvall, your matchers are working flawlessly for me.
@SamDuvall what's that underscore and what do you mean it can be removed? 😶
@Azuf He's talking about the underscore.js library
Here's a vanilla version
function matchMsWordList(node, delta) {
  // Clone the operations
  let ops = delta.ops.map((op) => Object.assign({}, op));
  // Trim the front of the first op to remove the bullet/number
  let first = ops[0];
  first.insert = first.insert.trimLeft();
  let firstMatch = first.insert.match(/^(\S+)\s+/);
  if (!firstMatch) return delta;
  first.insert = first.insert.substring(firstMatch[0].length, first.insert.length);
  // Trim the newline off the last op
  let last = ops[ops.length-1];
  last.insert = last.insert.substring(0, last.insert.length - 1);
  // Determine the list type
  let prefix = firstMatch[1];
  let listType = prefix.match(/\S+\./) ? 'ordered' : 'bullet';
  // Determine the list indent
  let style = node.getAttribute('style').replace(/\n+/g, '');
  let levelMatch = style.match(/level(\d+)/);
  let indent = levelMatch ? levelMatch[1] - 1 : 0;
  // Add the list attribute
  ops.push({insert: '\n', attributes: {list: listType, indent}});
  return new Delta(ops);
}
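The indent step relies on Word writing the list level into the paragraph's inline style (for example `mso-list:l0 level2 lfo1`). Below is a minimal sketch of that step on its own, with an added guard for a missing style attribute. `indentFromStyle` is a hypothetical name, and the sample style string is an assumed example of Word's output.

```javascript
// Hypothetical helper: derives the indent from a paragraph's style attribute.
function indentFromStyle(style) {
  // getAttribute('style') returns null when the attribute is absent.
  const normalized = (style || '').replace(/\n+/g, '');
  const levelMatch = normalized.match(/level(\d+)/);
  // Word levels start at 1, while the matcher above uses 0 for the top level.
  return levelMatch ? Number(levelMatch[1]) - 1 : 0;
}

console.log(indentFromStyle('mso-list:l0 level2 lfo1')); // 1
console.log(indentFromStyle(null)); // 0
```

Without the guard, a matched paragraph that has no style attribute would throw inside the matcher.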
Copying and pasting unordered bullets from Word puts a bullet symbol in the editor instead of an actual bullet.
Steps for Reproduction
- Open https://www.dropbox.com/s/61gwc7evz398xki/test.docx?dl=0 in Word
- Select all in Word, and copy
- Visit http://quilljs.com/playground/#autosave
- Click into the editor and paste
- Click on the word "One"
- Click on the "unordered bullets" icon in the toolbar of the editor
Expected behavior: The bullet gets removed
Actual behavior: A real bullet gets created, containing the bullet symbol from the clipboard
Platforms:
All
Version: All
Have you solved this issue? @arwagner, can you please help? I have to deliver to a customer ASAP, but I have not been able to find a solution anywhere.
@darshak369 there are a couple of solutions listed above in this issue
Thanks for the reply, @Subtletree. I have tried all of them, and none of them work for me. Can you please give me an idea of what to do? The formatting issue occurs only with the MS Word desktop app.
@darshak369 Hmm sounds frustrating that they are not working!
The following is working for me:
function matchMsWordList(node, delta) {
  // Clone the operations
  let ops = delta.ops.map((op) => Object.assign({}, op));
  // Trim the front of the first op to remove the bullet/number
  let bulletOp = ops.find((op) => op.insert && op.insert.trim().length);
  if (!bulletOp) { return delta }
  bulletOp.insert = bulletOp.insert.trimLeft();
  let listPrefix = bulletOp.insert.match(/^.*(^·|\.)/) || bulletOp.insert[0];
  bulletOp.insert = bulletOp.insert.substring(listPrefix[0].length, bulletOp.insert.length);
  // Trim the newline off the last op
  let last = ops[ops.length-1];
  last.insert = last.insert.substring(0, last.insert.length - 1);
  // Determine the list type
  let listType = listPrefix[0].length === 1 ? 'bullet' : 'ordered';
  // Determine the list indent
  let style = node.getAttribute('style').replace(/\n+/g, '');
  let levelMatch = style.match(/level(\d+)/);
  let indent = levelMatch ? levelMatch[1] - 1 : 0;
  // Add the list attribute
  ops.push({insert: '\n', attributes: {list: listType, indent}});
  return new Delta(ops);
}
const MSWORD_MATCHERS = [
  ['p.MsoListParagraphCxSpFirst', matchMsWordList],
  ['p.MsoListParagraphCxSpMiddle', matchMsWordList],
  ['p.MsoListParagraphCxSpLast', matchMsWordList],
  ['p.msolistparagraph', matchMsWordList]
];
// When instantiating a quill editor
let quill = new Quill('#editor', {
  modules: {
    clipboard: { matchers: MSWORD_MATCHERS }
  }
});
When writing this up I found a couple of edge cases that didn't work, so the above should now work for lists with only one bullet and won't strip the first word from each bullet in some cases.
[Screenshots: the list in Word, and the same list pasted into Quill]
Thanks for your solution, @Subtletree. It means a lot.
I tried this solution in the Quill playground, but unfortunately it is not working.
This is the playground code, which I copied from what you posted above. Judging by the screenshots you shared, the solution looks correct and works fine on your side.
https://codepen.io/darshak434/pen/GRMjvwr
The MS Word file I am copying content from:
You can find the file here - https://1drv.ms/w/s!AtzwzPKX4hPigSpGIzLT2ezREQiL?e=cUkyth
Open it with the MS Word desktop app, copy the content, and paste it into the Quill editor above.
After copying and pasting this Word content, I get the following result -
Can you please share the details of your setup, such as the Word version and operating system, so that I can tell whether this is happening only on my system?
Thanks
Looks like I don't have permissions to download the word doc. I tested from another word doc into the codepen and it worked ok on:
Windows 10 21H1, Office 365 Word 2111, Chrome 96.0.4664.93, Firefox 95
Wonder if it's to do with the specific type of bullets or something, let me know when you've changed those permissions and I'll try with your doc!
Hey @Subtletree
Here is a link from which you can download the doc file directly -
https://drive.google.com/drive/folders/1txcKIDmrT6tjerPrqy_8THbSaETHrj0f?usp=sharing
Here is the situation: I tested with a new Word doc by typing some bullet points, and it works fine for me as well. Yet if we copy content from the doc provided by the customer into Quill, it is not formatted in the same manner.
You can check the doc file I have shared with you.
Thanks
Looks like those bullets are nested as a p.MsoNormal class for some reason instead of p.MsoListParagraph etc.
The following works, but I haven't done heaps of testing with it. It's possibly quite brittle: e.g. with a non-standard bullet (like arrows) in a p.MsoNormal, the list won't be detected.
const Delta = Quill.import('delta');
function matchMsWordList(node, delta) {
  // Clone the operations
  let ops = delta.ops.map((op) => Object.assign({}, op));
  // Trim the front of the first op to remove the bullet/number
  let bulletOp = ops.find((op) => op.insert && op.insert.trim().length);
  if (!bulletOp) { return delta }
  bulletOp.insert = bulletOp.insert.trimLeft();
  let listPrefix = bulletOp.insert.match(/^.*?(^·|\.)/) || bulletOp.insert[0];
  bulletOp.insert = bulletOp.insert.substring(listPrefix[0].length, bulletOp.insert.length).trimLeft();
  // Trim the newline off the last op
  let last = ops[ops.length-1];
  last.insert = last.insert.substring(0, last.insert.length - 1);
  // Determine the list type
  let listType = listPrefix[0].length === 1 ? 'bullet' : 'ordered';
  // Determine the list indent (the style attribute may be missing, e.g. on some p.MsoNormal nodes)
  let style = (node.getAttribute('style') || '').replace(/\n+/g, '');
  let levelMatch = style.match(/level(\d+)/);
  let indent = levelMatch ? levelMatch[1] - 1 : 0;
  // Add the list attribute
  ops.push({insert: '\n', attributes: {list: listType, indent}});
  return new Delta(ops);
}
function maybeMatchMsWordList(node, delta) {
  // Only treat the paragraph as a list item when its text starts with a bullet character
  let first = delta.ops[0];
  if (first && typeof first.insert === 'string' && first.insert.trimLeft()[0] === '·') {
    return matchMsWordList(node, delta);
  }
  return delta;
}
const MSWORD_MATCHERS = [
  ['p.MsoListParagraphCxSpFirst', matchMsWordList],
  ['p.MsoListParagraphCxSpMiddle', matchMsWordList],
  ['p.MsoListParagraphCxSpLast', matchMsWordList],
  ['p.MsoListParagraph', matchMsWordList],
  ['p.msolistparagraph', matchMsWordList],
  ['p.MsoNormal', maybeMatchMsWordList]
];
// When instantiating a quill editor
let quill = new Quill('#editor', {
  modules: {
    clipboard: { matchers: MSWORD_MATCHERS }
  },
  placeholder: 'Compose an epic...',
  theme: 'snow'
});
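Because `maybeMatchMsWordList` only checks for `·`, other bullet glyphs in a p.MsoNormal go undetected. One possible direction, sketched here under assumptions, is to test the first non-space character against a configurable set. `BULLET_CHARS` and `startsWithBullet` are hypothetical names, and the set's contents (`·` from the snippet above and `§`, which shows up as a Wingdings bullet in markup quoted elsewhere in this thread) are illustrative rather than exhaustive.

```javascript
// Hypothetical, illustrative set of characters Word may emit as bullets.
const BULLET_CHARS = new Set(['·', '§']);

// Returns true when the first op is text that starts with a known bullet glyph.
function startsWithBullet(ops) {
  const first = ops[0];
  // Embeds (images etc.) have non-string inserts; treat them as "not a list".
  if (!first || typeof first.insert !== 'string') return false;
  const text = first.insert.trimStart();
  return text.length > 0 && BULLET_CHARS.has(text[0]);
}

console.log(startsWithBullet([{ insert: '  · One\n' }]));
console.log(startsWithBullet([{ insert: 'Plain paragraph\n' }]));
```

The trade-off is false positives: a normal paragraph that legitimately starts with one of these characters would be turned into a list item, so the set is best kept small.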
Thanks @Subtletree for your effort and time. It's working fine, but as you said it is quite brittle: for example, there are spaces before the bullet paragraphs, and the spacing between lines is not retained. You can find this document at the same link - https://drive.google.com/drive/folders/1txcKIDmrT6tjerPrqy_8THbSaETHrj0f?usp=sharing
Is there any way to retain that as well, similar to what you did for p.MsoNormal in the function above?
It would be very helpful to my customer.
@darshak369 I've edited my last comment to add a trimLeft on this line:
bulletOp.insert = bulletOp.insert.substring(listPrefix[0].length, bulletOp.insert.length).trimLeft();
Which should fix the spacing at the start (but will also strip any intentional spacing at the start)
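A standalone sketch of the prefix-stripping step as it stands after this edit, applied to plain strings, may help show the trade-off. `stripListPrefix` is a hypothetical helper that mirrors the regular expression and the `trimLeft` call from the matcher (written here as `trimStart`, the current name for the same method).

```javascript
// Hypothetical helper mirroring the prefix removal in matchMsWordList.
function stripListPrefix(text) {
  const trimmed = text.trimStart();
  if (!trimmed) return '';
  // Lazy `.*?` stops at a leading '·' or at the first '.', whichever applies;
  // if neither is found, fall back to treating the first character as the prefix.
  const prefix = trimmed.match(/^.*?(^·|\.)/) || trimmed[0];
  // The final trimStart removes ALL leading whitespace, intentional or not.
  return trimmed.substring(prefix[0].length).trimStart();
}

console.log(JSON.stringify(stripListPrefix('·   One')));
console.log(JSON.stringify(stripListPrefix('1.   Indented on purpose')));
```

In both calls the spaces after the prefix are dropped, which is the behaviour described above: Word's padding disappears, but so does any spacing the author added deliberately.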
The paragraph spacing issue has nothing to do with the bullets really. I think quill doesn't handle before and after paragraph spacing so would need to be handled in a custom way. If you just added a new line after each paragraph instead of using paragraph spacing it would work fine but hard to tell your customer that 😅
Thank you very much for this solution, @Subtletree, I am very glad. Everything works as expected, and it means a lot. 👍
The paragraph spacing is not a big issue; we should be fine without it. But yes, it is definitely hard to tell the customer 😂
Very welcome @darshak369! I've updated our code to use the new changes so has helped me too.
Thank you very much for your work!
If I paste my word-list to https://codepen.io/darshak434/pen/GRMjvwr?editors=1111 I get the correct result and the correct p-classes.
<p class="MsoListParagraphCxSpMiddle" style="margin: 0cm 0cm 0cm 216pt; font-size: 12pt; font-family: Calibri, sans-serif; text-indent: -18pt;"><span style="font-size: 24pt; font-family: Wingdings;">§<span style="font-variant-numeric: normal; font-variant-east-asian: normal; font-stretch: normal; font-size: 7pt; line-height: normal; font-family: "Times New Roman";"> </span></span><span style="font-size: 24pt;">Value <o:p></o:p></span></p>
In my context (with ngx-quill) I get the following node if I do a console.log(node); in my matcher method:
<p><span style="font-size:24.0pt;font-family:Wingdings;mso-fareast-font-family:Wingdings; mso-bidi-font-family:Wingdings"><span style="mso-list:Ignore">§<span style="font:7.0pt "Times New Roman""> </span></span></span><span style="font-size:24.0pt">Value<span style="mso-tab-count:1"> </span></span></p>
What could be the reason for this difference? I paste the same content but get different nodes (and therefore different deltas) in the matcher methods.
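When the class names are missing, as in the second node above, the only Word-specific hint left in that markup is the span styled with `mso-list:Ignore`. Below is a tentative sketch of a check based on it. This is an assumption drawn solely from the markup quoted here, not a confirmed fix or an explanation of the difference, and `hasWordListMarker` is a hypothetical name. It only relies on `querySelectorAll` and `getAttribute`, so it is exercised with a plain stand-in object.

```javascript
// Hypothetical check: does this paragraph contain Word's list-marker span?
function hasWordListMarker(node) {
  const spans = Array.from(node.querySelectorAll('span'));
  return spans.some((span) => {
    const style = span.getAttribute('style') || '';
    // Ignore whitespace and case so 'mso-list: Ignore' also matches.
    return style.replace(/\s+/g, '').toLowerCase().includes('mso-list:ignore');
  });
}

// Stand-ins for DOM elements, for illustration only.
const fakeSpan = (style) => ({ getAttribute: () => style });
const fakeNode = {
  querySelectorAll: () => [fakeSpan('font-size:24.0pt'), fakeSpan('mso-list:Ignore')],
};
console.log(hasWordListMarker(fakeNode)); // true
```

A matcher registered on plain `p` could call such a check first and return the incoming delta unchanged when it fails, in the same spirit as `maybeMatchMsWordList`.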
Hi @Subtletree, I am new to Angular. Can you please help me with where to paste the code you shared and how to make it work? I tried pasting it into app.component.ts, where I have written the code for the Quill functionality, changing a few things to adapt it to TypeScript, such as adding this. and removing const.
It gave me zero errors and compiled it, but still, it is just pasting the bulletins from MS Word with   without adding
Please help me out. It is critical for me.
Hi @darshak369. I am trying to implement the Quill editor in Angular. I have implemented it in the App component itself. I copied the code by @Subtletree into app.component.ts and made the changes needed for TypeScript.
It compiles successfully, but the issue with ordered/unordered lists when pasting bullets from MS Word still exists. I need your help with how to make this work, please. It's very critical for my work.
Hi there,
I know I'm following up on a long running thread. This is a major pain for my work as well. I'm curious, is this bug open because no one has been able to devote time to it, or because it doesn't seem to have a feasible solution?
Thanks!
Hey!
I think even if we created a proper PR for this fix it probably wouldn't be merged and released as it seems quill is mostly abandoned? https://github.com/quilljs/quill/issues/3521 https://github.com/quilljs/quill/issues/3359
The code above has fixed the bug in my environment but it seems like the nodes copied from word can vary in other environments. Can't know for sure but if someone put time into finding out why then I think a solution would be feasible.
Thank you @Subtletree This works https://github.com/quilljs/quill/issues/1225#issuecomment-992267444