Closed andreas-schoch closed 5 months ago
Just ran into this with my app. Big ➕ to the request to get a document's text independent of the way revisions are being shown in the doc. Our flow is as follows:
1. Extract the document text (context.document.body.text)
2. Identify the pieces of text we want to locate
3. Search for them with context.document.body.search(...) and store their ranges

Unfortunately, whilst the return value of context.document.body.text depends on the way revisions are being shown in the doc, context.document.body.search doesn't. As a result, if the user was viewing markup in balloons when the extraction (1) occurred, and there was markup in one of the things we subsequently want to search for (3), we have a mismatch between the doc and what we're searching for.
I've noticed that context.document.body.getReviewedText() is independent of the view, which is great, but unfortunately it's not the text that context.document.body.search(...) uses (which appears to be the text that context.document.body.text returns if displaying revisions inline?).
Update:
- context.document.body.getReviewedText() is probably the way forward. You can specify whether you want the "original" text (before any revisions are applied) or the "current" text (as if all revisions had been accepted).
- There's a separate issue with search(...): it scans the body's text as though all text were present, so "hello~goodbye~ my old friend" will appear as "hellogoodbye my old friend", and searching for "hello my old friend" won't find it.

Any updates on this? This bug is absolutely killing us.
We can't use getReviewedText because it freezes and causes severe document scrolling, and we can't rely on paragraph.text because of this Show Revisions Inline issue...
@andreas-schoch can you please see if body.getReviewedText() can solve your problem?
@chad-levesque, we are working on the getReviewedText issues (document freezes and scrolling). We will use other issues to track them.
@greysteil Can you open a separate issue for search() if you think it is still a problem for you?
There's a separate issue with search(...) - it scans the body's text as though all text were present, so "hello~goodbye~ my old friend" will appear as "hellogoodbye my old friend", and searching for "hello my old friend" won't find it
I no longer think of this as an issue. It's not super intuitive, but it matches the way search in Word works for end users, so I don't think it's an OfficeJS issue. I also now think the present behaviour of context.document.body.text is helpful, as it implicitly reveals what display mode a user is in, and provides the text to search for. (My original comment about the text to search for was wrong - context.document.body.text is mostly the base text for searching over.)
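To make the search behaviour described above concrete, here is a minimal TypeScript sketch. It does not call the Office JS API; `TextRun`, `RevisionKind` and the three helper functions are hypothetical names used only to model a document as runs of text, some of which are tracked insertions or deletions. The "original" and "current" views follow the descriptions given earlier in the thread for getReviewedText, and the "inline" view is the text that search(...) appears to scan.

```typescript
// Hypothetical model: a document is a list of text runs, each optionally
// marked as a tracked insertion or a tracked deletion.
type RevisionKind = "inserted" | "deleted";

interface TextRun {
  text: string;
  change?: RevisionKind;
}

// All runs present, as when revisions are shown inline.
function inlineText(runs: TextRun[]): string {
  return runs.map((r) => r.text).join("");
}

// "Current" text: as if all revisions had been accepted (deletions dropped).
function currentText(runs: TextRun[]): string {
  return runs
    .filter((r) => r.change !== "deleted")
    .map((r) => r.text)
    .join("");
}

// "Original" text: before any revisions are applied (insertions dropped).
function originalText(runs: TextRun[]): string {
  return runs
    .filter((r) => r.change !== "inserted")
    .map((r) => r.text)
    .join("");
}

// The example from the thread: "goodbye" is a tracked deletion.
const exampleRuns: TextRun[] = [
  { text: "hello" },
  { text: "goodbye", change: "deleted" },
  { text: " my old friend" },
];

// The inline text still contains the deleted word, so the accepted
// wording is not a substring of it.
const acceptedWording = currentText(exampleRuns);
const foundInline = inlineText(exampleRuns).indexOf(acceptedWording) >= 0;
```

In this model `foundInline` is false, which mirrors the mismatch reported in the thread: text taken from one view cannot be assumed to be searchable in another.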
Going back to the initial problem, how can one search for text ignoring the track changes?
I am analysing the document text in the backend and I get back sections of text which I need to highlight (using content controls). To find the ranges where the content controls need to be created, I search for the text I receive from the backend, but Word doesn't find it: the text I search for doesn't include the deleted parts, whereas the text Word searches over does.
How can I work around this?
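One possible direction, sketched below under stated assumptions, is to map a match in the revision-free text back to the corresponding span of the inline text, which per the earlier comments is what search(...) scans. This is not an Office JS feature: `TrackedRun` and `inlineSpanForVisibleMatch` are hypothetical names, and the sketch assumes you already have the document as runs flagged as deleted or not (for example, derived from the OOXML). Length limits or other restrictions of search(...) are not considered.

```typescript
// Hypothetical run model: deleted runs are still present in the inline text.
interface TrackedRun {
  text: string;
  deleted?: boolean;
}

// Finds `needle` in the text without deletions and returns the matching
// span of the inline text (deletions included), or null if there is no match.
function inlineSpanForVisibleMatch(
  runs: TrackedRun[],
  needle: string
): string | null {
  let inline = "";
  let visible = "";
  // visibleToInline[i] is the inline-text index of visible character i.
  const visibleToInline: number[] = [];

  for (const run of runs) {
    if (!run.deleted) {
      for (let i = 0; i < run.text.length; i++) {
        visibleToInline.push(inline.length + i);
      }
      visible += run.text;
    }
    inline += run.text;
  }

  if (needle.length === 0) return null;
  const start = visible.indexOf(needle);
  if (start < 0) return null;

  const first = visibleToInline[start];
  const last = visibleToInline[start + needle.length - 1];
  return inline.slice(first, last + 1);
}
```

For the "hello~goodbye~ my old friend" example, a backend match on "hello my old friend" maps to the inline span "hellogoodbye my old friend", which is the string that would then be passed to the search.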
This issue has been automatically marked as stale because it is marked as needing author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment. Thank you for your interest in Office Add-ins!
This issue has been closed due to inactivity. Please comment if you still need assistance and we'll re-open the issue.
User settings influence what context.document.body.text returns in a Word add-in.
Your Environment
Expected behavior
No matter whether user has tracked changes displayed as inline or balloons, context.document.body.text (or another exposed property which can be loaded) should return the same text which includes deletions/insertions.
Current behavior
When user selects Show Markup --> Balloons --> Show Revisions in Balloons, revisions are not reflected in body.text.
When user selects Show Markup --> Balloons --> Show All Revisions Inline, revisions are reflected in body.text.
Steps to reproduce
Console.log the value of context.document.body.text and compare the difference when Show Markup --> Balloons --> Show Revisions in Balloons vs. Show All Revisions Inline
Context
I am unsure whether this inconsistency is intentional or a bug, but it makes working with the document content unnecessarily difficult.
As a workaround we now have to parse the OOXML into a string ourselves, which has quite an overhead. body.getOoxml() was about 6x slower compared to body.text when I last compared it on a Windows desktop.
At a minimum I would have expected the office-js api to either:
- return the same text from text regardless of how revisions are displayed, or
- expose another property (e.g. textWithRevisions) which can be loaded instead of text
Apologies in case I missed something and it already is possible to get the insertions/deletions as text without having to parse the ooxml. Please let me know if there is another way 🙏
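For readers wondering what the OOXML workaround mentioned under Context might look like, here is a rough TypeScript sketch. It is not the author's implementation: `textFromOoxml` and `decodeXmlEntities` are hypothetical helpers, and the sketch assumes the string returned by body.getOoxml() stores regular and inserted text in `w:t` elements and deleted text in `w:delText` elements, as WordprocessingML does. A regular expression is used for brevity; a real XML parser would be more robust, and paragraph breaks, tabs and other non-text content are ignored here.

```typescript
// Decodes the five predefined XML entities. "&amp;" is handled last so that
// text such as "&amp;lt;" is not decoded twice.
function decodeXmlEntities(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Concatenates the text of all w:t elements and, if includeDeleted is true,
// all w:delText elements, in document order.
function textFromOoxml(ooxml: string, includeDeleted: boolean): string {
  // Matches <w:t ...>text</w:t> and <w:delText ...>text</w:delText>, but not
  // other elements whose names merely start with "w:t" (e.g. w:tbl, w:tab).
  const pattern = /<w:(t|delText)(?:\s[^>]*)?>([^<]*)<\/w:\1>/g;
  let out = "";
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(ooxml)) !== null) {
    if (match[1] === "delText" && !includeDeleted) continue;
    out += decodeXmlEntities(match[2]);
  }
  return out;
}
```

With includeDeleted set to true the result corresponds to the text with revisions shown inline; with false it corresponds to the text as if deletions had been accepted. Neither depends on the user's Show Markup setting.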