Amaimersion / google-docs-utils

Utilities for interaction with Google Docs.
https://www.npmjs.com/package/google-docs-utils
MIT License
40 stars 9 forks

Google Docs will now use canvas based rendering: this may impact some Chrome extensions #10

Open RobertJGabriel opened 3 years ago

RobertJGabriel commented 3 years ago

Just bringing this up as an issue. I'm willing to help with any development needed to get it ready.

Here is the canvas based example https://docs.google.com/document/d/1N1XaAI4ZlCUHNWJBXJUBFjxSTlsD5XctCz6LB3Calcg/preview

@menicosia @ken107 @bboydflo @Amaimersion @JensPLarsen

Amaimersion commented 2 years ago

It worked for me yesterday, but not now :) I updated the code and included other mentioned solutions. Now it works for me again. Can you try it?

ifnullzero commented 2 years ago

No, still doesn't work for me. BTW, the part where you add mode=html...I think it could be simplified a bit like this:

if (!location.href.includes('mode=html')) {
  const redirect_url = new URL(location.href);
  redirect_url.searchParams.append('mode', 'html');
  location.href = redirect_url.href;
}

Looking at your code...it seems to be trying all the things that have already been tried and that used to work but have stopped working, e.g. adding mode=html, and setting window._docs_force_html_by_ext to be true, or to be equal to a whitelisted extension id (e.g. pebbhcjfokadbgbnlmogdkkaahmamnap).

I don't know why Google have shut down the ability to render as HTML for those that want it. Google Docs looks terrible in Firefox when using Canvas rendering. I guess they're trying to force everyone to move to Chrome. I'm NOT using Chrome, Google!

KaranArora1 commented 2 years ago

That's strange, the code worked for me when I tried it last night @ifnullzero

Edit: I used Chrome

Amaimersion commented 2 years ago

I confirm that this solution doesn't work on Firefox.

ken107 commented 2 years ago

Does anyone know how the Natural Readers Text-to-Speech extension is able to not only grab the text but also highlight it on the page? Kind of like Grammarly.

mikob commented 2 years ago

Thank you @ken107 for the code. I have also found the cursor start and cursor end positions:

start: window.KX_kixApp.NCa.C.xa.H.qc.C
end: window.KX_kixApp.NCa.C.xa.H.Jb.C
text: // ['NCa', 'C', 'ac', 'H', 'F']
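
Property names such as NCa or qc come from minified code and will likely differ between Google Docs builds, so reading these paths directly can throw as soon as one segment is missing. A minimal sketch of a defensive lookup follows. `getByPath` and `CURSOR_START_PATH` are hypothetical names introduced here for illustration, not part of Google Docs or of this library, and the path itself is simply the one quoted above.

```javascript
// Hypothetical helper: walks `keys` from `root` and returns undefined
// instead of throwing when any intermediate value is null or undefined.
function getByPath(root, keys) {
  let current = root;
  for (const key of keys) {
    if (current === null || current === undefined) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

// Path taken from the comment above; the names are minified and may change.
const CURSOR_START_PATH = ['NCa', 'C', 'xa', 'H', 'qc', 'C'];

// In the page context this would be called as:
//   getByPath(window.KX_kixApp, CURSOR_START_PATH)
```

An undefined result then signals that the path is stale, which is easier to handle than a TypeError thrown halfway through the property chain.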

AasthaGour commented 2 years ago

This function is self-contained:

function le(e, t, n, o=Object.getOwnPropertyNames(e)) {
        const r = new Set
          , i = [];
        let s = 0;
        const a = (o,l,c,u=0)=>{
            if (s++,
            "prototype" === o || l instanceof Window)
                return;
            if (u > n)
                return;
            const d = [...c, o];
            try {
                if (t(o, l))
                    return void i.push({
                        path: d,
                        value: l
                    })
            } catch (e) {}
            var g;
            if (null != l && !r.has(l))
                if (r.add(l),
                Array.isArray(l))
                    l.forEach(((e,t)=>{
                        try {
                            a(t.toString(), e, d, u + 1)
                        } catch (e) {}
                    }
                    ));
                else if (l instanceof Object) {
                    ((g = l) && null !== g && 1 === g.nodeType && "string" == typeof g.nodeName ? Object.getOwnPropertyNames(e).filter((e=>!J.has(e))) : Object.getOwnPropertyNames(l)).forEach((e=>{
                        try {
                            a(e, l[e], d, u + 1)
                        } catch (e) {}
                    }
                    ))
                }
        }
        ;
        return o.forEach((t=>{
            try {
                a(t, e[t], [])
            } catch (e) {}
        }
        )),
        {
            results: i,
            iterations: s
        }
    }

Calling it like this will return the text of the document:

le(window.KX_kixApp, ((e,t)=>t && "\x03" === t.toString().charAt(0)), 5)

Edit: and a de-obfuscated version https://gist.github.com/ken107/2b40c87fcdf27171a5a5fdc489639300
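
For readers who find the minified function hard to follow, here is a rough sketch of the same idea: a depth-limited walk over an object graph that collects every value accepted by a predicate. This is neither the original extension code nor the gist linked above. `findValues` and `startsWithEtx` are names made up for this example, and unlike the function above it skips functions and does not special-case Window objects or DOM nodes.

```javascript
// Sketch of a depth-limited search over an object graph.
// `predicate(key, value)` decides whether a value is a match.
function findValues(root, predicate, maxDepth) {
  const visited = new Set();
  const results = [];

  function visit(key, value, parentPath, depth) {
    if (key === 'prototype' || depth > maxDepth) return;
    const path = [...parentPath, key];
    let matched = false;
    try {
      matched = Boolean(predicate(key, value));
    } catch (err) {
      // A throwing predicate is treated as "no match".
    }
    if (matched) {
      results.push({ path, value });
      return;
    }
    // Only descend into objects that have not been seen yet (handles cycles).
    if (value === null || typeof value !== 'object' || visited.has(value)) return;
    visited.add(value);
    for (const childKey of Object.getOwnPropertyNames(value)) {
      let child;
      try {
        child = value[childKey]; // property getters may throw
      } catch (err) {
        continue;
      }
      visit(childKey, child, path, depth + 1);
    }
  }

  for (const key of Object.getOwnPropertyNames(root)) {
    let child;
    try {
      child = root[key];
    } catch (err) {
      continue;
    }
    visit(key, child, [], 0);
  }
  return results;
}

// Same predicate as in the call above: values whose string form starts with "\x03".
const startsWithEtx = (key, value) =>
  Boolean(value) && value.toString().charAt(0) === '\x03';

// In the page context: findValues(window.KX_kixApp, startsWithEtx, 5)
```

Each result carries the path that led to the match, which is how property chains like the cursor paths earlier in this thread can be discovered.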

Do we have an approach for inserting text programmatically in Google Docs? It should insert the text into the window.KX_kixApp object and also render it on the canvas. Is there a function to set text, similar to the one above that gets text?

@ken107 @gzomer @Amaimersion Can you help here ?

ken107 commented 2 years ago

@AasthaGour Sorry, I don't know how to do that. But even if we insert it there, it will only change the local state, it won't update the actual Google Doc.

AasthaGour commented 2 years ago

Is there any way to update the actual google doc? How is grammarly able to achieve that then ?

Keffisor commented 2 years ago

Summary of the above. At the moment this code enables HTML rendering mode. This solution works only on Google Chrome.

manifest.json

{
    "manifest_version": 2,
    "name": "Google Docs DOM",
    "version": "1.0.0",
    "content_scripts": [
        {
            "matches": ["*://docs.google.com/*"],
            "all_frames": false,
            "run_at": "document_start",
            "js": ["content-script.js"]
        }
    ],
    "web_accessible_resources": [
        "injected-script.js"
    ]
}

content-script.js

/**
 * @see
 * https://github.com/Amaimersion/google-docs-utils/issues/10#issuecomment-1086602191
 */
function injectCode() {
    const code = `(function() {window['_docs_force_html_by_ext'] = 'pebbhcjfokadbgbnlmogdkkaahmamnap';})();`;
    const script = document.createElement('script');
    script.textContent = code;
    (document.head || document.documentElement).appendChild(script);
}

function injectScript() {
    const script = document.createElement('script');
    script.type = 'text/javascript';
    script.src = chrome.runtime.getURL('injected-script.js');
    (document.head || document.documentElement).appendChild(script);
}

injectCode();
injectScript();

injected-script.js

/**
 * @see
 * https://github.com/Amaimersion/google-docs-utils/issues/10#issuecomment-1033118583
 */
if (!location.href.includes('mode=html')) {
    if (location.href.includes('?')) {
        location.href = location.href.replace('?', '?mode=html&');
    } else if (location.href.includes('#')) {
        location.href = location.href.replace('#', '?mode=html#');
    } else {
        location.href += '?mode=html';
    }
}

/**
 * @see
 * https://github.com/Amaimersion/google-docs-utils/issues/10#issuecomment-1059671773
 * https://github.com/Amaimersion/google-docs-utils/issues/10#issuecomment-1062588430
 */
function forceHTMLRenderingMode(n) {
    if (window._docs_flag_initialData) {
        window._docs_flag_initialData['kix-awcp'] = true;
    } else if (n > 0) {
        window.setTimeout(forceHTMLRenderingMode.bind(null, n - 1), 0);
    } else {
        console.warn('Could not set kix-awcp flag');
    }
}

forceHTMLRenderingMode(100);

This doesn't work anymore; it has been patched. It was working before, but now it doesn't.

richardweb10 commented 2 years ago

Has anyone tried with the MutationObserver? I need to detect the change of selection, and it already does it with this event every time I select, but it does not return the selected text. Has anyone worked with this?

var target = document.querySelector(".docs-texteventtarget-iframe").contentDocument.activeElement;

var observer = new MutationObserver(function(mutations) {
  mutations.forEach(function(mutation) {
    var entry = {
      mutation: mutation,
      el: mutation.target,
      value: mutation.target.textContent,
      oldValue: mutation.oldValue
    };
    console.log('Recording Mutation:', entry);
  });
});

var config = { 
  attributes: false, 
  childList: true, 
  subtree: true };

observer.observe(target, config);

sbahir commented 2 years ago

Summary of the above. At the moment this code enables HTML rendering mode. This solution works only on Google Chrome.

I am able to run this script but the DOM still shows a canvas instead of the full HTML

Is anyone still having issues with this topic?

@Amaimersion Would you know if there is any update I should add to the script?

Thank you everyone for your help! This thread was very helpful.

sbahir commented 2 years ago

@richardweb10 were you able to successfully run the script or are you using a solution of your own?

richardweb10 commented 2 years ago

@sbahir Yes, it is another solution, because the script that switches to HTML does not work. With the script that I posted, the event is detected with canvas rendering, but the text is still missing.

ken107 commented 2 years ago

@richardweb10 you said you could detect the selection event but couldn't get the selection text. Do you think the event contains selection indices instead? I.e. the index where selection starts and ends. And iirc there can be multiple selection ranges, so there may be multiple sets of indices. Does the mutation observer detect any numeric value changes that look like selection range indices?

richardweb10 commented 2 years ago

@ken107 I checked the object returned by MutationObserver and I don't see the indices either. The point is that MutationObserver listens to changes in the DOM and not to the user's event itself, and according to Google Docs it uses this hidden DOM for the accessibility of the canvas.

luisfrancisco commented 2 years ago

Has anyone found a solution?

ken107 commented 1 year ago

I'm kind of pissed that these extensions (Grammarly, Speechify, Natural Readers) are privy to some kind of trade secret (what the hell is this KIX thing) that regular chumps don't have access to, putting our products at a disadvantage. These guys are not only able to grab the text, they also know the exact position where each word is rendered on the page. I don't know if the Google Docs team is making an exception for these guys and providing them with some API, or if it is the work of genius-level reverse engineering. I haven't done hardcore reverse engineering for a long time, but this is leaving me no choice.

ken107 commented 1 year ago

Well, it didn't take much work at all. Turns out they're just grabbing the text from the aria-label attributes of the svg elements under the element div.kix-canvas-tile-content. So Google could have told us that screen readers can still grab the document's text from the aria-labels, and saved us all this headache.
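
A minimal sketch of that idea, assuming the annotated svg elements are actually present in the page. The selector is a guess based on the description above (it accepts labels on the svg element itself or on its descendants), and `collectAriaLabelText` and `ANNOTATION_SELECTOR` are hypothetical names for this example only.

```javascript
// Assumption: annotated tiles live under div.kix-canvas-tile-content and
// expose their text through aria-label attributes on svg elements or
// on elements nested inside them.
const ANNOTATION_SELECTOR = [
  'div.kix-canvas-tile-content svg[aria-label]',
  'div.kix-canvas-tile-content svg [aria-label]',
].join(', ');

// `root` is anything with querySelectorAll, e.g. `document`.
function collectAriaLabelText(root) {
  const labels = [];
  for (const element of root.querySelectorAll(ANNOTATION_SELECTOR)) {
    const label = element.getAttribute('aria-label');
    if (label) {
      labels.push(label);
    }
  }
  return labels;
}

// In the page context: collectAriaLabelText(document).join(' ')
```

Taking `root` as a parameter keeps the function usable on a single tile as well as on the whole document.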

ken107 commented 1 year ago

Never mind, your extension has to be whitelisted in order for those svg's to appear.

In the file client_js_prod_kix_core.js there is this line:

var UWd, VWd = ["klbcgckkldhdhonijdbnhhaiedfkllef", "pmehocpgjmkenlokgjfkaichfjdhpeol", "egfdjlfmgnehecnclamagfafdccgfndp", "kbfnbcaeplbcioakkpcpgfkobkghlhen", "kioapjjehinomebhipemibkoghnfihaj", "hnjmaaohmpmdoanfjacbknnehbdpjpmm", "dmgakmconcphdjpoocibfobjddhpgnio", "bomfdkbfpdhijjbeoicnfhjbdhncfhig", "dmkgikgpekohnbminnleammbeclbokgh", "djbfakenejacolbjepjhcchdippngdbp", "flkiihnalkdcocknmgjolnbhfaigngfp", "haldlgldplgnggkjaafhelgiaglafanh", "jldbdlmljpigglecmeclifcdhgbjbakk", "bdeoamknceeblbefhiigacemelopdppa", "fcdbkdengaelkbaafmlalbedmjnmjapa", "ifajfiofeifbbhbionejdliodenmecna", "mloajfnmjckfjbeeofcdaecbelnblden", IGa, "hmggmdpaljdnjgmniiclfopnkbalekbp", DIa, "hokifickgkhplphjiodbggjmoafhignh", DIa, "jipdnfibhldikgcjhfnomkfpcebammhp", "ebdggalafggfhdllgohjbpgnmfppdfih", "inoeonmfapjbbkmdafoankkfajkcphgd", "kfkohpkagbjoncihbogfnjnddimfbgea", "dnkpcikcllnnegjghmfookhnokmebgdh", "hmeaghljleeopdieidljimobgoafgapd", "ofdopmlmgifpfkijadehmhjccbefaeec", "hjngolefdpdnooamgdldlkjgmdcmcjnc", "ifjafammaookpiajfbedmacfldaiamgg", LIa, "bnafdpfmpikpcacbelamcajbmjndgklm", "nllcnknpjnininklegdoijpljgdjkijc", "mmdimcomiaceclfekcgibfeobbndbopm", "fgngodlaekdlibajobmkaklibdggemdd", "gccahjgcckaemgpliioopngfgdaceffo", "ilglobbeckliiacnbnohbbefhkoloblo", "fgniddhicadmgnhmjmfkgpaombkdcnnl", "dicejkhgnmmcjacnmljbblebljabbpnn", "mimhhbjnaljkbilneebflkabhdkjimkk", SIa, "ifgehbglgmidafhhdcopacejknmcmhcd", "kofgpmehdapadjondblkpknejimjogdo", "omdkfmonkgjglhnhebgdejoganhkealn", "kkmlkkjojmombglmlpbpapmhcaljjkde", "nffionklkilbohdkhcmkbbkofhkbmggl", "fjplahakcceffamlidjanodojjlollab", "ioadmlabdmgldhncokgjbhlnpalnfccd", "emmgndmggimgmmpciejhnkcnlillakfn", "cbcfbhjolgdaepkoaoepejclfggmdand", "imgmjcnolbhljlnimjbnmeokfljgjokj", "oldceeleldhonbafppcapldpdifcinji", "mnobpfbimfpabjieomgenlbgheebffof", "malgpphelhcnbplnpmmjojeamjlgblgo", "hgkenelkhbkjhgfplpklaeglhkfcpbpm", "kohfgcgbkjodfcfkcackpagifgbcmimk", SOa, "pnmaklegiibbioifkmfkgpfnmdehdfan", "dleiijbbjajmfcaiiiadgjpgfjmfdfen", 
"pbcmminlefcpnleomamgjlabgcijheok", SIa, SOa, "nfphpgdlkoindbccbdmilopgeafllemo", "npnbdojkgkbcdfdjlfdmplppdphlhhcf", "faafcilbiobkabcjekkhbapcmpdngflm", "hjhgdemcnjlhekapcjegdlckfhbmfokc", "jhlhihioaipeflkdimdbkadpodbnniel", "jcgeopnonhjnagebgakgpffmkohefbdh", "glcfdkchgkofkofecbnfgbidokhlppmf", "kljjoeapehcmaphfcjkmbhkinoaopdnd", "afmmlbahiciolpoignokabegahakmong", "klfljjhmnlljannanocjnolaeghdgmga", "bjghlclfmfhiiglogafdcgojddlmlhge", "edckiomfoomhpfloijngdaicnnjjddpp", "epmjcijfcpnhbglfecjdngdhgiobhceb", "jmhaloheodngkciligplhbkgjnpgfhoo", "nfcdcdoegfnidkeldipgmhbabmndlhbf", "iidnbdjijdkbmajdffnidomddglmieko", "agooankbbcndkpbgjkfkghjdpjapdkni", "fajobjleimkjndphlimdddblbcgfpicl", "acjdnhnkcpdloclhchkgpmhkbkehpfae", "mlijekmilcpnmnlljpmbhodoaegpebaf", "jnhblfnklpfpbbadlfbikhhifadinfho", "bhohkajhbibngfndckklbimjhemlcdee", "ndgklmlnheedegipcohgcbjhhgddendc", "miaccnfadhkleebjbbcfdegablmghlmd", "immpagioobbjjlckfakmgpnbhheamcbi", IGa, "fglnjifiablfjkhmigndpnnkbfpndopd", LIa, "fonaoompfjljjllgccccgjnhnoghohgc", "ahmapmilbkfamljbpgphfndeemhnajme", "oacindbdmlbdeidohafnfocfckkhjlbg", "bodpobmeklikangclldacmaihakldeci", "fmmimelabnggnokpjboglilcinoblbmg", "kgcdgjmildkboglkjlmllmkchhibgbcc", "oklpcimghhhhanifldcdlfgoaigfiolj", "ogmnaimimemjmbakcfefmnahgdfhfami", "dpleknbhldhgjfehhpiepebadcgchdnm", "fhbilfohhdpfibkpooeifedhmanfgnel", "bdgaaabbgamcfglelfokfdfafajbidgm", "cbpkkmggbeflamfgbajljndjlnefkbag", "ljflmlehinmoeknoonhibbjpldiijjmm", "ofkedepgdiahoncldakmfelppdholdgm", "pcgmedbmchghfgikplcimdmfldfnecec", "gjchalenkmdagfflgkbdmnbgknlgggkc", "kaffbmdkkagajhknnbgkpgineiohpoph", "giidlnpcdhcldhfccdhkaicefhpokghc", "ljmfilijpodcbjfejfpmgljhononafjh", "clfoagbcljppiogigjclfjnjpcijnijl", "emklnclnhhcpfpnlddggeehicebnefej", "hahkojdegblcccihngmgndhdfheheofe", "idhgjpneaihehiedachegphhllbfcdoo", "flacajjkcnlngpeapmnmignmjgddamah", "ifgbgjpgapejnchlkfofdhoeknmdgpen", "flgigbncjmhflchbichmphangjilklfg", "dpiohehecdbcaadncddknojhiamnlgda", "kchdekblgicnbcenmagdnfpeclchaeon", 
"ddlbpiadoechcolndfeaonajmngmhblj", "ehlmgkidbfjedfdanfechlikaolobpgh", "llfifnhohhnmdojjdllnpmnljfnfkobo", "naofaoalandhmapmbecoaajgmkkdeedg", "jdoccfgelddlaopeahgpoffmafnmhpdo", "mddofjlammcffeigngeniglccdbfgnhh", "hbhcjiciiehhgenkinjhbopeilhknmak", "ikiphelkpdmcdabenkdlpbomampmcjjh", "bkkbkbmghjjkejaekiefdbefgpkoomhe", "lnnpddnljcmlahaajlokohgkdkinlebj", "jodcbfnodmldokkbfonjmojeipffgknl", "hhnjkanigjoiglnlopahbbjdbfhkndjk"], WWd = "pebbhcjfokadbgbnlmogdkkaahmamnap lbfjopbdnlacmcdochehdolkcipncehm bknnlbamapndemiekhkcnmdclnkijlhb mchfohhlgkjmomgcblaebjldcdcfddod ahpgjaondafacdmkhdkpfndbblpafgjo hakgmeclhiipohohmoghhmbjlicdnbbb hopjidpebkocjhmmhkjmgblipnonklin obamedcdehgbcknllmpfmkjboadmcngk kkokhmpamjfkaobkhofabjmflebofofm piobbnjelpnbnafleaibbfbnnmibnpjh plaeniloeifmajgdcaonhdnolpfjfhdg pbnaomcgbfiofkfobmlhmdobjchjkphi gokfmgbabgmfmnnpikikekganhmlioga ojmdhelldoflcpmkjejlkjlaghpblmcm kbapnajlciffffomeaphfpckfdcfopef".split(" ");

And the extension needs to inject this script to activate the annotations:

window._docs_annotate_canvas_by_ext="clfoagbcljppiogigjclfjnjpcijnijl"

RobertJGabriel commented 1 year ago

@ken107 just noticed one of my open source projects is in that list. Can't seem to get it to trigger the change.

Will keep at it and keep you posted. I never got any email about being added either.

ken107 commented 1 year ago

It looks like any extension can inject the above script, using any of the extension IDs in that list, to enable the annotations. Google Docs doesn't appear to validate it.

But the injection must occur before any of Google Docs' scripts are executed. It seems the only way to do that is via the MV3-exclusive scripting.registerContentScripts([{runAt: "document_start", world: "MAIN"}]). The world: "MAIN" option is a not-yet-documented feature, meaning this workaround is a relatively new development.

I discovered two other methods for grabbing the text of the document. One is by parsing the script tags in the document.body, the ones that begin with DOCS_modelChunk =. However, this method does not get updates when the user edits the doc.

The other method is by programmatically opening the "Tools / Accessibility" menu and toggling the "Enable screen-reader support" checkbox. This requires simulating real mouse clicks consisting of mousedown and mouseup events. I don't yet know how to grab the document text this way, but apparently screen readers like JAWS, ChromeVox, or the extension "Screen Reader" are able to do it. Someone who works on accessibility software would probably know how this works.
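
The first of these two methods can be sketched as a filter over the text of the page's script tags. The sketch only isolates the raw payload after the DOCS_modelChunk = prefix and makes no assumption about the payload's internal format, which is not described in this thread. `extractModelChunkPayloads` and `MODEL_CHUNK_PREFIX` are hypothetical names for illustration.

```javascript
const MODEL_CHUNK_PREFIX = 'DOCS_modelChunk =';

// Returns the raw text that follows the prefix for each matching script body.
// Scripts that do not start with the prefix are ignored.
function extractModelChunkPayloads(scriptTexts) {
  return scriptTexts
    .map((text) => text.trim())
    .filter((text) => text.startsWith(MODEL_CHUNK_PREFIX))
    .map((text) => text.slice(MODEL_CHUNK_PREFIX.length).trim());
}

// In the page context this might be fed with:
//   Array.from(document.body.querySelectorAll('script'), (s) => s.textContent)
```

As noted above, this only captures the document as it was when the page loaded; later edits by the user are not reflected in these script tags.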

RobertJGabriel commented 1 year ago

I was about to mention the main world part. It's listed as mainWorld, is briefly mentioned here, and is a Manifest V3 feature: https://developer.chrome.com/docs/extensions/reference/scripting/

RobertJGabriel commented 1 year ago

Got it to work. @ken107

This goes in the background script and requires the scripting permission, so the extension needs to be Manifest V3.

// Note: `await` is only valid inside an async function or a module,
// so wrap this in an async function if the background script is a classic script.
const GOOGLE_DOCS_SCRIPT = {
    id: "gdocs-white-listing",
    matches: ["*://docs.google.com/document/*"],
    runAt: "document_start",
    world: "MAIN",
    js: ["scripts/docs.js"],
};
try {
    if (typeof browser !== "undefined") {
        await browser.scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]);
    } else {
        await chrome.scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]);
    }
} catch (err) {
    console.error(`failed to register content scripts: ${err}`);
}

tylertaewook commented 1 year ago

@RobertJGabriel sorry, this discussion is really hard to follow as someone who just found this repo/issue; Does this mean this google-docs-utils repo is still a relevant way to grab text from google docs?

Do we need to add that manifest v3 specific scripting permission in order for this library to function correctly?

In case it helps anyone working on this issue, I found this repo, which is basically Grammarly and works in Google Docs, but I still couldn't figure out how they are grabbing this information from the Google Docs canvas:

https://github.com/languagetool-org/languagetool

Tej-Sharma commented 1 year ago

I tried looking into Canvas modifying libraries such as FabricJS and ConcreteJS, but their main purpose seems to be for modifying Canvas files, not for reading them. Though I wonder if they could be used for reading in Canvases too...

Really need this for a school project and wondering if anyone had any other insights?

P.S. I found a SO thread and figured out how to insert text! It retains it after refresh too

const KEYS = ['H', 'e', 'l', 'l', 'o'];

for (let i = 0; i < KEYS.length; i++) {
  const KEY = KEYS[i];

  const keyEvent = document.createEvent('Event');
  keyEvent.initEvent('keypress', true, true);
  keyEvent.key = KEY; // A key like 'a' or 'B' or 'Backspace'
  // You will need to change this line if you want to use other special characters such as the left and right arrows
  keyEvent.keyCode = KEY.charCodeAt(0);
  document.querySelector('.docs-texteventtarget-iframe')
    .contentDocument.activeElement
    .dispatchEvent(keyEvent);
}

luisfrancisco commented 1 year ago

Got it to work. @ken107

In the background script with scripting permission.

So need to be manifest v3.

const GOOGLE_DOCS_SCRIPT = {
    id: "gdocs-white-listing",
    matches: ["*://docs.google.com/document/*"],
    runAt: "document_start",
    world: "MAIN",
    js: ["scripts/docs.js"],
};
try {
    if (typeof browser !== "undefined") {
        await browser.scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]);
    } else {
        await chrome.scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]);
    }
} catch (err) {
    console.error(`failed to register content scripts: ${err}`);
}

I've been trying to find a solution for months, and I really hope you can help. I added the scripting permission to my Manifest V3 manifest; should this code now be added to background.js? And what is the scripts/docs.js file you are referencing in the constant?

Tej-Sharma commented 1 year ago

@luisfrancisco I believe this is what docs.js is (it's just a file with this line of code):

// adds a variable that enables annotations which can be read
window._docs_annotate_canvas_by_ext="clfoagbcljppiogigjclfjnjpcijnijl"

Also besides that, I put this at the top of my content script:

window._docs_annotate_canvas_by_ext="clfoagbcljppiogigjclfjnjpcijnijl"

And I have an injected script that has this in it:

// Google Docs topmost additions
!function forceHtmlRenderingMode() {
  if (window._docs_flag_initialData) {
      window._docs_flag_initialData['kix-awcp'] = true;
  } else {
      setTimeout(forceHtmlRenderingMode, 0);
  }
}();
// ['pebbhcjfokadbgbnlmogdkkaahmamnap', 'lbfjopbdnlacmcdochehdolkcipncehm', 'bknnlbamapndemiekhkcnmdclnkijlhb', 'mchfohhlgkjmomgcblaebjldcdcfddod', 'ahpgjaondafacdmkhdkpfndbblpafgjo', 'hakgmeclhiipohohmoghhmbjlicdnbbb', 'hopjidpebkocjhmmhkjmgblipnonklin', 'obamedcdehgbcknllmpfmkjboadmcngk', 'kkokhmpamjfkaobkhofabjmflebofofm', 'piobbnjelpnbnafleaibbfbnnmibnpjh', 'plaeniloeifmajgdcaonhdnolpfjfhdg', 'pbnaomcgbfiofkfobmlhmdobjchjkphi']
(function() {window['_docs_force_html_by_ext'] = 'pebbhcjfokadbgbnlmogdkkaahmamnap';})();

All taken from the above people's threads, so thank you for that

Also, I saw that Ken's open source project has somewhat working selection. It uses the same approach I was going for, which is basically reading the aria labels.

luisfrancisco commented 1 year ago

@Tej-Sharma did you get it to work? I get this error after adding @RobertJGabriel's script to background.js:

Uncaught SyntaxError: await is only valid in async functions and the top level bodies of modules

And if I remove the await, content-script.js throws this error:

Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'self' 'wasm-unsafe-eval'". Either the 'unsafe-inline' keyword, a hash ('sha256-OY2nJwbaQSGtqxa4bV1O2h4pcwoQ4TbhFxi3uL6yRjc='), or a nonce ('nonce-...') is required to enable inline execution.
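
The first error message says that await was used outside of an async function, which suggests the registration snippet needs an async wrapper when pasted into a classic background script. A minimal sketch under that assumption follows. `registerDocsScript` is a hypothetical name, the script definition is the one posted earlier in this thread, and the check for an existing registration is an addition of this sketch, not something from the original snippet.

```javascript
const GOOGLE_DOCS_SCRIPT = {
  id: 'gdocs-white-listing',
  matches: ['*://docs.google.com/document/*'],
  runAt: 'document_start',
  world: 'MAIN',
  js: ['scripts/docs.js'],
};

// `api` is the extension namespace (`chrome` or `browser`), passed in so
// that the function does not depend on a global.
// Resolves to true if the script was registered by this call, false otherwise.
async function registerDocsScript(api) {
  try {
    // Skip registration if a script with this id is already registered,
    // since registering a duplicate id is rejected.
    const existing = await api.scripting.getRegisteredContentScripts({
      ids: [GOOGLE_DOCS_SCRIPT.id],
    });
    if (existing.length > 0) {
      return false;
    }
    await api.scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]);
    return true;
  } catch (err) {
    console.error(`failed to register content scripts: ${err}`);
    return false;
  }
}

// In background.js this would be invoked without a top-level await:
//   registerDocsScript(typeof browser !== 'undefined' ? browser : chrome);
```

Because the call site does not use await, it can sit at the top level of a service worker that is not declared as a module.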

Tej-Sharma commented 1 year ago

@luisfrancisco Yes. What's your background.js and contentscript looking like? You didn't need a script inside the content script. Also another thing you can try is if Grammarly is installed, it gets enabled too.

luisfrancisco commented 1 year ago

@Tej-Sharma this is what I have:

content-script.js:

window._docs_annotate_canvas_by_ext="clfoagbcljppiogigjclfjnjpcijnijl"

function injectCode() {

    const code = `(function() {window['_docs_force_html_by_ext'] = 'pebbhcjfokadbgbnlmogdkkaahmamnap';})();`;
    const script = document.createElement('script');
    script.textContent = code;
    (document.head || document.documentElement).appendChild(script);
    console.log("code injected")
}

function injectScript() {
    const script = document.createElement('script');
    script.type = 'text/javascript';
    script.src = chrome.runtime.getURL('injected-script.js');
    (document.head || document.documentElement).appendChild(script);
    console.log("inject script")
}

injectCode();
injectScript();

injected-script.js:

if (!location.href.includes('mode=html')) {
    if (location.href.includes('?')) {
        location.href = location.href.replace('?', '?mode=html&');
    } else if (location.href.includes('#')) {
        location.href = location.href.replace('#', '?mode=html#');
    } else {
        location.href += '?mode=html';
    }
}

// Google Docs topmost additions
!function forceHtmlRenderingMode() {
    if (window._docs_flag_initialData) {
        window._docs_flag_initialData['kix-awcp'] = true;
    } else {
        setTimeout(forceHtmlRenderingMode, 0);
    }
  }();
  // ['pebbhcjfokadbgbnlmogdkkaahmamnap', 'lbfjopbdnlacmcdochehdolkcipncehm', 'bknnlbamapndemiekhkcnmdclnkijlhb', 'mchfohhlgkjmomgcblaebjldcdcfddod', 'ahpgjaondafacdmkhdkpfndbblpafgjo', 'hakgmeclhiipohohmoghhmbjlicdnbbb', 'hopjidpebkocjhmmhkjmgblipnonklin', 'obamedcdehgbcknllmpfmkjboadmcngk', 'kkokhmpamjfkaobkhofabjmflebofofm', 'piobbnjelpnbnafleaibbfbnnmibnpjh', 'plaeniloeifmajgdcaonhdnolpfjfhdg', 'pbnaomcgbfiofkfobmlhmdobjchjkphi']
  (function() {window['_docs_force_html_by_ext'] = 'pebbhcjfokadbgbnlmogdkkaahmamnap';
    console.log(window._docs_force_html_by_ext)})();

docs.js:

// adds a variable that enables annotations which can be read
window._docs_annotate_canvas_by_ext="clfoagbcljppiogigjclfjnjpcijnijl"
console.log("annotated")

background.js


(() => {
    console.log('start background script')
    const GOOGLE_DOCS_SCRIPT = {
        id: "gdocs-white-listing",
        matches: ["*://docs.google.com/document/*"],
        runAt: "document_start",
        world: "MAIN",
        js: ["docs.js"],
    };
    try {
        if (typeof browser !== "undefined") {
         await browser.scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]);
        } else {

        await chrome.scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]);
        console.log("registered script")
        }
    } catch (err) {
        console.error(`failed to register content scripts: ${err}`);
    }
    "use strict";
    const e = {
        extensionInstallDate: "",
        extensionUpdateDate: "",
        buildInfoTarget: "chrome",
        buildInfoIsProduction: !0,
        buildInfoNodeEnv: "production",
        buildInfoCommitHash: "1a135d0950d215c922d6582d11c584cc479db67a",
        buildInfoBuildDate: "2023-01-09T19:12:52.056Z",
        userAccessToken: null,
        userData: {},
        userPersonalData: {},
        generationApiMaxChecks: 15,
        generationApiCheckInterval: 1e3,
        generationApiTooManyAttemptsSleep: 3e3,
        generationApiServiceUnavailableSleep: 3e3,
        generationApiUnexpectedErrorSleep: 3e3,
        emulatedUserLogin: "emulated@emulated",
        emulatedUserPassword: "emulated",
        emulatedUserAccessToken: "emulated",
        emulatedUserData: {},
        emulatedUserPersonalData: {},
        emulatedGenerationApiStartData: {},
        emulatedGenerationApiRequestSleep: 1500,
        emulatedGenerationApiResultCount: 2,
        textSelectionHandlerOn: !0,
    };
    function t(e) {
        return new Promise((t) => {
            chrome.storage.local.get(e, (e) => t(e));
        });
    }
    function n(e) {
        return new Promise((t) => {
            chrome.storage.local.set(e, () => t());
        });
    }
    async function a(e = null) {
        const t = e || new Date().toISOString();
        return await n({ extensionUpdateDate: t }), t;
    }
    async function o() {
        try {
            await (async function () {
                await new Promise((e) => {
                    chrome.storage.local.clear(() => e());
                }),
                    await n(e);
            })();
        } catch (e) {
            throw (alert("Please reinstall the extension."), e);
        }
        try {
            const e = await (async function () {
                const e = new Date().toISOString();
                return await n({ extensionInstallDate: e }), e;
            })();
            await a(e);
        } catch (e) {
            console.error(e);
        }
        try {
            chrome.tabs.create({
                url: "https://unbounce.com/sc-ext-thank-you/",
            });
        } catch (e) {
            console.error(e);
        }
    }
    function c(e) {
        const t = e ? "on" : "off";
        chrome.action.setIcon({
            path: {
                16: `static/logo/${t}/16.png`,
                32: `static/logo/${t}/32.png`,
                48: `static/logo/${t}/48.png`,
                128: `static/logo/${t}/128.png`,
            },
        });
    }
    chrome.runtime.onInstalled.addListener(async (c) => {
        switch (c.reason) {
            case "install":
                await o();
                break;
            case "update":
                await (async function () {
                    try {
                        chrome.tabs.create({
                            url: "https://unbounce.com/sc-ext-updated/",
                        });
                    } catch (e) {
                        console.error(e);
                    }
                    try {
                        await (async function () {
                            const a = await t(null);
                            for (const t in e) {
                                if (Object.prototype.hasOwnProperty.call(a, t))
                                    continue;
                                const o = e[t];
                                await n({ [t]: o });
                            }
                        })();
                    } catch (e) {
                        console.error(e);
                    }
                    try {
                        await a();
                    } catch (e) {
                        console.error(e);
                    }
                    try {
                        await (async function () {
                            const a = [
                                    "buildInfoPlatform",
                                    "buildInfoIsProduction",
                                    "buildInfoNodeEnv",
                                    "buildInfoCommitHash",
                                    "buildInfoBuildDate",
                                ],
                                o = await t(null),
                                c = {};
                            for (const t in o) {
                                if (!Object.prototype.hasOwnProperty.call(o, t))
                                    continue;
                                if (-1 === a.indexOf(t)) continue;
                                const n = e[t];
                                n !== o[t] && (c[t] = n);
                            }
                            Object.keys(c).length && (await n(c));
                        })();
                    } catch (e) {
                        console.error(e);
                    }
                })();
        }
    }),
        t("textSelectionHandlerOn")
            .then((e) => {
                const t = e.textSelectionHandlerOn;
                c(null == t || t);
            })
            .catch((e) => {
                console.error(e);
            }),
        chrome.storage.onChanged.addListener((e) => {
            for (const [t, n] of Object.entries(e))
                "textSelectionHandlerOn" === t && c(n.newValue);
        });
})();

and my manifest.json has

   "content_scripts": [
        {
            "matches": ["<all_urls>"],
            "all_frames": true,
            "js": ["vendor.js", "text-selection-handler.js"],
            "run_at": "document_end"
        },
        {
            "all_frames": false,
            "js": ["content-script.js"],
            "matches": ["*://docs.google.com/document/*"],
            "run_at": "document_start"
        }
    ],
    "background": { "service_worker": "background.js" },
    "web_accessible_resources": [
        {
            "resources": [
                "text-selection-handler.css",
                "text-selection-handler.js",
                "injected-script.js",
                "*.svg"
            ],
            "matches": ["<all_urls>"]
        }
    ],
    "permissions": ["tabs", "storage", "contextMenus", "scripting"],
    "content_security_policy": {
        "extension_pages": "script-src 'self' 'wasm-unsafe-eval'; img-src 'self'; object-src 'self'"
    },
    "offline_enabled": false,
    "minimum_chrome_version": "63"

Tej-Sharma commented 1 year ago

@luisfrancisco Hmm I can't tell what's going wrong, but here's my background.js + content script + manifest.json, you could compare mine and yours to see if there's a difference.

// The content script

scripts/contentscript.js


window._docs_annotate_canvas_by_ext = "clfoagbcljppiogigjclfjnjpcijnijl";

// scripts/script.js
var s = document.createElement("script");
s.src = chrome.runtime.getURL("scripts/script.js");
s.onload = function () {
  this.remove();
};
(document.head || document.documentElement).appendChild(s);

script.js

// Google Docs topmost additions
!(function forceHtmlRenderingMode() {
  if (window._docs_flag_initialData) {
    window._docs_flag_initialData["kix-awcp"] = true;
  } else {
    setTimeout(forceHtmlRenderingMode, 0);
  }
})();
// ['pebbhcjfokadbgbnlmogdkkaahmamnap', 'lbfjopbdnlacmcdochehdolkcipncehm', 'bknnlbamapndemiekhkcnmdclnkijlhb', 'mchfohhlgkjmomgcblaebjldcdcfddod', 'ahpgjaondafacdmkhdkpfndbblpafgjo', 'hakgmeclhiipohohmoghhmbjlicdnbbb', 'hopjidpebkocjhmmhkjmgblipnonklin', 'obamedcdehgbcknllmpfmkjboadmcngk', 'kkokhmpamjfkaobkhofabjmflebofofm', 'piobbnjelpnbnafleaibbfbnnmibnpjh', 'plaeniloeifmajgdcaonhdnolpfjfhdg', 'pbnaomcgbfiofkfobmlhmdobjchjkphi']
(function () {
  window["_docs_force_html_by_ext"] = "pebbhcjfokadbgbnlmogdkkaahmamnap";
})();

scripts/background.js


/*global chrome*/

const GOOGLE_DOCS_SCRIPT = {
  id: "penora-gdocs-white-listing",
  matches: ["*://docs.google.com/document/*"],
  runAt: "document_start",
  world: "MAIN",
  js: ["scripts/docs.js"],
};
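try {
  const scripting =
    typeof browser !== "undefined" ? browser.scripting : chrome.scripting;
  // registerContentScripts() returns a promise, so registration failures
  // (e.g. a duplicate script ID) are rejections that need .catch();
  // the surrounding try/catch only covers synchronous errors.
  scripting.registerContentScripts([GOOGLE_DOCS_SCRIPT]).catch((err) => {
    console.error(`failed to register content scripts: ${err}`);
  });
} catch (err) {
  console.error(`failed to register content scripts: ${err}`);
}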

scripts/docs.js

// adds a variable that enables annotations which can be read
window._docs_annotate_canvas_by_ext = "clfoagbcljppiogigjclfjnjpcijnijl";

Manifest.json:

{
  "name": "xxx",
  "manifest_version": 3,
  "description": "xxx",
  "version": "1.0",
  "action": {
    "default_popup": "index.html",
    "default_title": "Open the popup"
  },
  "icons": {
    "16": "logo.png",
    "48": "logo.png",
    "128": "logo.png"
  },
  "host_permissions": ["<all_urls>"],
  "background": {
    "service_worker": "scripts/background.js",
    "type": "module"
  },
  "permissions": ["tabs", "scripting", "activeTab", "identity"],
  "web_accessible_resources": [
    {
      "resources": [
        "style/fontawesome.min.css",
        "style/regular.min.css",
        "style/solid.min.css",
        "scripts/script.js",
        "webfonts/*",
        "scripts/jquery-3.1.1.min.js",
        "scripts/injected-script.js",
        "logo.png"
      ],
      "matches": ["<all_urls>", "*://docs.google.com/*"]
    }
  ],
  "content_scripts": [
    {
      "matches": ["<all_urls>", "https://docs.google.com/document/d/*"],
      "css": [
        "style/fontawesome.min.css",
        "style/regular.min.css",
        "style/solid.min.css",
        "style/penora-ai-style.css"
      ],
      "js": ["scripts/contentscript.js"]
    },
    {
      "matches": ["*://docs.google.com/*"],
      "all_frames": false,
      "run_at": "document_start",
      "css": ["style/penora-ai-style.css"],
      "js": ["scripts/docs-contentscript.js"]
    }
  ],
  "externally_connectable": {
    "matches": ["<all_urls>"]
  }
}

The end result is that the HTML tree will contain content like so: image

Hope this helps!

tylertaewook commented 1 year ago

@Tej-Sharma Thank you so much for all your help. What does your scripts/docs-contentscript.js look like so far? I'm getting confused by these similar script names and directories, which I think is one of the reasons I'm still stuck.

Do you mind creating and sharing a repo of the version that works so we can see exactly how the files are structured? In theory, using the content script, background script, and manifest should allow us to see the HTML tree without needing Grammarly or any other third-party library?

Lastly, what are these tokens and where did they come from?

luisfrancisco commented 1 year ago

Yeah, no luck here @Tej-Sharma :/ At this point I hope @Amaimersion sees this thread and helps us by pointing out how to implement your fix somehow.

I second @tylertaewook... it seems it might be something in how the code is structured. If you could share a repo with a version of your working code, that would help a lot.

redacuve commented 1 year ago

Hi @Tej-Sharma, even with annotate canvas, this library doesn't work anymore, right?

redacuve commented 1 year ago

@tylertaewook I was able to use annotate canvas with:

manifest.json
        "content_scripts": [
...
      {
        "matches": ["*://docs.google.com/*"],
        "all_frames": false,
        "run_at": "document_start",
        "js": ["js/docs-contentscript.js"]
      }
...
  ],
  "web_accessible_resources": [
    {
      "resources": ["js/docs-contentscript.js", "js/docs-canvas.js"],
      "matches": ["<all_urls>"]
    }
  ],
....
  "permissions": [
    "storage", "scripting"
  ],
...

docs-contentscript.js

var s = document.createElement("script");
s.src = chrome.runtime.getURL("js/docs-canvas.js");
s.onload = function () {
  this.remove();
};
(document.head || document.documentElement).appendChild(s);

docs-canvas.js

window._docs_annotate_canvas_by_ext = "clfoagbcljppiogigjclfjnjpcijnijl";

dan-online commented 1 year ago

The main issue with this is that these are temporary solutions that, once discovered by the Google Docs team, will be patched or changed again 😓

kant01ne commented 1 year ago

Has anyone been using one of the IDs from that list in prod already? Or even better, do you know how to get whitelisted?

lukesanborn commented 1 year ago

I've been following the thread and it's getting hard to keep track of what works. Is there any solution that is not a temporary hack and functions like Grammarly?

RafaOstrovskiy commented 1 year ago

@kant01ne https://stackoverflow.com/a/71321036

lionelhorn commented 1 year ago

For anyone interested, I made a list of the whitelisted IDs with the most downloads. https://gist.github.com/lionelhorn/1b2d8aa26fa222d2a6803c246640696b

Here is a sample

url | title | count
https://chrome.google.com/webstore/detail/kbfnbcaeplbcioakkpcpgfkobkghlhen | Grammarly: Grammar Checker and Writing App | 10000000
https://chrome.google.com/webstore/detail/inoeonmfapjbbkmdafoankkfajkcphgd | Read&Write for Google Chrome™ | 10000000
https://chrome.google.com/webstore/detail/hjngolefdpdnooamgdldlkjgmdcmcjnc | Equatio - Math made digital | 4000000
https://chrome.google.com/webstore/detail/hdhinadidafjejdhmfkjgnolgimiaplp | Read Aloud: A Text to Speech Voice Reader | 4000000
https://chrome.google.com/webstore/detail/mloajfnmjckfjbeeofcdaecbelnblden | Snap&Read | 3000000
https://chrome.google.com/webstore/detail/ifajfiofeifbbhbionejdliodenmecna | Co:Writer | 3000000
https://chrome.google.com/webstore/detail/nllcnknpjnininklegdoijpljgdjkijc | Wordtune - AI-powered Writing Companion | 2000000

RobertJGabriel commented 1 year ago

Just a note for anyone who applies to get whitelisted: mine got approved, but I was never notified of it.

grnqrtr commented 1 year ago

I'm not exactly sure what this repo is for, but I stumbled upon it while trying to figure out how to force Google Docs into HTML render mode. I ended up giving the code from @dbjpanda above to ChatGPT, and it helped me make this userscript, which can be run with Tampermonkey, to solve the issue!

Just in case it helps anyone else that stumbles here:

// ==UserScript==
// @name         Google Docs Script Injector
// @namespace    http://tampermonkey.net/
// @version      1.0
// @description  Inject a script into Google Docs
// @match        *://docs.google.com/document/*
// @run-at       document-start
// @grant        none
// ==/UserScript==

(function() {
    'use strict';

    // Your extension ID that is whitelisted by Google
    const extensionId = "npnbdojkgkbcdfdjlfdmplppdphlhhcf";

    // Inject the script into the page
    const script = document.createElement('script');
    script.innerHTML = `
        (() => {
            window._docs_annotate_canvas_by_ext = "${extensionId}";
        })();
    `;
    document.documentElement.appendChild(script);
})();

dbjpanda commented 1 year ago

I was finally able to achieve it and built a Chrome extension that works on Google Docs https://chrome.google.com/webstore/detail/chatgpt-chrome-extension/gabfffndnhbbjlgmbogpfaciajcbpkdn

Georg7 commented 1 year ago

I was finally able to achieve it and built a Chrome extension that works on Google Docs https://chrome.google.com/webstore/detail/chatgpt-chrome-extension/gabfffndnhbbjlgmbogpfaciajcbpkdn

Can you share how you got this library to work?

Georg7 commented 1 year ago

Is anyone willing to share how they were able to get this library to work? @dbjpanda @lionelhorn

I'm able to get Docs to render the SVGs, and I can access a few things like getEditorElement and getCursorElement.

However, I'm not able to get the selected text. I've tried:

This doesn't trigger anything when text is selected:

document.addEventListener("selectionchange", function (event) {
  console.log("selectionchange", event);
});

GoogleDocsUtils.getSelection() returns an empty array:

const linesData = GoogleDocsUtils.getSelection();
let selectionData = null;

for (const lineData of linesData) {
  if (lineData) {
    selectionData = lineData;
    break;
  }
}

if (selectionData) {
  console.log("linesData", selectionData.selectedText);
}

jakobsturm commented 1 year ago

I am trying to use a whitelisted ID but cannot see any difference in how the canvas is rendered. Could it be that Google implemented a fix that compares the extension ID with the ID in the code?

Also, how long does it take to get whitelisted?

eliChing commented 1 year ago

I don't think changing the Google Docs rendering mode will work now, because after checking the nodes with some grammar-check extensions like 'Grammarly' and 'LanguageTool', Google Docs is still rendering with canvas.

image

These two nodes include information about the text position and the selection position (position only) in Google Docs. You can get the entire text by joining the values of the 'aria-label' attribute. However, I have not yet found a way to get the user-selected text or to highlight specific text based on a string index.
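
A minimal sketch of the aria-label idea, in case it helps. The helper name collectAriaText is hypothetical, and the selector in the usage comment is an assumption about what the annotated nodes look like (the screenshot is not reproduced here), so adjust it to whatever the inspector actually shows.

```javascript
// Hypothetical helper: joins the aria-label values of a list of nodes.
// It only relies on getAttribute(), so it accepts real DOM elements
// (for example, a NodeList returned by querySelectorAll).
function collectAriaText(nodes, separator = "\n") {
  const parts = [];
  for (const node of nodes) {
    const label = node.getAttribute("aria-label");
    // Skip nodes that have no label or an empty one.
    if (label) parts.push(label);
  }
  return parts.join(separator);
}

// Usage in the page (the selector is an assumption, not a documented API):
// const text = collectAriaText(document.querySelectorAll("rect[aria-label]"));
```

This only recovers the text itself; it does not solve the selection or highlighting problem mentioned above.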

ElijZhang commented 1 year ago

@RobertJGabriel Hello, could you please tell me how long it takes to get whitelisted? I applied two days ago, but my extension ID is not in the whitelist yet.

komms commented 1 year ago

This function is self-contained:

function le(e, t, n, o=Object.getOwnPropertyNames(e)) {
        const r = new Set
          , i = [];
        let s = 0;
        const a = (o,l,c,u=0)=>{
            if (s++,
            "prototype" === o || l instanceof Window)
                return;
            if (u > n)
                return;
            const d = [...c, o];
            try {
                if (t(o, l))
                    return void i.push({
                        path: d,
                        value: l
                    })
            } catch (e) {}
            var g;
            if (null != l && !r.has(l))
                if (r.add(l),
                Array.isArray(l))
                    l.forEach(((e,t)=>{
                        try {
                            a(t.toString(), e, d, u + 1)
                        } catch (e) {}
                    }
                    ));
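                else if (l instanceof Object) {
                    // Note: J is not defined in this snippet (presumably a Set of property names to skip on DOM elements);
                    // the resulting ReferenceError is swallowed by the try/catch around each call to a().
                    ((g = l) && null !== g && 1 === g.nodeType && "string" == typeof g.nodeName ? Object.getOwnPropertyNames(e).filter((e=>!J.has(e))) : Object.getOwnPropertyNames(l)).forEach((e=>{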
                        try {
                            a(e, l[e], d, u + 1)
                        } catch (e) {}
                    }
                    ))
                }
        }
        ;
        return o.forEach((t=>{
            try {
                a(t, e[t], [])
            } catch (e) {}
        }
        )),
        {
            results: i,
            iterations: s
        }
    }

Calling it like this will return the text of the document:

le(window.KX_kixApp, ((e,t)=>t && "\x03" === t.toString().charAt(0)), 5)

Edit: and a de-obfuscated version https://gist.github.com/ken107/2b40c87fcdf27171a5a5fdc489639300

The link is not working. Would love to see the code in the link.
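
Until the gist is available again, here is a readable sketch of the same idea: a depth-limited search over an object's own properties that collects every value matching a predicate. This is not the code from the gist. The name findDeep is hypothetical, and the sketch simplifies the minified version (it does not special-case DOM elements, and it walks arrays like any other object).

```javascript
// Readable sketch of a depth-limited property search (not the gist's code).
function findDeep(root, predicate, maxDepth) {
  const seen = new Set(); // objects already visited, so cycles terminate
  const results = [];

  function visit(key, value, path, depth) {
    if (key === "prototype" || depth > maxDepth) return;
    // Window only exists in browsers; skip it to avoid walking all globals.
    if (typeof Window !== "undefined" && value instanceof Window) return;

    const here = [...path, key];
    try {
      if (predicate(key, value)) {
        results.push({ path: here, value });
        return;
      }
    } catch (err) {
      // A throwing predicate is treated as "no match".
    }

    const walkable =
      value !== null &&
      (typeof value === "object" || typeof value === "function");
    if (!walkable || seen.has(value)) return;
    seen.add(value);

    for (const name of Object.getOwnPropertyNames(value)) {
      try {
        visit(name, value[name], here, depth + 1);
      } catch (err) {
        // Some getters throw; ignore them and keep going.
      }
    }
  }

  for (const name of Object.getOwnPropertyNames(root)) {
    try {
      visit(name, root[name], [], 0);
    } catch (err) {
      // Ignore properties that cannot be read.
    }
  }
  return results;
}

// Usage mirroring the call above (window.KX_kixApp comes from the thread):
// findDeep(window.KX_kixApp, (k, v) => v && v.toString().charAt(0) === "\x03", 5);
```

Each result carries the property path as well as the value, which makes it easier to see where the text lives if the internal structure changes.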