HtmlUnit / htmlunit

HtmlUnit is a "GUI-Less browser for Java programs".
https://www.htmlunit.org
Apache License 2.0

Unable to run Html containing Canvas in headless mode using HtmlUnit? #140

Closed nites67 closed 3 years ago

nites67 commented 4 years ago

I am using HtmlUnit 2.35.0 to run HTML in headless mode. It works fine with HTML containing SVG. Now I am using a custom JavaScript framework called geotoolkit, which renders canvas images, and I am facing an issue when I try to run HTML containing a canvas in headless mode. Please find the code and error logs below. Can anyone please let me know how to fix the issue? I have also raised the issue on Stack Overflow. Here is the link:

https://stackoverflow.com/questions/60372112/how-to-use-htmlunit-to-run-html-containing-canvas-in-headless-mode

import java.io.File;
import java.nio.file.Paths;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

String path = Paths.get("Input/Editor").toAbsolutePath()+File.separator+"canvas.html";
WebClient webClient = new WebClient();
File file = new File(path);
HtmlPage page = webClient.getPage(file.toURI().toURL().toString());         
webClient.getOptions().setJavaScriptEnabled(true);
webClient.waitForBackgroundJavaScript(10000);                       
System.out.println(page.asXml());
webClient.close();

Feb 24, 2020 11:22:29 AM com.gargoylesoftware.htmlunit.javascript.host.canvas.CanvasRenderingContext2D createImageData
    INFO: CanvasRenderingContext2D.createImageData() not yet implemented
    Feb 24, 2020 11:22:29 AM com.gargoylesoftware.htmlunit.javascript.DefaultJavaScriptErrorListener scriptException
    SEVERE: Error during JavaScript execution
    ======= EXCEPTION START ========
    EcmaError: lineNumber=[1426] column=[0] lineSource=[null] name=[TypeError] sourceName=[file:/D:/Playground/HeadlessTest/Input/BHAEditor/geotoolkit/geotoolkit.adv.js] message=[TypeError: Cannot read property "width" from undefined (file:/D:/Playground/HeadlessTest/Input/BHAEditor/geotoolkit/geotoolkit.adv.js#1426)]
    com.gargoylesoftware.htmlunit.ScriptException: TypeError: Cannot read property "width" from undefined (file:/D:/Playground/HeadlessTest/Input/BHAEditor/geotoolkit/geotoolkit.adv.js#1426)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:885)
        at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:617)
        at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:534)
rbri commented 4 years ago

working on fixing this

rbri commented 4 years ago

Have done many updates for the ImageData handling. Will make a new snapshot build available soon (check https://twitter.com/HtmlUnit).

Please try your sample with the new snapshot and report your results.

nites67 commented 4 years ago

Thanks for the update. Once the build is released, I will check and update you :)

nites67 commented 4 years ago

Hi, can you please let me know if the snapshot is available from Maven Central?

rbri commented 4 years ago

Snapshots are available from the Sonatype snapshots repository.

You can add this to your pom:

<repositories>
  <repository>
    <id>sonatype-nexus-snapshots</id>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <snapshots><enabled>true</enabled></snapshots>
  </repository>
</repositories>
nites67 commented 4 years ago

Thanks for the update. Is 2.38.0-SNAPSHOT the version I should use?

rbri commented 4 years ago

Yes please

nites67 commented 4 years ago

Thanks for the update. I will check and let you know

nites67 commented 4 years ago

Hi, I took the latest updates you created. I did not get any errors when I ran the code, but I get the info messages below. Can you please advise?

com.gargoylesoftware.htmlunit.javascript.host.canvas.CanvasRenderingContext2D SetLineDash
INFO : CanvasRenderingContext2D.setLineDash() not yet implemented.
com.gargoylesoftware.htmlunit.javascript.host.canvas.rendering.AwtRenderingBackend setFillStyle
INFO : Cannot find color 'lightblue'
com.gargoylesoftware.htmlunit.javascript.host.canvas.rendering.AwtRenderingBackend  setStrokeStyle
INFO : Cannot find color 'grey'
rbri commented 4 years ago

Ok, at least the color stuff is fixable. Do you need the dash stuff also?
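
The "Cannot find color" messages above suggest that CSS color keywords such as 'lightblue' and 'grey' were not being resolved to an AWT color. A minimal sketch of such a keyword lookup follows. It is not HtmlUnit's actual implementation; the class name `CssNamedColors` and the method `find` are hypothetical, and the table holds only a handful of the CSS named colors.

```java
import java.awt.Color;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup table for illustration; NOT HtmlUnit's implementation. */
final class CssNamedColors {

    private static final Map<String, Color> NAMED = new HashMap<>();

    static {
        // A few entries from the CSS named-color list; the full list is much longer.
        NAMED.put("lightblue", new Color(0xAD, 0xD8, 0xE6));
        NAMED.put("gray", new Color(0x80, 0x80, 0x80));
        NAMED.put("grey", new Color(0x80, 0x80, 0x80)); // CSS accepts both spellings
        NAMED.put("black", Color.BLACK);
        NAMED.put("white", Color.WHITE);
    }

    /** Resolves a CSS color keyword case-insensitively; empty if the keyword is unknown. */
    static Optional<Color> find(String name) {
        if (name == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(NAMED.get(name.trim().toLowerCase(Locale.ROOT)));
    }
}
```

Returning an `Optional` lets the caller decide what to do with an unknown keyword (for example, log it and keep the previous style) instead of silently falling back to black.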

nites67 commented 4 years ago

Hi, yes, I would need it. In addition, it is also giving the message below:

com.gargoylesoftware.htmlunit.javascript.host.canvas.CanvasRenderingContext2D clip
INFO : CanvasRenderingContext2D.clip() not yet implemented.

I am generating the canvas in HTML and converting it to a base64 image using HtmlUnit. When I run the same HTML in a browser, I see a clear image because the base64 data is generated properly. But when I execute the HTML using HtmlUnit, the canvas is generated but the base64 data produced from it is not the same, which results in a corrupted image. I suspect this is because methods such as clip, setLineDash, fillStyle and the color handling are not yet implemented. Can you please advise?
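
When a base64 string from a canvas looks wrong, it can help to check first whether it is a well-formed PNG at all, as opposed to a valid image with the wrong pixels. The following is a minimal sketch using only the JDK, assuming the string has the usual `data:image/png;base64,` prefix that `toDataURL('image/png')` produces. The class `DataUrlCheck` and its methods are hypothetical helpers, not part of HtmlUnit.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

/** Hypothetical helper (not part of HtmlUnit) for sanity-checking a canvas data URL. */
final class DataUrlCheck {

    private static final String PREFIX = "data:image/png;base64,";

    /** Extracts and decodes the Base64 payload of a PNG data URL. */
    static byte[] decodePngDataUrl(String dataUrl) {
        if (dataUrl == null || !dataUrl.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a base64 PNG data URL");
        }
        return Base64.getDecoder().decode(dataUrl.substring(PREFIX.length()));
    }

    /** Returns true if the bytes start with the 8-byte PNG signature. */
    static boolean hasPngSignature(byte[] bytes) {
        int[] signature = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
        if (bytes.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((bytes[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }

    /** Decodes the data URL into an image; returns null if ImageIO cannot read the bytes. */
    static BufferedImage toImage(String dataUrl) throws IOException {
        return ImageIO.read(new ByteArrayInputStream(decodePngDataUrl(dataUrl)));
    }
}
```

If the signature check passes and `toImage` returns an image with the expected width and height, the encoding itself is intact and the difference lies in what was drawn on the canvas.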

rbri commented 4 years ago

Is there any chance for me to reproduce your problem here? Can you attach your code or maybe send it via private mail?

rbri commented 4 years ago

Ok, color name handling for fillStyle is fixed

rbri commented 4 years ago

CanvasRenderingContext2D.strokeStyle also fixed

rbri commented 4 years ago

The clip implementation is on the way; it is a bit more complicated. But having a way to test your code here would be more efficient.

rbri commented 4 years ago

Basic clip impl is also done. Will make a new snapshot build soon.

nites67 commented 4 years ago

Thank you so much for the support. Please let me know the version of the new snapshot. I will share the code within a day or two so that it will be easy for you to fix the things.

rbri commented 4 years ago

You are welcome. It would be great if you could share your code as soon as possible, because I would like to do a release this weekend.

nites67 commented 4 years ago

Hi, please find the POC canvas project at the link below, along with the decryption key. As I am not authorized to share the exact code due to security policies in my company, I have created a sample POC with a complex canvas. You will notice that the base64 string is not correct when it is executed.

https://mega.nz/#!1UV3UC6Q

Decryption key : y6sP89ETpC8KPdIs2m7S41cslGgCCv5ncY9EDDVENOU

rbri commented 4 years ago

Snapshot build is out - will try to find some time to test your stuff here. Thanks for the support.

nites67 commented 4 years ago

Thanks for the update. Can you please share the version number of the Snapshot?

rbri commented 4 years ago

2.38.0-SNAPSHOT

sonatype-nexus-snapshots https://oss.sonatype.org/content/repositories/snapshots
rbri commented 4 years ago

Will have a look at your stuff later - looks interesting

rbri commented 4 years ago

Ok, have done a first look. There is a bit more to do to get this working.

I will do this during the next few days, but mainly after the release, because some people are already waiting for the release. I hope this is OK for you.

nites67 commented 4 years ago

OK, thanks for the update. I will wait for the release.

rbri commented 4 years ago

What an endless story. Your sample now works with this code

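    // Imports needed at the top of the enclosing class file:
    // import com.gargoylesoftware.htmlunit.BrowserVersion;
    // import com.gargoylesoftware.htmlunit.WebClient;
    // import com.gargoylesoftware.htmlunit.html.HtmlPage;
    // import com.gargoylesoftware.htmlunit.javascript.host.Window;

    String filePath = "file:\\\\C:\\Users\\ronald\\Desktop\\htmlunit\\canvastest\\Input\\Canvas.html";

    try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_68)) {
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.getOptions().setCssEnabled(true);
        // Log script errors instead of aborting the page load.
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.waitForBackgroundJavaScript(10000);
        // Script that copies the canvas content into an <img id="canvasImage"> element.
        String js = "var canvas = document.getElementById('c');\r\n" +
                "       var image = new Image();\r\n" +
                "       image.id=\"canvasImage\";\r\n" +
                "       image.setAttribute('crossorigin', 'anonymous');\r\n" +
                "       image.src = canvas.toDataURL('image/png');\r\n" +
                "       document.body.appendChild(image);";
        HtmlPage page = webClient.getPage(filePath);
        // Give the page's background JavaScript up to 10 seconds to finish.
        webClient.waitForBackgroundJavaScript(10000);

        // HtmlUnit has no display, so the animation frames are triggered manually.
        Window window = page.getEnclosingWindow().getScriptableObject();
        int i = 0;
        do {
            window.animateAnimationsFrames();
            i++;
        } while (i < 29);
        page.executeJavaScript(js);

        // The src attribute of the new image holds the base64 data URL.
        System.out.println("result: " + page.getElementById("canvasImage").getAttribute("src"));
    }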

Please note: because HtmlUnit is headless, you are responsible for triggering the animationFrame event from your code. You can check the Javadoc for more details.

Will have a look at the LineDash stuff now.

nites67 commented 4 years ago

Great, thanks for the support. Is there any change in the build? Do I have to take the latest snapshot?

rbri commented 4 years ago

Give me a few minutes and I will deploy a new snapshot. You need the hsl color support to get something other than black :-)
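
For context on what hsl color support involves: a CSS `hsl(h, s%, l%)` value has to be converted to RGB before an AWT-based backend can paint with it. A minimal sketch of the standard HSL-to-RGB conversion follows. The class `HslColors` and the method `fromHsl` are hypothetical names, and this is an illustration of the conversion, not the code HtmlUnit uses.

```java
import java.awt.Color;

/** Hypothetical helper illustrating the standard HSL-to-RGB conversion; not HtmlUnit code. */
final class HslColors {

    /** Converts hue in degrees and saturation/lightness in percent to an opaque AWT color. */
    static Color fromHsl(double hueDegrees, double saturationPercent, double lightnessPercent) {
        // Normalize the hue into [0, 1), wrapping negative and large angles.
        double h = (((hueDegrees % 360.0) + 360.0) % 360.0) / 360.0;
        double s = clamp01(saturationPercent / 100.0);
        double l = clamp01(lightnessPercent / 100.0);

        double q = l < 0.5 ? l * (1.0 + s) : l + s - l * s;
        double p = 2.0 * l - q;

        return new Color(
                to8Bit(channel(p, q, h + 1.0 / 3.0)),
                to8Bit(channel(p, q, h)),
                to8Bit(channel(p, q, h - 1.0 / 3.0)));
    }

    /** Computes one RGB channel in [0, 1] from the intermediate values p and q. */
    private static double channel(double p, double q, double t) {
        if (t < 0) {
            t += 1;
        }
        if (t > 1) {
            t -= 1;
        }
        if (t < 1.0 / 6.0) {
            return p + (q - p) * 6.0 * t;
        }
        if (t < 0.5) {
            return q;
        }
        if (t < 2.0 / 3.0) {
            return p + (q - p) * (2.0 / 3.0 - t) * 6.0;
        }
        return p;
    }

    private static int to8Bit(double value) {
        return (int) Math.round(clamp01(value) * 255.0);
    }

    private static double clamp01(double value) {
        return Math.max(0.0, Math.min(1.0, value));
    }
}
```

With saturation 0 the formulas collapse to `p == q == l`, so grays need no special case.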

rbri commented 4 years ago

It seems the LineDash stuff requires some work. I will do this after the release (maybe next week).

rbri commented 4 years ago

Snapshot is available

nites67 commented 4 years ago

Thank you for the support. I will check and update you if everything is fine. I think I will need the LineDash method implementation.

rbri commented 4 years ago

But please check with the new release; I hope the color stuff will be a step forward. The line dash will definitely require some time.

nites67 commented 4 years ago

Ok. I will definitely check and update you by tomorrow morning. Thank you again for the support.

nites67 commented 4 years ago

Hi, I checked the latest version of HtmlUnit and it is not working for me. Please find my findings and the messages below. The base64 data generated from the canvas is not correct. When I render the base64 image in HTML using the latest snapshot, I get nothing except the letters MD, as shown in the screenshot. The screenshot is available here: https://mega.nz/#!pNs3kQZJ!6HhbxBB4Buvs4lqGwb2_FBpO_xqVwb83G4NY36lpqm0

com.gargoylesoftware.htmlunit.javascript.host.canvas.CanvasRenderingContext2D SetLineDash
INFO : CanvasRenderingContext2D.setLineDash() not yet implemented.
com.gargoylesoftware.htmlunit.javascript.host.canvas.rendering.AwtRenderingBackend extractColor
INFO : Cannot find color ['objectCanvasGradient']

Surprisingly, the version before the latest one, which I worked with last week, gave me better results. Please find below the screenshot of the image rendered from the base64 data with last week's version: https://mega.nz/#!pNs3kQZJ!6HhbxBB4Buvs4lqGwb2_FBpO_xqVwb83G4NY36lpqm0

Please find the screenshot of the actual canvas I am working on here: https://mega.nz/#!sYtRkaDS!dUSmfnVm8_z67CQanI7Ym9dUD6jJo8lsco48NngP0s0

rbri commented 4 years ago

The first and the second image are the same and the last one looks really like something else.

nites67 commented 4 years ago

Hi, yes, the first and second images are the same. With the latest version, I am getting only the text MD from the first/second image. I am expecting the base64 data to render like the last image. The version I used last week gave me some result. I hope this is clear now.
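
Comparing screenshots by eye, as above, can be made more precise by counting differing pixels between the browser rendering and the HtmlUnit rendering. A minimal sketch using only the JDK follows; the class `ImageDiff` and its method are hypothetical, and both images are assumed to be already loaded as `BufferedImage` (for example via `ImageIO.read`).

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper (not part of HtmlUnit) for comparing two rendered images. */
final class ImageDiff {

    /**
     * Counts the pixels whose ARGB values differ between the two images.
     * Returns -1 if the images do not have the same dimensions.
     */
    static long countDifferentPixels(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            return -1;
        }
        long different = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    different++;
                }
            }
        }
        return different;
    }
}
```

An exact comparison like this is strict: anti-aliasing and font differences between a browser and an AWT backend can change many pixels slightly, so a per-channel tolerance may be a more useful measure in practice.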

nites67 commented 4 years ago

Hi, can you please share your mail address? I managed to get a trial version of the actual code and made it run. I would like to share it with you so that it will be easier for you to tackle the issue.

rbri commented 4 years ago

Please have a look at the list of contributors on the HtmlUnit homepage.

nites67 commented 4 years ago

Hi, yes, I got it. I have shared the project link by email for you to download. I kindly request that you have a look at it. Thank you so much for the support; I highly appreciate it.

nites67 commented 4 years ago

Hi, I hope you are doing well. Did you get time to check the canvas?

rbri commented 4 years ago

I have done another round of updates, fixing wrong transformation handling. You can try the latest snapshot if you would like to see some progress. For further analysis I need the non-obfuscated version of all the JS (library) code in your sample.

Doing this analysis is very time consuming. Any chance you can help me with this?

nites67 commented 4 years ago

Hi, thanks for the update. I took the latest snapshot and tried to run the application. I was not able to view the image using the base64 data generated from the canvas. Can you please let me know if I have to make some changes in the code to view the image?

In addition, can you please elaborate on what you mean by the non-obfuscated version of all the JS (library) code?

rbri commented 4 years ago

I'm using this code to test

public static void main(String[] args) throws Exception {
    String filePath = "file:\\\\C:\\.....\\canvas.html";

    try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_68)) {
        webClient.getOptions().setJavaScriptEnabled(true);
        webClient.getOptions().setCssEnabled(true);
        webClient.getOptions().setThrowExceptionOnScriptError(false);

        HtmlPage page = webClient.getPage(filePath);
        webClient.waitForBackgroundJavaScript(20000);

        for (DomElement elem : page.getElementsByTagName("canvas")) {
            HTMLCanvasElement canvas = (HTMLCanvasElement) ((HtmlCanvas) elem).getScriptableObject();
            System.out.println("----");
            System.out.println(canvas.toDataURL("image/png"));
            System.out.println("----");
            break;
        }
    }
}

Additionally, I have enabled the $(document).ready(function(){ part in the canvas.html file.
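
The snippets in this thread pass the local HTML file to `getPage` as a URL string. A portable way to build such a string, rather than hardcoding a Windows-style `file:` path, is to let `java.nio.file.Path` produce the URI. A minimal sketch follows; the class `FileUrls` and the method `toFileUrl` are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for turning a relative path into a file: URL string. */
final class FileUrls {

    /** Resolves the path against the working directory and returns it as a file: URL. */
    static String toFileUrl(String first, String... more) {
        Path path = Paths.get(first, more).toAbsolutePath().normalize();
        return path.toUri().toString();
    }
}
```

For example, `FileUrls.toFileUrl("Input", "Editor", "canvas.html")` yields a `file:` URL for that file under the current working directory on any operating system.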

rbri commented 4 years ago

Regarding the obfuscated stuff: have a look at the geotoolkit JS files. The JavaScript code is on one line and all the variable names are generated. Usually this is done with a JS obfuscator. To dive deeper into the code I need the source, not the obfuscated version of it.

nites67 commented 4 years ago

Thanks for the update. I am able to see the letters ft now after I changed the code. Regarding the source code, I have asked the people concerned. Once I get an update, I will let you know. Thanks for the support.

nites67 commented 4 years ago

Hi, I checked with the concerned people and they informed me that they will not be able to share the source code unfortunately. Thank you again for the support.

rbri commented 3 years ago

Will close this because without the code there is no chance for further progress - sorry.

Thanks for giving HtmlUnit a try.