atlas-engineer / nyxt

Nyxt - the hacker's browser.
https://nyxt-browser.com/

What's the principle behind how Nyxt manipulates DOM elements? #1163

Closed cl-03 closed 3 years ago

cl-03 commented 3 years ago

Which packages does Nyxt use to operate on DOM elements, for actions such as MOUSE-CLICK that switch the window to a link's href? I want to automate Nyxt the way Selenium does, but without Selenium itself, which relies on Java or Python. I've been thinking about this for a long time. Would you mind teaching me? Can I find it in the *.asd file? My guess is that Nyxt locates the DOM element, finds the links around it, and then invokes DEXADOR or DRAKMA to request the URL. But how does Nyxt know that a DOM element has been activated and that the cursor is on it? The Common Lisp Cookbook covers everything from requesting a URL to selecting DOM elements, but not the next step of interacting with them.

aartaka commented 3 years ago

There are several ways to interact with DOM elements as of now:

cl-03 commented 3 years ago

One more question: how can I hide the browser while evaluating the JS that generates the complete HTML page of a dynamic site?

cl-03 commented 3 years ago

And how do I get the refreshed page after triggering a JS event?

aartaka commented 3 years ago

And how do I get the refreshed page after triggering a JS event?

Something like

(ps:@ document body |innerHTML|) ; Same as document.body.innerHTML in JS

Should work if you use it in Parenscript. That's the simplest way.

webkit_web_view_save() can be a reliable alternative here, but it's clunky to use.
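To make the Parenscript suggestion concrete, here is a minimal sketch, assuming Parenscript is available through Quicklisp. The variable name `*get-html-js*` is only for illustration. It shows that the form is compiled to a plain JavaScript string, which is what ultimately gets sent to the page:

```lisp
;; Assumes Quicklisp is installed; loads Parenscript on demand.
(ql:quickload "parenscript")

;; PS:PS compiles the Parenscript form into a JavaScript source string.
;; |innerHTML| is written with bars so the reader keeps its mixed case.
(defvar *get-html-js*
  (ps:ps (ps:@ document body |innerHTML|)))

;; Print the generated JavaScript to inspect it.
(format t "~a~%" *get-html-js*)
```

The string is only JavaScript source; it still has to be evaluated in a page by a browser engine to return the HTML.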

aartaka commented 3 years ago

One more question: how can I hide the browser while evaluating the JS that generates the complete HTML page of a dynamic site?

I'm not sure I've understood what you mean there. Can you rephrase, please?

English is not my native language, so thanks for your patience :)

cl-03 commented 3 years ago

English is not my native language either; thanks for your kindness. Have you read The Common Lisp Cookbook? All my needs started there, in its Web Scraping chapter:

(ql:quickload '("dexador" "plump" "lquery" "lparallel"))
(defvar *url* "https://portal.astronergy.com/")
;; Credentials redacted; substitute your own.
(defvar *request* (dex:get *url* :basic-auth '("username" . "password") :verbose t))
(defvar *parsed-content* (lquery:$ (initialize *request*)))
(defvar *css-selector* "#content li") ; the selector can be changed as needed
(lquery:$ *parsed-content* "#content li")

This breaks down when crawling dynamic websites rather than static ones. Even after parsing the dynamic page, we can't get the DOM elements that are generated by JS code:

CL-USER> (lquery:$ *parsed-content* "#content li")
#()

Some experts told me to find a way to evaluate the JS that generates the complete HTML page without a browser. In other words, get what Selenium does for dynamic pages, but in Common Lisp rather than Java or Python: execute the dynamic page's HTML, JS, and CSS to produce the final page that users see, with no browser. Then we could select elements by CSS selector after parsing, because the dynamic page's code has already been executed and all elements are final.
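The empty result can be reproduced without any network access. The following is a minimal sketch, assuming Plump and lQuery are available through Quicklisp; the markup in `*static-source*` is a made-up stand-in for a page whose content is filled in by a script, and both variable names are illustrative:

```lisp
;; Assumes Quicklisp is installed.
(ql:quickload '("plump" "lquery"))

;; A made-up page: an empty mount point plus a script that would fill it.
(defvar *static-source*
  "<html><body><div id=\"app\"></div><script src=\"/static/js/app.js\"></script></body></html>")

;; Parse the text. No JavaScript is executed at any point.
(defvar *doc* (lquery:$ (initialize *static-source*)))

;; The list items only exist after the script runs in a browser,
;; so selecting them from the static text yields an empty vector.
(lquery:$ *doc* "#app li")

;; What the parser can see is the script reference itself.
(lquery:$ *doc* "script" (attr :src))
```

The parser only sees the text that the server sent, so anything the script would add later is simply not there.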

cl-03 commented 3 years ago

For example, the page source is:

<!DOCTYPE html><html><head><meta charset=utf-8><meta name=viewport content="width=device-width,initial-scale=1"><meta http-equiv=X-UA-Compatible content="IE=edge,chrome=1"><link href=./src/assets/imgs/icon.ico rel=Astronergy><title>Astronergy</title><link href=/static/css/app.849e6a6de55373d90f3e11186746153e.css rel=stylesheet></head><body><div id=app></div><script>var theUA = window.navigator.userAgent.toLowerCase();
if ((theUA.match(/msie\s\d+/) && theUA.match(/msie\s\d+/)[0]) || (theUA.match(/trident\s?\d+/) && theUA.match(/trident\s?\d+/)[0])) {
  var ieVersion = theUA.match(/msie\s\d+/)[0].match(/\d+/)[0] || theUA.match(/trident\s?\d+/)[0];
  if (ieVersion < 10) {
    var str = "对不起!您的浏览器版本太低了!建议您升级当前浏览器版本!";
    var str2 = "IT8585服务热线:56038585";
    document.writeln("<pre style='text-align:center;color:#fff;background-color:#196ED0; height:100%;border:0;position:fixed;top:0;left:0;width:100%;z-index:1234;'>" +
      "<h2 style='padding-top:200px;margin:0;font-size: 25px;line-height: 2;'><strong>" + str + "<br/></strong></h2><h2  style='font-size: 20px;line-height: 2;'>" + str2);
    // document.writeln("<pre style='text-align:center;color:#fff;background-color:#196ED0; height:100%;border:0;position:fixed;top:0;left:0;width:100%;z-home:1234'>" +
    //   "<h2 style='padding-top:200px;margin:0'><strong>" + str + "<br/></strong></h2><h2>" +
    //   str2 + "</h2><h2 style='margin:0'><strong>如果你的使用的是双核浏览器,请切换到极速模式访问<br/></strong></h2></pre>");
    document.execCommand("Stop");
  }
}</script><script type=text/javascript src=/static/js/manifest.2ae2e69a05c33dfc65f8.js></script><script type=text/javascript src=/static/js/vendor.ef71fa13e5b25a02f9ac.js></script><script type=text/javascript src=/static/js/app.864bcc9bb650a558e9fd.js></script></body></html>

but the page the user sees is:

<html>
    <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width,initial-scale=1">
        <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
        <link href="./src/assets/imgs/icon.ico" rel="Astronergy">
        <title>
            Astronergy
        </title>
        <link href="/static/css/app.849e6a6de55373d90f3e11186746153e.css" rel="stylesheet" type="text/css">
    </head>
    <body>
        <div id="app">
            <div id="oaHome">
                <div id="oaHeader">
                    <div class="topHeaderDiv">
                        <img src=
                        "data:iK5CYII="
                        alt="">
                        <div>
                            <p class="userP">
                                <span>astronergy\chuangxiu.chen</span>
                            </p>
                            <p>
                                <span>2021-02-20</span> <span>星期六</span>
                            </p>
                            <div class="langDiv">
                                <img src=
                                "data:imageCYII="
                                alt=""> <img src=
                                "data:image/ggg=="
                                alt="">
                            </div>
                        </div>
                    </div>
                    <div class="headerImgDiv">
                        <img src="/static/img/bigImg33.5201adb.jpg" alt="">
                    </div>
                    <div class="headerMenus">
                        <div>
                            <ul>
                                <li class="activeLI">首页
                                </li>
                                <li class="">工作
                                </li>
                                <li class="">应用系统
                                </li>
                            </ul>
                        </div>
                    </div>
                </div>
                <div class="oaHomeSection">
                    <div class="leftNav">
                        <div class="leftNotice">
                            <div class="userModule">
                                <div class="userInfo" style="position: relative;">
                                    <div role="tooltip" id="el-popover-3424" aria-hidden="true" class="el-popover el-popper" tabindex="0" style="display: none;">
                                        <span><!----></span>
                                        <div class="hoverShow">
                                            <p>
                                                <span><span class="">姓名:</span> <span class="">陈创修</span></span>
                                            </p>
                                            <p>
                                                <span class="">域账号:</span> <span title="astronergy\chuangxiu.chen" class="">astronergy\chuangxiu.chen</span>
                                            </p>
                                            <p>
                                                <span class="">工号:</span> <span class="" title="180807019">180807019</span>
                                            </p>
                                            <p>
                                                <span class="">职位:</span> <span class="" title="研发工程师">研发工程师</span>
                                            </p>
                                            <p>
                                                <span class="">邮箱:</span> <span class="" title="chuangxiu.chen@astronergy.com">chuangxiu.chen@astronergy.com</span>
                                            </p>
                                        </div>
                                    </div><button type="button" class="el-button el-button--default el-popover__reference" aria-describedby="el-popover-3424" tabindex="0"><!----><!----><span><img src=
                                    "data:image/pmK3DhTgA+tAH/9k="
                                    alt=""> <!----></span></button>
                                </div>
                                <div style="text-align: center; font-size: 16px; padding-top: 9px;">
                                    <span>陈创修</span>
                                </div>
                                <div class="userWork" style="display: none;">
                                    <div>
                                        <img src="/static/img/apply.9776816.jpg" alt="">
                                        <p style="cursor: pointer;">
                                            <span>我的审批</span> <span>(0)</span>
                                        </p>
                                    </div>
                                    <div>
                                        <img src="/static/img/work.4211e68.jpg" alt="">
                                        <p style="cursor: pointer;">
                                            <span>我的申请</span> <span>(2)</span>
                                        </p>
                                    </div>
                                </div>
                                <div class="userWork testGH">
                                    <div style="cursor: pointer;">
                                        <img src="/static/img/apply.9776816.jpg" alt="">
                                        <p style="cursor: pointer;">
                                            <span>我的审批</span> <span>(0)</span>
                                        </p>
                                    </div>
                                    <div style="cursor: pointer;">
                                        <img src="/static/img/work.4211e68.jpg" alt="">
                                        <p style="cursor: pointer;">
                                            <span>我的申请</span> <span>(2)</span>
                                        </p>
                                    </div>
                                    <div style="cursor: pointer;">
                                        <img src=
                                        "data:image/png;ErkJggg=="
                                        alt="">
                                        <p style="cursor: pointer;">
                                            <span>工会信箱</span>
                                        </p>
                                    </div>
                                </div>
                                <div class="WXcode">
                                    <div class="qywxApp">
                                        <img src="data:im=" alt="">
                                        <p>
                                            <span>扫码下载</span> <span>企业微信</span>
                                        </p>
                                    </div>
                                    <div class="chintFWH">
                                        <img src="data:iQmCC" alt="" class="bigGW"> <span>扫码关注</span> <span>正泰新能源官微</span>
                                    </div>
                                    <p class="downQY">
                                        企业微信安装说明
                                    </p>
                                </div>
                            </div>
                            <div class="companyNotice">
                                <div class="noticeHeader">
                                    <span class="spanFirst">公司公告</span> <span class="spanSecond"><img src=
                                    "data:image/gg=="
                                    alt=""></span>
                                </div>
                                <ul class="noticeSection">
                                    <li>
                                        <img src=
                                        "data:iSuQmCC"
                                        alt="">
                                        <p title="正新司发[2021]14号:关于开展2021年正泰新能源方针目标宣贯的通知" class="noticeTitle">
                                            正新司发[2021]14号:关于开展2021年正泰新能源方针目标宣贯的通知
                                        </p>
                                        <p class="noticeDate">
                                            2021-02-05
                                        </p>
                                    </li>
                                    <li>
                                        <img src=
                                        "data:i
                                alt=""></span>
                            </div>
                            <ul class="noticeSection">
                                <li>
                                    <p title="【招聘】内部竞聘通知—售电配网事业部、户用光伏事业部" class="noticeMain">
                                        【招聘】内部竞聘通知—售电配网事业部、户用光伏事业部
                                    </p>
                                    <p class="noticeDate">
                                        2021-01-13
                                    </p>
                                </li>
                                <li>
                                    <p title="【招聘】内部竞聘通知—运营管控总监" class="noticeMain">
                                        【招聘】内部竞聘通知—运营管控总监
                                    </p>
                                    <p class="noticeDate">
                                        2021-01-06
                                    </p>
                                </li>
                                <li>
                                    <p title="【招聘】内部竞聘通知—项目协调" class="noticeMain">
                                        【招聘】内部竞聘通知—项目协调
                                    </p>
                                    <p class="noticeDate">
                                        2020-12-21
                                    </p>
                                </li>
                                <li>
                                    <p title="【招聘】内部竞聘通知—东、中部电站项目开发" class="noticeMain">
                                        【招聘】内部竞聘通知—东、中部电站项目开发
                                    </p>
                                    <p class="noticeDate">
                                        2020-11-24
                                    </p>
                                </li>
                                <li>
                                    <p title="【人事】2020年高新区(滨江)第三期初级职称申报通知" class="noticeMain">
                                        【人事】2020年高新区(滨江)第三期初级职称申报通知
                                    </p>
                                    <p class="noticeDate">
                                        2020-11-24
                                    </p>
                                </li>
                                <li>
                                    <p title="【招聘】大丰项目工作意向调查" class="noticeMain">
                                        【招聘】大丰项目工作意向调查
                                    </p>
                                    <p class="noticeDate">
                                        2020-11-10
                                    </p>
                                </li>
                                <li>
                                    <p title="【招聘】内部竞聘通知—综合能源服务首批项目公司负责人" class="noticeMain">
                                        【招聘】内部竞聘通知—综合能源服务首批项目公司负责人
                                    </p>
                                    <p class="noticeDate">
                                        2020-10-29
                                    </p>
                                </li>
                                <li>
                                    <p title="【招聘】内部竞聘通知—电站运维事业部" class="noticeMain">
                                        【招聘】内部竞聘通知—电站运维事业部
                                    </p>
                                    <p class="noticeDate">
                                        2020-10-16
                                    </p>
                                </li>
                            </ul>
                        </div>
                        <div class="shortcutModule">
                            <ul class="addressModule">
                                <li>
                                    <span>通讯录</span> <img src=
                                    "data:imWE8USlAAGJqhwMJjYF/geDUdUy0Ao24gAAAABJRU5ErkJggg=="
                                    alt="">
                                </li>
                                <li>【常用电话】
                                </li>
                                <li>
                                    <button type="button" class="el-button searchAddress el-button--default"><!----><!----><span>查询</span></button>
                                </li>
                            </ul>
                            <ul class="hotelModule">
                                <li>
                                    <span>酒店查询</span>
                                </li>
                                <li>
                                    <p>
                                        【杭州】
                                    </p>
                                    <p>
                                        【上海】
                                    </p>
                                    <p>
                                        【温州】
                                    </p>
                                    <p>
                                        【乐清】
                                    </p>
                                </li>
                                <li>
                                    <div class="el-input el-input-group el-input-group--append">
                                        <!----><input type="text" autocomplete="off" placeholder="请输入城市名称" class="el-input__inner"><!----><!---->
                                        <div class="el-input-group__append">
                                            <button type="button" class="el-button el-button--default" style="cursor: pointer;"><!----><!----><span>查询</span></button>
                                        </div><!---->
                                    </div>
                                </li>
                            </ul>
                            <ul class="friendShipModule">
                                <li>
                                    <span>友情链接</span>
                                </li>
                                <li>
                                    <div class="el-select">
                                        <!---->
                                        <div class="el-input el-input--suffix">
                                            <!----><input type="text" readonly="readonly" autocomplete="off" placeholder="请选择公司" class="el-input__inner"><!----><span class="el-input__suffix"><span class="el-input__suffix-inner"><!----><!----><!----><!----><!----></span><!----></span><!----><!---->
                                        </div>
                                        <div class="el-select-dropdown el-popper" style="display: none; min-width: 257.5px;">
                                            <div class="el-scrollbar" style="">
                                                <div class="el-select-dropdown__wrap el-scrollbar__wrap" style="margin-bottom: -17px; margin-right: -17px;">
                                                    <ul class="el-scrollbar__view el-select-dropdown__list">
                                                        <!---->
                                                        <li class="el-select-dropdown__item">
                                                            <span>正泰集团</span>
                                                        </li>
                                                        <li class="el-select-dropdown__item">
                                                            <span>正泰新能源</span>
                                                        </li>
                                                        <li class="el-select-dropdown__item">
                                                            <span>正泰电气</span>
                                                        </li>
                                                        <li class="el-select-dropdown__item">
                                                            <span>正泰电器</span>
                                                        </li>
                                                        <li class="el-select-dropdown__item">
                                                            <span>正泰电源</span>
                                                        </li>
                                                        <li class="el-select-dropdown__item">
                                                            <span>上海诺雅克</span>
                                                        </li>
                                                        <li class="el-select-dropdown__item">
                                                            <span>正泰中自</span>
                                                        </li>
                                                    </ul>
                                                </div>
                                                <div class="el-scrollbar__bar is-horizontal">
                                                    <div class="el-scrollbar__thumb" style="transform: translateX(0%);"></div>
                                                </div>
                                                <div class="el-scrollbar__bar is-vertical">
                                                    <div class="el-scrollbar__thumb" style="transform: translateY(0%);"></div>
                                                </div>
                                            </div><!---->
                                        </div>
                                    </div>
                                </li>
                            </ul>
                        </div>
                    </div>
                </div>
                <div data-v-66017908="" id="footer">
                    <p data-v-66017908="">
                        © 2020 Astronergy. All rights reserved.
                    </p>
                    <p data-v-66017908="">
                        <img data-v-66017908="" src="/static/img/beiAn.d0289dc.png" alt=""> <a data-v-66017908="" href="http://beian.miit.gov.cn/" target="_blank">浙ICP备17046924号</a>
                    </p>
                </div>
            </div>
        </div><script type="text/javascript">
var theUA = window.navigator.userAgent.toLowerCase();
        if ((theUA.match(/msie\s\d+/) && theUA.match(/msie\s\d+/)[0]) || (theUA.match(/trident\s?\d+/) && theUA.match(/trident\s?\d+/)[0])) {
        var ieVersion = theUA.match(/msie\s\d+/)[0].match(/\d+/)[0] || theUA.match(/trident\s?\d+/)[0];
        if (ieVersion < 10) {
          var str = "对不起!您的浏览器版本太低了!建议您升级当前浏览器版本!";
          var str2 = "IT8585服务热线:56038585";
          document.writeln("<pre style='text-align:center;color:#fff;background-color:#196ED0; height:100%;border:0;position:fixed;top:0;left:0;width:100%;z-index:1234;'>" +
            "<h2 style='padding-top:200px;margin:0;font-size: 25px;line-height: 2;'><strong>" + str + "<br/><\/strong><\/h2><h2  style='font-size: 20px;line-height: 2;'>" + str2);
          // document.writeln("<pre style='text-align:center;color:#fff;background-color:#196ED0; height:100%;border:0;position:fixed;top:0;left:0;width:100%;z-home:1234'>" +
          //   "<h2 style='padding-top:200px;margin:0'><strong>" + str + "<br/><\/strong><\/h2><h2>" +
          //   str2 + "<\/h2><h2 style='margin:0'><strong>如果你的使用的是双核浏览器,请切换到极速模式访问<br/><\/strong><\/h2><\/pre>");
          document.execCommand("Stop");
        }
        }
        </script><script type="text/javascript" src="/static/js/manifest.2ae2e69a05c33dfc65f8.js">
</script><script type="text/javascript" src="/static/js/vendor.ef71fa13e5b25a02f9ac.js">
</script><script type="text/javascript" src="/static/js/app.864bcc9bb650a558e9fd.js">
</script>
    </body>
</html>

cl-03 commented 3 years ago

Sorry, I don't know how to paste web page code into GitHub issues, so it looks confusing.

cl-03 commented 3 years ago

That is, we do the browser's work for the dynamic page manually, rendering it into the page that users see.

cl-03 commented 3 years ago

I guess it's similar to how Nyxt handles dynamic page code: just the operations before the GUI is invoked to draw the page.

aartaka commented 3 years ago

Ah, I found it! You can get the current HTML of the page opened in Nyxt with (nyxt::document-get-body). Then you can parse it with any parser you like :)

If the page source is long, play with different values of limit in document-get-body, e.g., (document-get-body :limit 1000000 #|or whatever value you find sufficient to parse the whole page|#)
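As a sketch of the "parse it with any parser" step, assuming Plump and CLSS are loadable in the same Lisp image: `item-texts` is a hypothetical helper name, and its default selector is the one from the Cookbook example earlier in this thread. It takes an HTML string, so it can be tried outside Nyxt as well:

```lisp
;; Assumes Quicklisp is installed.
(ql:quickload '("plump" "clss"))

(defun item-texts (html &optional (selector "#content li"))
  "Return a vector with the text of every node in HTML matching SELECTOR."
  (map 'vector #'plump:text
       (clss:select selector (plump:parse html))))

;; Inside a running Nyxt, the string would come from the rendered page:
;; (item-texts (nyxt::document-get-body :limit 1000000))
```

Because the string comes from the page as rendered in Nyxt, elements created by JS are present in it, unlike in the raw response fetched with Dexador.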

cl-03 commented 3 years ago

Without a browser such as Nyxt, can we do something like that too? Something like PhantomJS used from Python, or SpiderMonkey, or CL-JavaScript, that interprets and executes JS code without a browser?

cl-03 commented 3 years ago

Where can I find the definition of the function (nyxt::document-get-body), other than in bookmark.lisp? Or, which package does the source of (document-get-body) belong to?

aartaka commented 3 years ago

Where can I find the definition of the function (nyxt::document-get-body), other than in bookmark.lisp? Or, which package does the source of (document-get-body) belong to?

The definition of document-get-body is in renderer-script.lisp, at line 29. It looks like:

(define-parenscript document-get-body (&key (limit 100000))
  (ps:chain document body |innerHTML| (slice 0 (ps:lisp limit))))

aartaka commented 3 years ago

Without a browser such as Nyxt, can we do something like that too? Something like PhantomJS used from Python, or SpiderMonkey, or CL-JavaScript, that interprets and executes JS code without a browser?

Nyxt cannot run in headless mode yet, so you still need to open a Nyxt window to run scripts in it.

This is a neat feature to have, though. Let me open an issue about configuring Nyxt as headless.

aartaka commented 3 years ago

Opened #1168 :)

cl-03 commented 3 years ago

I have benefited a lot from your patient guidance. Thank you very much for your precious help.

cl-03 commented 3 years ago

Does Nyxt currently have a component in its source code whose input is the source of a dynamic page and whose output is the page after its JS has executed? For example, input:

<html>
<head></head>
<script type=text/javascript src=/static/js/manifest.2ae2e69a05c33dfc65f8.js></script>
</html>

output:

<html>
<head></head>
<li>
    <p>
        【杭州】
    </p>
    <p>
        【上海】
    </p>
    <p>
        【温州】
    </p>
    <p>
        【乐清】
    </p>
</li>
<script type=text/javascript src=/static/js/manifest.2ae2e69a05c33dfc65f8.js></script>
</html>

In my opinion, since Nyxt is a well-functioning browser, it must have a core that interprets the source of a dynamic page and executes it up to the point where rendering starts. If I can find it, maybe my question is solved.

cl-03 commented 3 years ago

I think I've got it: maybe it's just the function (with-html-output-to-string).

aartaka commented 3 years ago

I'm not sure that with-html-output-to-string is going to evaluate the script. In any case, the Nyxt-native alternative is:

(html-set "your HTML here")
(document-get-body)

cl-03 commented 3 years ago

It's (ffi-buffer-evaluate-javascript-async) that executes JS, by way of webkit2:webkit-web-view-evaluate-javascript from the :cl-webkit2 package:

(defun webkit-web-view-evaluate-javascript (web-view javascript &optional call-back error-call-back)
  "Evaluate JAVASCRIPT in WEB-VIEW calling CALL-BACK upon completion."
  (incf callback-counter)
  (push (make-callback :id callback-counter :web-view web-view
                       :function call-back
                       :error-function error-call-back)
        callbacks)
  (webkit-web-view-run-javascript
   web-view javascript
   (cffi:null-pointer)
   (cffi:callback javascript-evaluation-complete)
   (cffi:make-pointer callback-counter)))
(defcfun "webkit_web_view_run_javascript" :void
  (web-view (g-object webkit-web-view))
  (script :string)
  (cancellable :pointer)
  (callback g-async-ready-callback)
  (user-data :pointer))

The core is the JS virtual machine.

aartaka commented 3 years ago

webkit-web-view-evaluate-javascript is indeed used a lot in Nyxt via ffi-buffer-evaluate-javascript(-async) :)

cl-03 commented 3 years ago

Can the lquery and Plump packages do the same as webkit-web-view-evaluate-javascript, that is, execute JS code to render a dynamic page?

aartaka commented 3 years ago

Can the lquery and Plump packages do the same as webkit-web-view-evaluate-javascript, that is, execute JS code to render a dynamic page?

No. Plump and LQuery work with the text of the page and cannot evaluate/change it, while Nyxt works with the page itself and can alter it using arbitrary JS.

cl-03 commented 3 years ago

Is there an open-source package written in Common Lisp that works like WebKit, so that I can just load the package rather than install WebKit and then load cl-webkit2? In other words, pure Common Lisp.

jmercouris commented 3 years ago

No, there is nothing like that available. Closure is the closest thing, but it is insufficient for the modern web :-(

cl-03 commented 3 years ago

Maybe I should define my own macro that expands the dynamic page into the HTML page the user finally sees, without syntax checks and so on: just an executor.