kaliiiiiiiiii / Selenium-Driverless

undetected Selenium without usage of chromedriver
https://kaliiiiiiiiii.github.io/Selenium-Driverless/

TypeError: 'JSUnserializable' object is not subscriptable #84

Closed samyeid closed 6 months ago

samyeid commented 12 months ago
from selenium_driverless import webdriver
from selenium_driverless.types.by import By
import asyncio
async def main():
    options = webdriver.ChromeOptions()
    async with webdriver.Chrome(options=options) as driver:
        await driver.get('https://www.google.com/', wait_load=False)
        element =  await driver.find_element(By.NAME,'q')
        element.send_keys('domain.com')
        input()
asyncio.run(main())

Traceback (most recent call last):
  File "C:\Users\eid\Pys\python\SB.py", line 11, in <module>
    asyncio.run(main())
  File "C:\Users\eid\AppData\Local\Programs\Python\Python39\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Users\eid\AppData\Local\Programs\Python\Python39\lib\asyncio\base_events.py", line 642, in run_until_complete
    return future.result()
  File "C:\Users\eid\Pys\python\SB.py", line 8, in main
    element =  await driver.find_element(By.NAME,'q')
  File "C:\Users\eid\Pys\python\venv\lib\site-packages\selenium_driverless\webdriver.py", line 662, in find_element
    return await target.find_element(by=by, value=value, parent=parent, timeout=timeout)
  File "C:\Users\eid\Pys\python\venv\lib\site-packages\selenium_driverless\types\target.py", line 523, in find_element
    return await parent.find_element(by=by, value=value, timeout=timeout)
  File "C:\Users\eid\Pys\python\venv\lib\site-packages\selenium_driverless\types\webelement.py", line 251, in find_element
    return elems[idx]
TypeError: 'JSUnserializable' object is not subscriptable

Can I know how to solve this bug?

kaliiiiiiiiii commented 12 months ago

@samyeid Thanks for reporting this bug. Can you provide what the following script returns?

from selenium_driverless import webdriver
from selenium_driverless.types.by import By
import asyncio
async def main():
    options = webdriver.ChromeOptions()
    async with webdriver.Chrome(options=options) as driver:
        await driver.get('https://www.google.com/', wait_load=False)
        elements =  await driver.find_elements(By.NAME,'q')
        print(elements)
asyncio.run(main())
kaliiiiiiiiii commented 12 months ago

This issue has already been referenced at https://github.com/kaliiiiiiiiii/Selenium-Driverless/issues/59#issuecomment-1747240038, but I couldn't reproduce it:(

kaliiiiiiiiii commented 12 months ago

@samyeid UPDATE: The following script (fixed yours a little):

from selenium_driverless import webdriver
from selenium_driverless.types.by import By
import asyncio

async def main():
    options = webdriver.ChromeOptions()
    async with webdriver.Chrome(options=options) as driver:
        await driver.get('https://www.google.com/', wait_load=False)
        element = await driver.find_element(By.NAME, 'q', timeout=10)
        await element.write("domain.com")
        input("press ENTER to quit")
asyncio.run(main())

runs just fine for me. I need some feedback regarding https://github.com/kaliiiiiiiiii/Selenium-Driverless/issues/84#issuecomment-1751128616 . Also, maybe try using a fresh venv and running pip install selenium-driverless beforehand.

samyeid commented 12 months ago

I tried that and the same error appeared. When I comment out the following lines:

element = await driver.find_element(By.NAME, 'q', timeout=10)

await element.write("domain.com")

it works without any errors, but the error appears when I try to find elements and write or click. I think there was a bug in the package when dealing with elements.

kaliiiiiiiiii commented 12 months ago

it works without any errors but error appeared when i try to find elements and write and click i think there was bug on package to deal with elements

@samyeid By "was", do you mean it's resolved now? If that is the case, please close this issue.

samyeid commented 12 months ago

it works without any errors but error appeared when i try to find elements and write and click i think there was bug on package to deal with elements

@samyeid with was, do you mean it's resolved now? If that is the case, please close this issue.

No, the problem is not resolved. To clarify: the problem occurs when I try to find an element and click on it. How can I fix that error, please?

samyeid commented 12 months ago

When I run the find_elements script you asked me to try, it prints this:

JSUnserializable(type="IdOnly",description="None", sub_type="None", class_name="None", value=None, obj_id="7659211699302666109.2.2", context_id=2)

kaliiiiiiiiii commented 12 months ago

@samyeid

JSUnserializable(type="IdOnly",description="None", sub_type="None", class_name="None", value=None, obj_id="7659211699302666109.2.2", context_id=2)

huh that's weird. https://github.com/kaliiiiiiiiii/Selenium-Driverless/blob/2af42506f3d57a78be975c2023785cea4737f7f2/src/selenium_driverless/types/webelement.py#L297-L305

  1. What does the following return then (maybe it's a timing issue)?
    from selenium_driverless import webdriver
    from selenium_driverless.types.by import By
    import asyncio
    async def main():
        options = webdriver.ChromeOptions()
        async with webdriver.Chrome(options=options) as driver:
            await driver.get('https://www.google.com/', wait_load=False)
            elements = await driver.find_elements(By.NAME, 'q')
            await asyncio.sleep(1)
            elements = await driver.find_elements(By.NAME, 'q')
            print(elements)
    asyncio.run(main())
  2. What platform are you running on?
  3. What version of Chrome do you run on (check chrome://version)?
samyeid commented 12 months ago
from selenium_driverless import webdriver
from selenium_driverless.types.by import By
import asyncio
async def main():
    options = webdriver.ChromeOptions()
    async with webdriver.Chrome(options=options) as driver:
        await driver.get('https://www.google.com/', wait_load=False)
        elements =  await driver.find_elements(By.NAME,'q')
        await asyncio.sleep(1)
        elements =  await driver.find_elements(By.NAME,'q')
        print(elements)
asyncio.run(main())

1- JSUnserializable(type="IdOnly",description="None", sub_type="None", class_name="None", value=None, obj_id="401219336495417372.2.3", context_id=2)

2- OS: Windows 8.1

3- chrome://version: 109.0.5414.120 (Official Build) (32-bit)

kaliiiiiiiiii commented 12 months ago

Thanks a lot! Pretty convinced it's an issue with either:

At that point, I can't really help you either:(

kaliiiiiiiiii commented 11 months ago

let's close it for now, as I can't reproduce it.

Nickelsink commented 11 months ago

Hello! Faced the problem while trying to retrieve most basic navigator details, like:

result = {
        ...
        "clipboard": driver.execute_script("return navigator.clipboard;"),
        "connection": driver.execute_script("return navigator.connection;"),
        ...
}

The output was:

'clipboard': JSUnserializable(type="platformobject",description="Clipboard", sub_type="None", class_name="Clipboard", value=None, obj_id=-4888677053857946457.1.5, context_id=1), 
'connection': JSUnserializable(type="platformobject",description="NetworkInformation", sub_type="None", class_name="NetworkInformation", value=None, obj_id=-4888677053857946457.1.7, context_id=1), 

Got it running on WSL2 Windows 10 as well as in Docker container (Debian-bullseye).

Hope this might help searching for the problem origins.

kaliiiiiiiiii commented 11 months ago

@Nickelsink Yep, that's expected tho. Basically, Chrome doesn't send any more information about that object back to driverless. What you can do instead is send the data you need back as a JSON object.

// .. your code
return {data1: 1, key2: "test"}

hope you get the idea
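
A minimal sketch of that idea, assuming a small helper (build_snapshot_script is a hypothetical name, not part of selenium_driverless) that generates a script copying a few primitive fields of a platform object into a plain object. Only the script text is built here, so the snippet runs without a browser; the field names are standard NetworkInformation properties.

```python
import json
from typing import List


def build_snapshot_script(expr: str, fields: List[str]) -> str:
    """Return JS that copies the named fields of `expr` into a plain object.

    Hypothetical helper for illustration; not part of selenium_driverless.
    """
    # json.dumps quotes each field name safely for use inside the JS source.
    picks = ", ".join(f"{json.dumps(f)}: src[{json.dumps(f)}]" for f in fields)
    # A plain object of primitives can be serialized, unlike the platform object itself.
    return f"const src = {expr}; return src ? {{{picks}}} : null;"


script = build_snapshot_script("navigator.connection", ["effectiveType", "downlink", "rtt"])
print(script)

# Inside an async session like the scripts in this thread, the string would be used as:
#     info = await driver.execute_script(script)
```

The returned value should then arrive as an ordinary dict of strings and numbers, because only primitives cross the boundary.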

kaliiiiiiiiii commented 11 months ago

@Nickelsink @samyeid Supported types are: undefined, null, string, number, boolean, bigint, regexp, date, symbol, array, object, function, map, set, weakmap, weakset, error, proxy, promise, typedarray, arraybuffer, node, window, generator. btw see https://vanilla.aslushnikov.com/?Runtime.DeepSerializedValue

For some types, the value isn't accessible tho

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii I'm still experiencing this exact same issue on Ubuntu 22.04.3 LTS using Google Chrome Stable 110.0.5481.96 (Official Build) (64-bit). It happens just by executing your example for nowsecure.nl mentioned in the README.md under Usage > with asyncio.

Sometimes it works though when I tweak a little with the timeouts and sleep, but only once in a while. Repeating using the same timeouts after a successful execution, results in an error again (TypeError: 'JSUnserializable' object is not subscriptable). It looks like some kind of race condition... but I have no clue.

kaliiiiiiiiii commented 8 months ago

@kaliiiiiiiiii I'm still experiencing this exact same issue on Ubuntu 22.04.3 LTS using Google Chrome Stable 110.0.5481.96 (Official Build) (64-bit). Only by executing your example for nowsecure.nl mentioned in the README.md under Usage > with asyncio.

@kevinvoswebdevelop Have you tested if this occurs on Chrome~=120 as well?

It looks like some kind of race condition... but I have no clue.

Yeah smells like one. Tho, wouldn't know any reason why that would be the case. Still assume it's a bug in Chromium

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii well the problem is there is still an ongoing issue with integrated GPU and chrome versions higher than I'm using right now (https://ubuntuforums.org/showthread.php?t=2494588, https://github.com/brave/brave-browser/issues/33619, https://github.com/electron/electron/issues/32317). I only have Iris XE, no other GPU, so unfortunately I can't use newer chrome versions until that bug is fixed...

Can you maybe point me in the right direction, so I can try to fix this problem on my own?

kaliiiiiiiiii commented 8 months ago

@kevinvoswebdevelop Actually, maybe https://github.com/kaliiiiiiiiii/Selenium-Driverless/commit/ceab876a81a6af84c0b827aff9df55f2cdfbd4c3#diff-eb355e1147cb9c933fac6232a336122a60ef8bafb5a2661496aeb58ba142f741L260-R262 fixes this.

Could you check whether this is fixed now? pip install https://github.com/kaliiiiiiiiii/Selenium-Driverless/archive/refs/heads/dev.zip

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii It doesn't solve my issue but i still think it's a good fix because now it waits for the element rather than throwing this type error. Could it be that your driver loads the elements of the cloudflare bot check page first and does not refresh the elements list when redirected to nowsecure.nl?

kaliiiiiiiiii commented 8 months ago

@kevinvoswebdevelop

when i run this code it print that for me

JSUnserializable(type="IdOnly",description="None", sub_type="None", class_name="None", value=None, obj_id="7659211699302666109.2.2", context_id=2)

That's what Chrome returns instead of an XPathResult. To be more specific: it might in fact be an XPathResult; however, as it's IdOnly, it doesn't include the information that it is in fact an XPathResult and therefore can't be parsed.

The weird part here is, that https://github.com/kaliiiiiiiiii/Selenium-Driverless/blob/2af42506f3d57a78be975c2023785cea4737f7f2/src/selenium_driverless/types/webelement.py#L297-L305

uses "deep" and not "IdOnly"

Could it be that your driver loads the elements of the cloudflare bot check page first and does not refresh the elements list when redirected to nowsecure.nl?

Don't think that's the issue.

It doesn't solve my issue but i still think it's a good fix because now it waits for the element rather than throwing this type error.

Yeah I mean more: Does it work tho? As a workaround.

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii Ah alright, that clarifies it. Well, the NoElementException still gets thrown for the nowsecure.nl example code, but only after it has awaited the timeout, instead of directly throwing the TypeError. So it's still a good fix, but I can't get the XPathResult. The same goes for By.CSS as well.

I will try to debug this and tweak your code a bit to see if I can get it to work. It worked sporadically a few times before, so there must be something here.

kaliiiiiiiiii commented 8 months ago

@kevinvoswebdevelop

Well the NoElementException still gets thrown for the nowsecure.nl example code, but only after it has awaited the timeout, instead of directly throwing the typeerror. So still a good fix, but can't get the xpathresult.

Oh well - let's maybe revise it then to throw an error with a notice pointing to this issue.

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii the following code

res = await self.__target__.execute_cdp_cmd("Runtime.callFunctionOn", args, timeout=timeout)

outputs the following when I use By.CSS_SELECTOR with 'a.btn' to find the button on nowsecure.nl:

{'result': {'type': 'object', 'subtype': 'array', 'className': 'NodeList', 'description': 'NodeList(1)', 'objectId': '-7561175164979534690.3.256', 'preview': {'type': 'object', 'subtype': 'array', 'description': 'NodeList(1)', 'overflow': False, 'properties': [{'name': '0', 'type': 'object', 'value': 'a.btn.btn-lg.btn-secondary.fw-bold.border-white.bg-white', 'subtype': 'node'}]}}}

It has no problem finding the object it seems, but maybe information is missing. I think you can maybe see instantly what goes wrong here? If not, I'll keep continuing debugging/fixing the issue.

EDITED:

I see that deep = None and value = None because the result property (in my output) does not contain a 'value' property and thus value remains None which results in parse_deep returning a JSUnserializable object.

EDITED AGAIN:

I also see the following output sometimes:

{'result': {'type': 'object', 'subtype': 'node', 'className': 'HTMLAnchorElement', 'description': 'a.btn.btn-lg.btn-secondary.fw-bold.border-white.bg-white', 'objectId': '-2638331155481806451.4.275', 'preview': {'type': 'object', 'subtype': 'node', 'description': 'a.btn.btn-lg.btn-secondary.fw-bold.border-white.bg-white', 'overflow': True, 'properties': [{'name': 'target', 'type': 'string', 'value': ''}, {'name': 'download', 'type': 'string', 'value': ''}, {'name': 'ping', 'type': 'string', 'value': ''}, {'name': 'rel', 'type': 'string', 'value': ''}, {'name': 'relList', 'type': 'object', 'value': 'DOMTokenList(0)', 'subtype': 'array'}]}}}

It switches between both outputs it seems.

kaliiiiiiiiii commented 8 months ago

@kevinvoswebdevelop

https://github.com/kaliiiiiiiiii/Selenium-Driverless/blob/4b71a5ab59a193d41eab80ed8f68a66e8ad5c230/src/selenium_driverless/types/deserialize.py#L157-L159

I see that deep = None and value = None because the result property (in my output) does not contain a 'value' property and thus value remains None which results in parse_deep returning a JSUnserializable object.

oh so you mean that due to deep=res.get('deepSerializedValue') being None, it automatically assumes it's unserializable?

If I understand it correctly, removing the following lines should fix this? If yes, can you check that? https://github.com/kaliiiiiiiiii/Selenium-Driverless/blob/4b71a5ab59a193d41eab80ed8f68a66e8ad5c230/src/selenium_driverless/types/deserialize.py#L538-L543

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii That was exactly what I was thinking. With some other tweaks I managed to get the JSObject returned. But now I get stuck in the while loop used inside the find_element function (target.py:573). I think it has something to do with the await calls and the sleep. The while loop keeps running even when elem is no longer None, so I can't get it to return the found elem. It has been a while since I used Python, so I have to reread how asyncio works. I'll keep you updated.

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii I needed the WebElement of course, not the JSObject, so I fixed that now. In parse_deep, the _type variable should fall back on subtype if deep is falsy, like so: _type = deep.get("type") if deep else subtype. But I still need to get the backendNodeId for the WebElement, which is nowhere to be found in my CDP result for the HTMLAnchorElement because _value is None. Do you have any clue how to fix that?

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii Oh never mind haha! I got it fixed! return await WebElement(backend_node_id=_value.get('backendNodeId') if _value else None just return None for backendNodeId if there is no _value.
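
The two fallbacks described in the last two comments can be sketched in isolation as follows. This is an illustration only, not the library's parse_deep: resolve_type_and_node_id is a hypothetical function, it takes the inner 'result' dict of a response, and the second sample dict (including the id 42) is made up to show the case where deep data is present.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_type_and_node_id(result: Dict[str, Any]) -> Tuple[Optional[str], Optional[int]]:
    """Pick a type and backendNodeId from a result dict, tolerating missing deep data."""
    deep = result.get("deepSerializedValue")
    # Fallback from the thread: use `subtype` when there is no deep value.
    _type = deep.get("type") if deep else result.get("subtype")
    _value = deep.get("value") if deep else None
    # Without a deep value there is nothing to read the id from, so use None.
    backend_node_id = _value.get("backendNodeId") if _value else None
    return _type, backend_node_id


# Trimmed copy of the HTMLAnchorElement response quoted above (no deepSerializedValue).
shallow = {
    "type": "object",
    "subtype": "node",
    "className": "HTMLAnchorElement",
    "objectId": "-2638331155481806451.4.275",
}
# Hypothetical response that does carry deep data; the id 42 is invented.
deep_result = {
    "type": "object",
    "subtype": "node",
    "deepSerializedValue": {"type": "node", "value": {"backendNodeId": 42}},
}

print(resolve_type_and_node_id(shallow))      # ('node', None)
print(resolve_type_and_node_id(deep_result))  # ('node', 42)
```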

kevinvoswebdevelop commented 8 months ago

@kaliiiiiiiiii I also changed the target.py find_element function because somehow the while condition remained true. This works for me:

    async def find_element(self, by: str, value: str, parent=None, timeout: int or None = None) -> WebElement:
        start = time.monotonic()
        elem = None

        while True:
            # Re-fetch the document element on every attempt, in case the page changed.
            parent = await self._document_elem
            try:
                elem = await parent.find_element(by=by, value=value, timeout=None)
            except (StaleElementReferenceException, NoSuchElementException, StaleJSRemoteObjReference):
                await self._on_loaded()
            # Stop when an element was found, no timeout was given, or the timeout has elapsed.
            if elem is not None or (not timeout) or (time.monotonic() - start) > timeout:
                break
            await asyncio.sleep(0.5)

        # Explicit None check instead of `not elem`; see the note below.
        if elem is None:
            raise NoSuchElementException()
        return elem

Do you want me to create a PR with my changes?

Important note: not elem is somehow always true, even when JSObject or WebElement is returned. You have to explicitly check if elem is not None. Very odd.
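
One possible explanation, offered as an assumption and not checked against the library source: Python decides truthiness by calling __bool__ or, failing that, __len__, so any object whose __len__ returns 0 is falsy even though it is not None. The class below is a hypothetical stand-in, not the library's JSObject or WebElement.

```python
class EmptyLookingHandle:
    """Hypothetical wrapper that reports a length of 0; not a selenium_driverless class."""

    def __len__(self) -> int:
        return 0


elem = EmptyLookingHandle()

# Truthiness falls back to __len__, so the object counts as "empty".
found_by_truthiness = bool(elem)
# An identity check only asks whether a result exists at all.
found_by_identity = elem is not None

print(found_by_truthiness, found_by_identity)  # False True
```

If something like this applies, `while not elem` would keep looping after a successful lookup, which matches the behaviour described above, and `elem is None` is the safer check.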

kevinvoswebdevelop commented 8 months ago

Oh it seems that I only got By.CSS_SELECTOR working. By.XPATH still returns JSUnserializable. I will figure this out later and create a PR when I got that fixed too

kaliiiiiiiiii commented 8 months ago

Oh it seems that I only got By.CSS_SELECTOR working. By.XPATH still returns JSUnserializable. I will figure this out later and create a PR when I got that fixed too

@kevinvoswebdevelop yeah that would be great. I really can't reproduce this issue on my machine.

kaliiiiiiiiii commented 8 months ago

not resolved yet lol

kaliiiiiiiiii commented 8 months ago

Oh it seems that I only got By.CSS_SELECTOR working. By.XPATH still returns JSUnserializable. I will figure this out later and create a PR when I got that fixed too

@kevinvoswebdevelop Any updates on that?

kaliiiiiiiiii commented 6 months ago

Oh it seems that I only got By.CSS_SELECTOR working. By.XPATH still returns JSUnserializable. I will figure this out later and create a PR when I got that fixed too

closing due to inactivity

@kevinvoswebdevelop Feel free to provide what you've got so far any time:)

ganyu87 commented 5 months ago

I ran into this error.

I tried the same script:
local - windows (python 3.12 from windows download) = fine
vps - almalinux (python 3.10 from aapanel) = fine
vps - ubuntu (python 3.8 from aapanel) = fine
vps - ubuntu (python 3.10 from pyenv) = Error

from selenium_driverless import webdriver
from selenium_driverless.types.by import By
import asyncio

async def main():
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    options.add_argument("--no-sandbox")
    async with webdriver.Chrome(options=options,debug=True) as driver:
        await driver.get('https://github.com', wait_load=True)
        await driver.sleep(2)
        print(await driver.title)
        try:
            a = await driver.execute_script("return document.querySelector('title').innerText;")
            print(a)
            b = await driver.find_element(By.CSS,'title')
            print(await b.text)
        except Exception as e:
            print(str(e))

asyncio.run(main())

result:

DevTools listening on ws://127.0.0.1:47797/devtools/browser/431d3287-9b9c-41d7-816e-9e2a697a60fd
[0424/053821.126310:ERROR:sandbox_linux.cc(376)] InitializeSandbox() called with multiple threads in process gpu-process.
[0424/053821.134316:ERROR:command_buffer_proxy_impl.cc(125)] ContextResult::kTransientFailure: Failed to send GpuControl.CreateCommandBuffer.
[0424/053821.605108:INFO:CONSOLE(1)] "Uncaught SyntaxError: Failed to execute 'matches' on 'Element': ':modal' is not a valid selector.", source: https://github.githubassets.com/assets/vendors-node_modules_oddbird_popover-polyfill_dist_popover_js-7bd350d761f4.js (1)
[0424/053821.644206:INFO:CONSOLE(1)] "Uncaught SyntaxError: Failed to execute 'matches' on 'Element': ':modal' is not a valid selector.", source: https://github.githubassets.com/assets/vendors-node_modules_oddbird_popover-polyfill_dist_popover_js-7bd350d761f4.js (1)
[0424/053822.865773:ERROR:gl_utils.cc(318)] [.WebGL-0x1cda00928d00]GL Driver Message (OpenGL, Performance, GL_CLOSE_PATH_NV, High): GPU stall due to ReadPixels
[0424/053823.384874:INFO:CONSOLE(0)] "[.WebGL-0x1cda00928d00]GL Driver Message (OpenGL, Performance, GL_CLOSE_PATH_NV, High): GPU stall due to ReadPixels", source: https://github.com/ (0)
GitHub: Let’s build from here · GitHub
GitHub: Let’s build from here · GitHub
('find_elements returned not a list. This possibly is related to https://github.com/kaliiiiiiiiii/Selenium-Driverless/issues/84\n', JSUnserializable(type="IdOnly",description="None", sub_type="None", class_name="None", value=None, obj_id=-7376430453938284161.2.2, context_id=2))

I can take care of this issue by reinstalling my VPS, since I just use it for learning. But if you want, I can give you access to a fresh VPS to reproduce it.

ganyu87 commented 5 months ago

LMAO, I am not able to reproduce it after reinstalling the VPS.