dandi / dandi-api-webshots-tools

AttributeError: 'dict' object has no attribute 'text' #14

Closed yarikoptic closed 2 years ago

yarikoptic commented 2 years ago

Since I moved the ~/cronlib items under /mnt/backup/dandi/, the cron job has not worked for a while, and https://github.com/dandi/dandi-api-webshots#readme is now 2 weeks out of date. Even that version shows some issues to be addressed "while at it":

But I have failed to run it at all (after removing venv/ so that it was recreated):

dandi@drogon:/mnt/backup/dandi/dandi-api-webshots$ flock -n -E 0 /home/dandi/.run/run-webshots.lock /mnt/backup/dandi/dandi-api-webshots/tools/run-webshots.sh dandi
+ DANDI_INSTANCE=dandi
+ PYTHON=/home/dandi/miniconda3/bin/python
++ dirname /mnt/backup/dandi/dandi-api-webshots/tools/run-webshots.sh
+ cd /mnt/backup/dandi/dandi-api-webshots/tools/..
+ git reset --hard HEAD
HEAD is now at 0b51926fb Automatically update webshots
+ git clean -df
+ git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
+ git pull
Already up to date.
+ cd tools
+ git checkout master
Already on 'master'
Your branch is up to date with 'origin/master'.
+ git pull
Already up to date.
+ '[' '!' -e venv ']'
+ /home/dandi/miniconda3/bin/python -m virtualenv venv
created virtual environment CPython3.8.3.final.0-64 in 373ms
...
Installing collected packages: six, zipp, python-dateutil, pycparser, typing-extensions, ruamel.yaml.clib, pytz, pyrsistent, numpy, importlib-resources, idna, dnspython, cffi, attrs, arrow, webcolors, urllib3, uri-template, sortedcontainers, sniffio, scipy, ruamel.yaml, rfc3987, rfc3339-validator, pydantic, pandas, outcome, jsonschema, jsonpointer, jeepney, isoduration, h5py, h11, fqdn, email-validator, cryptography, charset-normalizer, certifi, async-generator, wsproto, trio, SecretStorage, requests, PySocks, pyparsing, pyOpenSSL, numcodecs, joblib, importlib-metadata, hdmf, fasteners, click, ci-info, blessings, asciitree, appdirs, zarr, trio-websocket, tenacity, semantic-version, pyout, pynwb, pycryptodomex, packaging, keyrings.alt, keyring, interleave, humanize, fscacher, etelemetry, dandischema, click-didyoumean, selenium, pyyaml, psutil, dandi, click-loglevel
Successfully installed PySocks-1.7.1 SecretStorage-3.3.1 appdirs-1.4.4 arrow-1.2.2 asciitree-0.3.3 async-generator-1.10 attrs-21.4.0 blessings-1.7 certifi-2021.10.8 cffi-1.15.0 charset-normalizer-2.0.12 ci-info-0.2.0 click-8.1.1 click-didyoumean-0.3.0 click-loglevel-0.4.0.post1 cryptography-36.0.2 dandi-0.37.0 dandischema-0.6.0 dnspython-2.2.1 email-validator-1.1.3 etelemetry-0.3.0 fasteners-0.17.3 fqdn-1.5.1 fscacher-0.2.0 h11-0.13.0 h5py-3.6.0 hdmf-3.2.1 humanize-4.0.0 idna-3.3 importlib-metadata-4.11.3 importlib-resources-5.6.0 interleave-0.2.0 isoduration-20.11.0 jeepney-0.7.1 joblib-1.1.0 jsonpointer-2.2 jsonschema-4.4.0 keyring-23.5.0 keyrings.alt-4.1.0 numcodecs-0.9.1 numpy-1.21.5 outcome-1.1.0 packaging-21.3 pandas-1.4.1 psutil-5.9.0 pyOpenSSL-22.0.0 pycparser-2.21 pycryptodomex-3.14.1 pydantic-1.9.0 pynwb-2.0.1 pyout-0.7.2 pyparsing-3.0.7 pyrsistent-0.18.1 python-dateutil-2.8.2 pytz-2022.1 pyyaml-6.0 requests-2.27.1 rfc3339-validator-0.1.4 rfc3987-1.3.8 ruamel.yaml-0.17.21 ruamel.yaml.clib-0.2.6 scipy-1.8.0 selenium-4.1.3 semantic-version-2.9.0 six-1.16.0 sniffio-1.2.0 sortedcontainers-2.4.0 tenacity-8.0.1 trio-0.20.0 trio-websocket-0.9.2 typing-extensions-4.1.1 uri-template-1.2.0 urllib3-1.26.9 webcolors-1.11.1 wsproto-1.1.0 zarr-2.11.1 zipp-3.7.0
WARNING: You are using pip version 21.2.4; however, version 22.0.4 is available.
You should consider upgrading via the '/mnt/backup/dandi/dandi-api-webshots/venv/bin/python -m pip install --upgrade pip' command.
+ . venv/bin/activate
++ '[' venv/bin/activate = /mnt/backup/dandi/dandi-api-webshots/tools/run-webshots.sh ']'
++ deactivate nondestructive
++ unset -f pydoc
++ '[' -z '' ']'
++ '[' -z '' ']'
++ '[' -n /bin/bash ']'
++ hash -r
++ '[' -z '' ']'
++ unset VIRTUAL_ENV
++ '[' '!' nondestructive = nondestructive ']'
++ VIRTUAL_ENV=/mnt/backup/dandi/dandi-api-webshots/venv
++ '[' linux-gnu = cygwin ']'
++ '[' linux-gnu = msys ']'
++ export VIRTUAL_ENV
++ _OLD_VIRTUAL_PATH=/home/dandi/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
++ PATH=/mnt/backup/dandi/dandi-api-webshots/venv/bin:/home/dandi/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
++ export PATH
++ '[' -z '' ']'
++ '[' -z '' ']'
++ _OLD_VIRTUAL_PS1=
++ '[' x '!=' x ']'
+++ basename /mnt/backup/dandi/dandi-api-webshots/venv
++ PS1='(venv) '
++ export PS1
++ alias pydoc
++ true
++ '[' -n /bin/bash ']'
++ hash -r
+ set +x
+ xvfb-run python tools/make_webshots.py -i dandi
2022-03-31T15:04:55-0400 [INFO    ] Process-1[23583]: make_webshots: Logging in ...
2022-03-31T15:04:57-0400 [INFO    ] Process-1[23583]: make_webshots: Cleaning up 5 child processes
Process Process-1:
Traceback (most recent call last):
  File "/home/dandi/miniconda3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/dandi/miniconda3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "tools/make_webshots.py", line 369, in snapshot_pipe
    with Webshotter(gui_url) as ws:
  File "tools/make_webshots.py", line 87, in __init__
    self.set_driver()
  File "tools/make_webshots.py", line 107, in set_driver
    self.login(os.environ["DANDI_USERNAME"], os.environ["DANDI_PASSWORD"])
  File "tools/make_webshots.py", line 118, in login
    login_text = login_button.text.strip().lower()
AttributeError: 'dict' object has no attribute 'text'
Traceback (most recent call last):
  File "tools/make_webshots.py", line 510, in <module>
    main()
  File "/mnt/backup/dandi/dandi-api-webshots/venv/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/backup/dandi/dandi-api-webshots/venv/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/mnt/backup/dandi/dandi-api-webshots/venv/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/backup/dandi/dandi-api-webshots/venv/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "tools/make_webshots.py", line 435, in main
    stats.append(ff(ds, page))
  File "tools/make_webshots.py", line 293, in __call__
    y = self.pipe.recv()
  File "/home/dandi/miniconda3/lib/python3.8/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/dandi/miniconda3/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/home/dandi/miniconda3/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

jwodder commented 2 years ago

I am unable to reproduce the error mentioned in the title; what version of the Python selenium library are you using?

jwodder commented 2 years ago

@mvandenburgh (et alii?): The CSS class of the progress bar shown when loading data for a dandiset/:id/draft/files page seems to have changed. Previously, the webshot script waited for .v-progress-linear to be invisible, but that no longer suffices for checking whether the files listing has loaded. What check should replace it?

yarikoptic commented 2 years ago

I am unable to reproduce the error mentioned in the title; what version of the Python selenium library are you using?

That screen session is still available on dandi@drogon in case you need to troubleshoot interactively. The tail of the pip install output in the original post says selenium-4.1.3... triple-checking:

dandi@drogon:/mnt/backup/dandi/dandi-api-webshots$ venv/bin/python -c 'import selenium; print(selenium.__version__)'
4.1.3

mvandenburgh commented 2 years ago

@mvandenburgh (et alii?): The CSS class of the progress bar shown when loading data for a dandiset/:id/draft/files page seems to have changed. Previously, the webshot script waited for .v-progress-linear to be invisible, but that no longer suffices for checking whether the files listing has loaded. What check should replace it?

.v-progress-linear should still work as a check. I just double-checked on production: the progress bar does have the v-progress-linear class initially, and it becomes invisible once the files list has finished loading (see the search bar in the dev tools under the element inspector).


jwodder commented 2 years ago

@yarikoptic My next best guess is that chromedriver needs to be updated; the current version (in Homebrew, at least) is 99.0.4844.51.

jwodder commented 2 years ago

@mvandenburgh Is the progress bar present when the page is initially loaded, or is it only added after?

yarikoptic commented 2 years ago

@yarikoptic My next best guess is that chromedriver needs to be updated; the current version (in Homebrew, at least) is 99.0.4844.51.

Then I think I had better finally upgrade all of drogon from Debian stretch to bullseye, where it is now shipped as the chromium-driver package 99.0.4844.74-1~deb11u1. But I don't see how the driver would relate to that error (AttributeError: 'dict' object has no attribute 'text') -- did you check what is in that dict? Maybe it is just a matter of tuning the script to handle it?

Meanwhile I will start the upgrade by downloading packages etc.; later I will disable cron jobs for the duration of the upgrade, and we might need to redo some or all of the venvs again.

jwodder commented 2 years ago

@yarikoptic

did you check what is in that dict?

No, as I'm not getting a dict when I run the script locally, and the screen session on drogon you mentioned above seems to be gone now.

yarikoptic commented 2 years ago

It isn't gone. Just log in as dandi, run screen -rd, and choose the 6th window (Ctrl-a 6). I just reran it, to exactly the same effect.

jwodder commented 2 years ago

@yarikoptic The login_button dict is just {'ELEMENT': '0.136299170337715-1'}. I don't see a way to make use of that.
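
One plausible reading of that dict, stated here as an assumption rather than something established in this thread: `ELEMENT` is the key the legacy JSON Wire Protocol uses for element references, while the W3C WebDriver protocol uses the key `element-6066-11e4-a52e-4f735466cecf`. A Selenium 4 client talking to an older chromedriver might therefore receive a reference it does not recognize and hand back the raw dict instead of a WebElement. The sketch below only classifies such a value and fails early with a readable message; the helper names are hypothetical and are not part of selenium or of make_webshots.py.

```python
# Key used by the legacy JSON Wire Protocol for element references.
LEGACY_KEY = "ELEMENT"
# Key defined by the W3C WebDriver specification for element references.
W3C_KEY = "element-6066-11e4-a52e-4f735466cecf"


def classify_element_reference(value):
    """Return 'w3c', 'legacy', or 'unknown' for a raw find-element result."""
    if isinstance(value, dict):
        if W3C_KEY in value:
            return "w3c"
        if LEGACY_KEY in value:
            return "legacy"
    return "unknown"


def require_web_element(value):
    """Hypothetical guard: raise a readable error instead of a later AttributeError."""
    if isinstance(value, dict):
        kind = classify_element_reference(value)
        raise RuntimeError(
            f"find_element returned a raw {kind} reference {value!r}; "
            "the driver binary may be too old for the installed selenium"
        )
    return value
```

Such a guard would not make the dict usable; it would only point at a client/driver mismatch sooner than the `.text` access does.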

mvandenburgh commented 2 years ago

@mvandenburgh Is the progress bar present when the page is initially loaded, or is it only added after?

It's added after. (To be specific, it's added after currentDandiset is initialized)

jwodder commented 2 years ago

@mvandenburgh I've tried to detect the presence of .v-progress-linear with:

WebDriverWait(self.driver, 300).until(
    EC.visibility_of_element_located((By.CLASS_NAME, cls))
)

and

WebDriverWait(self.driver, 300).until(
    EC.presence_of_element_located((By.CLASS_NAME, cls))
)

before waiting for invisibility, but the waits don't seem to ever finish.
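
Since the progress bar is added only after the page loads, it may already have come and gone by the time the first wait starts, which could be one reason a 300-second wait for presence never finishes. A possible workaround, sketched below under that assumption, is to bound the "appear" phase with a short timeout and treat "never appeared" as a distinct outcome. The function name and its parameters are hypothetical, and the helper is deliberately independent of selenium: it takes any callable that reports whether the element is currently visible.

```python
import time


def wait_appear_then_vanish(is_visible, appear_timeout=10.0, vanish_timeout=300.0,
                            poll=0.5, clock=time.monotonic, sleep=time.sleep):
    """Wait up to appear_timeout for is_visible() to become true, then up to
    vanish_timeout for it to become false again.

    Returns "never-appeared", "vanished", or "still-visible".
    """
    # Phase 1: give the element a short, bounded chance to show up.
    deadline = clock() + appear_timeout
    while not is_visible():
        if clock() >= deadline:
            return "never-appeared"
        sleep(poll)
    # Phase 2: the element was seen; now wait for it to go away.
    deadline = clock() + vanish_timeout
    while is_visible():
        if clock() >= deadline:
            return "still-visible"
        sleep(poll)
    return "vanished"
```

With Selenium, `is_visible` could be something like `lambda: any(e.is_displayed() for e in driver.find_elements(By.CSS_SELECTOR, ".v-progress-linear"))`; an element that is removed between the two calls can raise StaleElementReferenceException, which a real caller would need to catch.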

yarikoptic commented 2 years ago

@yarikoptic The login_button dict is just {'ELEMENT': '0.136299170337715-1'}. I don't see a way to make use of that.

But why is that so, and how could it be "fixed"?

Anyway, I will upgrade drogon now. Maybe a newer chromedriver would help, although IMHO that is unlikely (ideally we should create a singularity image to avoid relying on system state). I have disabled all cron jobs for now.

yarikoptic commented 2 years ago

@jwodder -- drogon is upgraded, chromedriver is 100.0.4896.60. Please troubleshoot/fix and re-enable the cron job for webshots (most of the cron jobs are disabled ATM with a #UPGRADE prefix).

jwodder commented 2 years ago

@yarikoptic The issue in the title is no longer occurring, but the code still needs the changes in #15 in order to work properly again.

yarikoptic commented 2 years ago

OK, should that PR then be merged as is (it is still a Draft), or are further changes needed?

jwodder commented 2 years ago

@yarikoptic Depends on whether you can live with the file listings not being snapshotted properly for the time being.

yarikoptic commented 2 years ago

Depends on how long the 'time being' would be. I am OK with getting some working version to start with, but let's have the files view fixed ASAP as well.

@mvandenburgh any hints on how to detect progress bar to disappear? @jwodder - please check how attributes of that object change from the beginning to e.g. 10 seconds after; that might give you a clue about what to wait for in a loop or something like that.

mvandenburgh commented 2 years ago

@mvandenburgh any hints on how to detect progress bar to disappear?

The snippet that @jwodder posted makes sense to me logically, although I'm not familiar enough with selenium to comment on why it would be failing. The only other approach I can think of would be to wait for all background AJAX requests to finish. In our E2E tests we use puppeteer in a similar manner and do this in several tests (https://github.com/dandi/dandi-archive/blob/master/web/test/src/util.js#L26-L35); I'm not sure whether selenium has similar functionality.

yarikoptic commented 2 years ago

FWIW here is the SO question on puppeteer's networkidle* heuristics: https://stackoverflow.com/questions/63366278/selenium-equivalent-of-networkidle2-networkidle0-in-puppeteer
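
A rough approximation of a "network idle" wait, offered as a sketch and not as an equivalent of puppeteer's heuristic: poll the number of entries in the browser's Resource Timing list and consider the page quiet once that number has stopped changing for a while. The JavaScript string uses the standard `performance.getEntriesByType` API; the Python helper and its parameter names are hypothetical. Unlike puppeteer, this does not track in-flight requests, and the browser's resource timing buffer is capped, so the count can plateau on pages that fetch many resources.

```python
import time

# JavaScript that could be passed to driver.execute_script(); the Resource
# Timing API reports one entry per resource fetched so far.
RESOURCE_COUNT_JS = "return window.performance.getEntriesByType('resource').length;"


def wait_until_stable(read_value, quiet_period=2.0, timeout=60.0, poll=0.25,
                      clock=time.monotonic, sleep=time.sleep):
    """Return True once read_value() has stayed unchanged for quiet_period
    seconds, or False if timeout seconds elapse first."""
    start = clock()
    last = read_value()
    last_change = start
    while True:
        now = clock()
        if now - last_change >= quiet_period:
            return True
        if now - start >= timeout:
            return False
        sleep(poll)
        current = read_value()
        if current != last:
            # The value moved, so restart the quiet-period countdown.
            last, last_change = current, clock()
```

With Selenium, `read_value` could be `lambda: driver.execute_script(RESOURCE_COUNT_JS)`.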

jwodder commented 2 years ago

@yarikoptic

please check how attributes of that object change from the beginning to e.g. 10 seconds after

I do not know how to do that. Pretty much all I've been doing is inspecting elements in Chrome, which relies on manually clicking on things after the page appears.

yarikoptic commented 2 years ago

I meant doing something at the selenium script level, like rich.inspect(whateverwatched) or some fancier diff on dir(whateverwatched), in a loop every 10 seconds for e.g. 30 seconds, and seeing what changes, in order to decide when it is done doing what it was doing.
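
The loop described above could be sketched roughly as follows, using a plain dict diff instead of rich.inspect. The function name and the shape of the snapshot are assumptions made for illustration: `snapshot` is any callable returning a dict of whatever properties are being watched.

```python
import time


def watch_changes(snapshot, interval=10.0, duration=30.0, sleep=time.sleep):
    """Call snapshot() every `interval` seconds for `duration` seconds and
    return a list of (elapsed_seconds, {key: (old, new)}) for each change."""
    previous = dict(snapshot())
    changes = []
    elapsed = 0.0
    while elapsed < duration:
        sleep(interval)
        elapsed += interval
        current = dict(snapshot())
        # Record only the keys whose values differ from the previous snapshot.
        diff = {
            key: (previous.get(key), current.get(key))
            for key in sorted(set(previous) | set(current))
            if previous.get(key) != current.get(key)
        }
        if diff:
            changes.append((elapsed, diff))
        previous = current
    return changes
```

With Selenium, the snapshot might be `lambda: {"class": el.get_attribute("class"), "style": el.get_attribute("style"), "displayed": el.is_displayed()}` for the watched element `el`; whichever key changes when loading finishes would be a candidate condition to wait on.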