Closed yarikoptic closed 3 years ago
@yarikoptic How does one authenticate on https://gui-beta-dandiarchive-org.netlify.app/ using just an API key?
Would be a question which may be @dchiquito and @dandi/dandiarchive could answer better, but I guess it could be done establishing a session for selenium, and clicking on "Login" button, and authenticating (I guess via github?) so may be it is not DANDI_API_KEY which is needed but github credentials (for CI we do it using dandibot account) .
Current status: I can get Selenium to click "Login," fill out the form, and follow the redirect, but I'm not logged in.
@dchiquito How does GitHub communicate authentication to https://gui-beta-dandiarchive-org.netlify.app/? It doesn't seem to set any cookies.
There are three domains in play: gui-beta-dandiarchive-org.netlify.app
, app.dandiarchive.org
, and github.com
. Clicking Login
in the web GUI redirects you to app.dandiarchive.org
, which then redirects you to GitHub. After you log in in GitHub, it redirects back to the API, which then redirects back to the web GUI. I believe HTTP headers are used to shuffle the authentication token around. The token is stored in localStorage, which should be adjacent to cookies in the dev inspector.
When you click Log Out
on the web GUI, you are actually logging out the API server, not GitHub. Clicking Login
after logging out redirects you to the API server, which detects that you are already logged in to GitHub, and immediately redirects you back again. I recommend using an incognito window for determining how the system works for clean logins, there may be some mismatch.
@dchiquito I can confirm that, in a normal Chrome session, local storage is created after logging in. However, nothing seems to be stored in local storage in a Selenium session, based on the empty output from:
print(driver.execute_script(
"var ls = window.localStorage, keys = []; "
"for (var i = 0; i < ls.length; ++i) "
" keys[i] = ls.key(i); "
"return keys;"
))
Note that, for comparison, this returns a non-empty list when using Selenium to view jsbin.com.
EDIT: If I visit different domains before querying local storage, I see that api.dandiarchive.org has no local storage while github.com has a 'bundle-urls' key.
So is it selenium peculiarity somehow, or how could we check server side interactions related to this - may be culprit is there. @jwodder do you have access to the logs?
@yarikoptic I have access to Heroku and the Papertrail logs. Are those what you're referring to?
yes -- I would have compared log'ed activities between "real" (in a new incognito window) and selenium driven cases to possibly identify any difference
I couldn't find anything helpful in the logs, but I did discover that if I add a sleep()
call for a few seconds after logging in, the login ends up being successful.
@dchiquito Is the login process perhaps finalized in the background?
What page are you waiting on? I know there's some stuff that runs when gui.dandiarchive.org
loads to grab the login token from API storage (if it exists) and attempt to get the current user.
For reactive apps, you frequently need to wait for pages to load when doing things in Selenium, since the page state is modified asynchronously.
@dchiquito I'm waiting on the main https://gui-beta-dandiarchive-org.netlify.app/ page, or wherever you get redirected to after authenticating with GitHub. Are there any changes to the DOM upon login completion that I can tell Selenium to wait for instead of having to sleep?
There's a little v-avatar
that shows up with the current user's initials. It's probably enough to wait for the presence of .v-avatar
.
Marking as blocked
since I believe we would need some non-github oauth way to login if I got it right -- our attempts to overcome github safety mechanisms has failed so far
Removed blocked
, we decided to proceed with having this to run on drogon instead of CI -- then that singular browser instance / device id could be properly acknowledged. So please proceed with #11 or some additions from #12 (e.g. tune up of window size etc https://github.com/dandi/dandi-api-webshots/pull/12/files#diff-f8109c43576a9af22cec8b69ea13633420e5aa2b80762e8596ec367e8f0eef2dR130) so it works and then deploy on drogon as a cron job. Should use dandibot
user on github.
@yarikoptic
dandibot
user's credentials be stored?xvfb-run
.secrets.env
shell file defining the env vars to be sourced or could re-use dandi-cli credentials mechanism.@yarikoptic It appears that drogon still needs to be verified for GitHub login somehow. How do I do that?
@jwodder - either open a browser on drogon in X11 mode, or use a section of the code interactively to submit the verification code you receive from github via email through dandibot@mit.edu (i believe it goes to your gmail account)
@yarikoptic I have merged #11, disabled the GitHub Actions workflow, and set up a cronjob to run every 6 hours.
Great!
@yarikoptic PRs to capture "edit metadata" and remove the workflow: #14 and #15
flock -n -E 0 /home/dandi/.run/run-webshots.lock ~/cronlib/dandi-api-webshots/tools/run-webshots.sh
but it crashed with
+ xvfb-run python tools/make_webshots.py
Traceback (most recent call last):
File "tools/make_webshots.py", line 129, in <module>
readme += process_dandiset(driver, ds)
File "tools/make_webshots.py", line 88, in process_dandiset
act()
File "tools/make_webshots.py", line 65, in click_edit
submit_button.click()
File "/home/dandi/cronlib/dandi-api-webshots/venv/lib/python3.8/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
self._execute(Command.CLICK_ELEMENT)
File "/home/dandi/cronlib/dandi-api-webshots/venv/lib/python3.8/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
return self._parent.execute(command, params)
File "/home/dandi/cronlib/dandi-api-webshots/venv/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/home/dandi/cronlib/dandi-api-webshots/venv/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
from unknown error: cannot determine loading status
from tab crashed
(Session info: chrome=73.0.3683.75)
(Driver info: chromedriver=73.0.3683.75,platform=Linux 4.19.0-0.bpo.9-amd64 x86_64)
so I guess we should guard for exceptions like this, may be retry few times?
set -x
somewhere or do it after sourcing secrets@yarikoptic
set +x
added.which user is being used? "Edit metadata" is only available for the owner of the dandiset.
@yarikoptic It seems that the page content fails to load only if (but not always when) the window size option is given. This happens on the first Dandiset, so I don't know whether it's specific to that Dandiset or not.
@satra We're using dandibot.
i don't think dandibot is an owner of any dandiset. so it won't be able to "click on" the "edit metadata". that button would be disabled for that user. you could test if it can by adding dandibot as owner to one of the test dandisets.
it's possible that because you are using xvfb that there is a conflict with window size. try running the script in headless mode without xvfb but providing window size through options.
btw, this may be a function of chromedriver which is now at 91+.
try running the script in headless mode without xvfb but providing window size through options.
That seems to have worked, but now the script keeps dying on Dandiset 000020 with "invalid session id". When I tested the script locally earlier, I also noticed that the script would just pause indefinitely at 000020 until I killed it.
i don't get a crash (with my version of the script) locally, but i had to insert a sleep between getting the url and taking the screenshot. but i'm only testing one dandiset. could you start with 20? i wonder if it's some function of loading. you could check if your script runs fine in the container or not.
if it helps my chrome/chromedriver in the container (joyzoursky/python-chromedriver:3.8-selenium)
(Session info: headless chrome=79.0.3945.79)
# chromedriver --version
ChromeDriver 78.0.3904.105 (60e2d8774a8151efa6a00b1f358371b1e0e07ee2-refs/branch-heads/3904@{#877})
@satra If I start at 000020, I just get the "invalid session ID" error at 000021 instead.
what is the source of the "invalid session ID" - selenium? web ui?
FWIW, drogon is running an elderly Debian stretch. May be it is worth generating/using a freshier "everything"? I have created a singularity image which you can try using on drogon:
dandi@drogon:~$ singularity exec /mnt/backup/dandi/containers/webshots.simg chromedriver --version
ChromeDriver 89.0.4389.114 (1ea76e193b4fadb723bfea2a19a66c93a1bc0ca6-refs/branch-heads/4389@{#1616})
I think it should have everything, if not - just adjust a copy of /mnt/backup/dandi/containers/Singularity.webshots and point me to it, I will produce a newer one (needs root access)
@yarikoptic
what is the source of the "invalid session ID" - selenium? web ui?
Selenium.
If I run the script under Singularity, it hangs indefinitely at the edit-metadata step for 000021.
and that is "great" in that -- it is what it does for me too, and browser asks to kill that page. So please add a timeout (30 sec?) and README.md should have that ("timed out at 30" or alike) for timing
@yarikoptic The stalling occurs when the "Edit Metadata" button is clicked. I can't figure out how to add a timeout for that. I tried doing driver.set_page_load_timeout(30)
and changing the click()
call to send_keys(Keys.ENTER)
as suggested here, but it's not making a difference.
shouldn't the script be checking if "edit metadata" is enabled, before trying to click it?
regarding timeout, you can do something like this. the details of presence would vary for different use cases.
try:
element = WebDriverWait(driver, 5).until(
EC.presence_of_element_located((By.NAME, "allow"))
)
except TimeoutException:
pass
else:
element.click()
@satra It shouldn't matter if it's enabled; clicking a disabled button should not cause the script to hang indefinitely. The button is already present when the page is loaded, so I don't see how your code applies to this situation.
that piece of code waits to see if a button is available before clicking. you can set the number 5 to whatever you want. on different tests, etc.,. it could take a bit of time to load all the elements. also you can check if there was a timeout before an element was detected before you click. it simply ensures a time window within which an element can appear. things don't happen as instantaneously on the driver as we might perceive them to.
@satra I don't believe that the existence of the button is the problem. The problem is that the page legitimately hangs indefinitely when the button is clicked, and I can't figure out how to set a timeout for that.
re "greyed out Edit Metadata" -- I thought that dandibot was one of the "admin users" which would be able to trigger it since otherwise we would not have a way to get those webshots. I have checked in incognito window -- nope, dandibot cannot "edit metadata". Filed: https://github.com/dandi/dandiarchive/issues/647 I have added dandibot to 000029 though and verified that now it can edit metadata there.
@jwodder if no better way found -- can't we do each "selenium test" in a new thread and kill the thread if times out, or that would leave shared selenium instance still hanging? Then we might want to instantiate it within a thread (would be pain with all the logging in though I am afraid) ?
BTW, you might get a better idea on where it is hanging by py-spy top -p
that process and then see at the corresponding code/stack on either some provisions are made for timing out etc or how possibly to get it "unstuck"
@yarikoptic py-spy
shows that it's hanging on a socket call to readinto
. I can't find any Selenium settings for controlling that timeout. I would rather not resort to threads.
@jwodder if you shared the full traceback, I could have look as well. We would need to either see it addressed at selenium level OR go for threads if that is the only way (I would also hate to introduce them)
@yarikoptic The traceback from the latest run (in which it hung for about five to ten minutes on 000022's edit-metadata before crashing) is:
Traceback (most recent call last):
File "/home/dandi/cronlib/dandi-api-webshots/tools/make_webshots.py", line 151, in <module>
readme += process_dandiset(driver, ds)
File "/home/dandi/cronlib/dandi-api-webshots/tools/make_webshots.py", line 97, in process_dandiset
act()
File "/home/dandi/cronlib/dandi-api-webshots/tools/make_webshots.py", line 70, in click_edit
submit_button.send_keys(Keys.ENTER)
File "/usr/lib/python3/dist-packages/selenium/webdriver/remote/webelement.py", line 510, in send_keys
self._execute(Command.SEND_KEYS_TO_ELEMENT,
File "/usr/lib/python3/dist-packages/selenium/webdriver/remote/webelement.py", line 672, in _execute
return self._parent.execute(command, params)
File "/usr/lib/python3/dist-packages/selenium/webdriver/remote/webdriver.py", line 318, in execute
self.error_handler.check_response(response)
File "/usr/lib/python3/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
from unknown error: cannot determine loading status
from tab crashed
(Session info: headless chrome=89.0.4389.114)
well, this is already post hanging error handling, traceback of when it was still waiting would have been more useful. Looking at the code, it seems it might be using urllib3's urlopen.. I wonder if a hint (well against urllib2) in https://stackoverflow.com/questions/16772795/urllib2-urlopen-will-hang-forever-despite-of-timeout would still work here
import socket
import urllib2
# timeout in seconds
timeout = 10
socket.setdefaulttimeout(timeout)
i.e. if we set timeout to e.g. 30 sec -- would we get to this WebDriverException
faster to announce the fail
FWIW I tried with this obnoxious
from selenium import webdriver
import socket
# timeout in seconds
socket.setdefaulttimeout(0.0001)
browser = webdriver.Chrome()
browser.get('https://gui.dandiarchive.org/#/dandiset/000026')
browser.close()
and it timed out before succeeding ;-)
edit: although timed out not in interaction with dandi but with browser itself :-) making it 0.5 sec -- timed out on that get
@yarikoptic I'm trying to test out socket.setdefaulttimeout()
, but it appears that the login button isn't loading in time, leading to an IndexError, despite having set driver.implicitly_wait(10)
.
30 seconds isn't enough for login???
there is already a stab https://github.com/dandi/dandi-api-webshots/blob/master/tools/make_webshots.py#L49 but we would need to have authenticated session (based on having
DANDI_API_KEY
env var provided) to be able to enter meditor for a dataset for a user which either has admin privileges or just owns a specific dandiset.