Closed js1642 closed 4 years ago
Does the session key server log reveal anything?
Also worth checking the logs for the gist service. It looks like getting info on the current user (`/user`, rgithub's `get.myself()`) may be failing.
You can enable extensive logs in the gist service by changing `INFO` to `DEBUG` in `logback.xml`. You don't need to restart the service for this. Then watch `rcloud-gist-service-file.log` when you attempt to log in to RCloud; it may produce hundreds of lines of logging.
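A minimal sketch of that log-level change. The file written here is a made-up stand-in in a temp directory, not the real rcloud-gist-service `logback.xml`; the real file's location and logger names are assumptions you would need to check before editing.

```shell
#!/bin/sh
# Work on a throwaway copy so nothing real is touched.
workdir=$(mktemp -d)
conf="$workdir/logback.xml"

# Hypothetical, minimal logback config (the real file will contain more).
cat > "$conf" <<'EOF'
<configuration>
  <root level="INFO">
    <appender-ref ref="FILE" />
  </root>
</configuration>
EOF

# Keep a backup, then switch every level="INFO" attribute to level="DEBUG".
cp "$conf" "$conf.bak"
sed 's/level="INFO"/level="DEBUG"/' "$conf.bak" > "$conf"

# Show the resulting level lines.
grep 'level=' "$conf"
```

After making the equivalent change to the real file, something like `tail -f rcloud-gist-service-file.log` (in whatever directory your installation writes it) lets you follow the output while you attempt the login.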
Hi @gordonwoodhull, I'll attach some SKS logs. These are from journalctl, as I start all the processes as systemd units. They look OK to me, though rather sparse. The gist service logs show a connection refused error. I'll attach the first part for XNIO-3 task-1; tasks 2 and 3 are similar. This is a single-host setup for now.
SKS log:
```
-- Logs begin at Tue 2020-04-07 13:12:38 UTC, end at Fri 2020-05-22 14:01:01 UTC. --
May 22 14:00:02 myhost4.abc.net env[16634]: token: 1590156002800 user='js1642' auth/pam, rcloud:fd0f604f2149f56e46d540589408c2be03ac03f2, VALID
May 22 14:00:02 myhost4.abc.net env[16634]: token: 1590156002637 user='js1642' auth/pam, rcloud:fd0f604f2149f56e46d540589408c2be03ac03f2, VALID
May 22 14:00:02 myhost4.abc.net env[16634]: token: 1590156002488 user='js1642' auth/pam, rcloud:fd0f604f2149f56e46d540589408c2be03ac03f2, VALID
May 22 14:00:01 myhost4.abc.net env[16634]: token: 1590156001108 user='js1642' auth/pam, rcloud:fd0f604f2149f56e46d540589408c2be03ac03f2, VALID
May 22 14:00:01 myhost4.abc.net env[16634]: AUTH/pam: 1590156001101 user='js1642', rcloud, OK
May 22 14:00:01 myhost4.abc.net java[16634]: pam_sss(rcloud:auth): authentication success; logname= uid=0 euid=0 tty= ruser= rhost= user=js1642
[root@myhost4 rcloud-gist-service]#
```
And the gist-service log
Sounds like it's either configured wrong or there are network issues. It's too bad that the error message talks about an invalid token when it's really a connection issue.
I guess what I would do next is ssh into the gist services machine and try curling the same request that it is making,
http://myhost4.abc.net:4301/valid?token=fd0f604f2149f56e46d540589408c2be03ac03f2&realm=rcloud
These servers aren't special, they just use ordinary HTTP, so it's sometimes easier to debug individual connections using curl.
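A rough sketch of that curl check. The helper names `sks_valid_url` and `check_sks` are made up for illustration; the host, port, token and realm are the ones from the URL above. The block itself only prints the URL, so it is safe to run anywhere.

```shell
#!/bin/sh
# Build the token-validation URL in the same shape as the request above.
sks_valid_url() {
    # $1 = host, $2 = port, $3 = token, $4 = realm
    printf 'http://%s:%s/valid?token=%s&realm=%s\n' "$1" "$2" "$3" "$4"
}

# Fetch the URL and print the response body. On a connection failure
# curl exits non-zero and prints the reason to stderr.
check_sks() {
    curl --silent --show-error --max-time 5 "$(sks_valid_url "$@")"
}

# Print the URL that would be requested.
sks_valid_url myhost4.abc.net 4301 fd0f604f2149f56e46d540589408c2be03ac03f2 rcloud
```

Running `check_sks myhost4.abc.net 4301 <token> rcloud` on the gist service machine makes the actual request. If curl reports that the connection was refused, that points at networking or the address the server listens on rather than at the token.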
Hi @js1642, did you figure this out?
I wasn't clear if you closed the issue on purpose.
Hi @gordonwoodhull, I didn't mean to close. I think I've got it going, for the most part. I had a "-l 127.0.0.1" entry in the SKS startup line. When I removed that, it started working.

Alas, a new issue. I've imported and created some new notebooks. R and shell run OK, but there's an issue with Python (3.6.8). I run the notebook (print("hello")) and it just hangs. It times out after 600 seconds with "RuntimeError: Kernel didn't respond in 600 seconds", and in the Session pane there's "[IPKernelApp] ERROR | Failed to open SQLite history /home/js1642/.ipython/profile_default/history.sqlite (database is locked). [IPKernelApp] ERROR | History file was moved to /home/js1642/.ipython/profile_default/history-corrupt.sqlite and a new file created." I've removed the .ipython directory and retried, but same issue. Googling around, I found multiple possibilities. Our /homes are on a GPFS file system.
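On the "-l 127.0.0.1" entry: presumably it bound SKS to the loopback address only, so requests addressed to the machine's hostname were refused. A sketch of how one might check a port's binding from `ss -ltn` output follows. The `bind_scope` helper is hypothetical, port 4301 is taken from the validation URL earlier in the thread, and the input here is sample text rather than real output.

```shell
#!/bin/sh
# Read `ss -ltn`-style lines on stdin and report how a port is bound.
# $1 = port number. Field 4 of each LISTEN line is "local-address:port".
bind_scope() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] != port) next
            # Everything before the final ":port" is the local address.
            addr = substr($4, 1, length($4) - length(parts[n]) - 1)
            found = 1
            if (addr != "127.0.0.1" && addr != "[::1]") external = 1
        }
        END {
            if (!found) print "not listening"
            else if (external) print "reachable from other addresses"
            else print "loopback only"
        }'
}

# Sample input standing in for real `ss -ltn` output.
printf '%s\n' \
    'State  Recv-Q Send-Q Local Address:Port Peer Address:Port' \
    'LISTEN 0      128    127.0.0.1:4301     0.0.0.0:*' | bind_scope 4301
```

On the real host, `ss -ltn | bind_scope 4301` would run the same check against live sockets.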
Hi @js1642, glad to hear that you worked out the gist/sks issue!
I don't think I have seen or heard this Python issue before.
Just to confirm, this is with `rcloud.jupyter` (not `rcloud.python`) in your `rcloud.languages`, right?
I am sure you have gone over the rcloud.jupyter readme. Standard advice is to make sure that Jupyter runs on its own, to work out any kinks.
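One way to start checking that Jupyter runs on its own, as a sketch. The `have` helper is made up for illustration; `jupyter --version` and `jupyter kernelspec list` are standard Jupyter commands. It is probably best run as the same user RCloud runs as.

```shell
#!/bin/sh
# Report whether a command is on PATH; returns non-zero if it is missing.
have() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Only query Jupyter if the CLI is actually installed.
if have jupyter; then
    # Versions of the installed Jupyter components.
    jupyter --version || echo "jupyter --version failed"
    # Kernels Jupyter can see, e.g. python3.
    jupyter kernelspec list || echo "jupyter kernelspec is unavailable"
fi
```

If both commands look sane, executing a trivial notebook with `jupyter nbconvert --to notebook --execute` exercises the kernel itself, outside of RCloud.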
I am not an expert in this area but I am glad to help. I did get it running on Linux, but this was a dev machine so it was perhaps easier.
Hi, sorry for the delay in responding. The jupyter install looks OK from my perspective, and matches the output from another RC instance:
$python3
$python3$resource_dir
[1] "/usr/local/share/jupyter/kernels/python3"
$python3$spec
$python3$spec$argv
[1] "/bin/python3" "-m" "ipykernel_launcher"
[4] "-f" "{connection_file}"
$python3$spec$env
named list()
$python3$spec$display_name
[1] "Python 3"
$python3$spec$language
[1] "python"
$python3$spec$interrupt_mode
[1] "signal"
$python3$spec$metadata
named list()

I built a new instance, RC2.2.3, R3.6.0, Python3.8.0 on RedHat7.4. This is using rcloud-gist-service, and I imported some test notebooks from files. R and shell notebooks run fine, but there is an issue with Python3, different from the timeout I'm having in the original issue. Now I'm getting "TypeError: run_cell() takes from 2 to 3 positional arguments but 4 were given" for my complex notebook of print("hello world").
The new error sounds like a version mismatch, either between the IPython and Jupyter packages you installed, or between those packages and rcloud.jupyter.
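To compare versions, a small sketch that prints the version of each Python package involved. The `module_version` helper and the `PYTHON` variable are made up for illustration, and the module list is an assumption based on the packages named in this thread.

```shell
#!/bin/sh
# Print the version of a Python module, or "not installed" if the import fails.
# PYTHON may be set to pick a specific interpreter (defaults to python3).
module_version() {
    "${PYTHON:-python3}" -c "import $1; print(getattr($1, '__version__', 'unknown'))" \
        2>/dev/null || echo "not installed"
}

# One line per module: name, then version.
for mod in IPython ipykernel jupyter_client jupyter_core nbconvert nbformat; do
    printf '%-16s %s\n' "$mod" "$(module_version "$mod")"
done
```

The interpreter should be the one the kernelspec launches, so with the kernelspec shown earlier in the thread it would be run with `PYTHON=/bin/python3`. Running it on a working and a failing instance and diffing the output is a quick way to spot a mismatch.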
I don't know, but it's possible that there have been breaking changes to Jupyter interfaces since we released rcloud.jupyter 2 years ago. (Tested, see below, unlikely but still possible.)
I'd need to see a stack trace to know which part is failing. I am also not sure if your new instance is failing earlier or later than the old one.
If this arms-length debugging is not helping, I would be happy to meet with you on webex sometime.
I just tried an installation on Fedora 29, R 3.6.1, Python 3.7.2, Jupyter 4.4.0, to make sure the INSTALL.md is okay for a somewhat recent RedHat, and added commands for installing R and Jupyter.
After the usual iterations of installing packages, it ran OK, including Python3.
I guess it's possible that something broke in Python 3.8, but I don't grok multiple Python installs, so I wasn't able to try that.
I’ve got 2 instances working now, RH7.4, R3.6.0, Python3.6.8 and RC2.2.3. On the original box, I upgraded RC to 2.2.3, reinstalled all the R and Python packages. Same timing out problem. I’m a bit confused by the INSTALL.md. It calls for several rpms for python-jupyter support, but all I could find in the RH realm was python36-jupyter-core-4.3.0-2.el7.noarch. Looking at Fedora rpms, these were available. Looking at the spec file for one of the Fedora rpms, it seems to just be essentially the pip package. Should I be using the Fedora rpms, or the pip packages, or both? The https://github.com/att/rcloud/tree/develop/rcloud.packages/rcloud.jupyter README doesn’t mention any rpms needed.
You mentioned needing to see a stack trace. How could I provide this? And if the offer for a webex is still open, that would be great. Thanks.
The rcloud.jupyter README has some pip commands which work okay for development, but `sudo pip` is generally a bad practice. I pushed a clarification yesterday.
As for the RH instructions, maybe I am missing some knowledge here. I assumed that the commands I used on Fedora,
## jupyter support
sudo yum install python3-jupyter-core python3-ipykernel python3-nbconvert python3-nbformat python3-jupyter-client
would also work on RH? I made a few changes to INSTALL.md about a week ago.
Anyway, I doubt any of this has to do with the issues you are facing!
I was thinking that the stack trace would help with the `run_cell()` error you were seeing, but for a timeout, probably not so much.
Hi. I'm having some difficulty setting up rcloud-gist-service. The basis is RCloud 2.2.0 on RHEL 7.4, and rcloud-gist-service-0.3.1-20170512153557, installed via rpm. I believe I've followed the config instructions for both rcloud.conf and application.yml. Here's my rcloud.conf:
This is the non-default part of application.yml:
And this is the error popup I receive.