Closed cdbeon closed 3 years ago
This same issue was also reported yesterday on r.hub on behalf of another student by @d-alex-hughes. @lytello also reported this.
It was one off at the time, but I'm worried this may not be a one off. If this becomes widespread, we'll need some instructions for how to reproduce so we can track down and fix the underlying cause.
This can be fixed by renaming or removing ~/.rstudio via the terminal.
To do so while bypassing the typical rstudio session startup:
mv .rstudio .rstudio.$(date +%s)
and press return
EDIT:
This fixed it for me: https://support.rstudio.com/hc/en-us/articles/218730228-Resetting-a-user-s-state-on-RStudio-Server
HOWEVER, the problem comes back almost immediately and the Preview version of RStudio is unusable for me.
Given that I run this frequently, I have some commands I've used to do this quickly without resetting my settings each time.
#!/bin/bash
# Backup step, left commented out: run once while the settings are known to be good.
# cp -r ~/.config/rstudio/ ~/.config/rstudiobak/ || echo failed
# cp -r ~/.local/share/rstudio/ ~/.local/share/rstudiobak/ || echo failed
# cp -r ~/.rstudio/ ~/.rstudiobak/ || echo failed
# Remove the current (possibly corrupted) state.
rm -rf ~/.config/rstudio/
rm -rf ~/.local/share/rstudio/
rm -rf ~/.rstudio/
# Restore the settings from the backup copies.
cp -r ~/.config/rstudiobak/ ~/.config/rstudio/
cp -r ~/.local/share/rstudiobak/ ~/.local/share/rstudio/
cp -r ~/.rstudiobak/ ~/.rstudio/
I'm here because I'm having a similar issue on RStudio Server.
Here's what I know. When I am using the preview version, I get the error. When I revert to the current release of RStudio Server, the error disappears. When reinstalling RStudio Server Preview (or daily build) the error returns. I am able to get to the log in but the error occurs after submitting my password and before the IDE appears.
@lytello @d-alex-hughes @cdbeon Any other students report this issue? If not I may close this for now and reopen later if this appears to be a consistently occurring problem.
Thanks for digging into this with the student who raised it @felder. 🎉 Their instance is running sans problems now and I haven't had any new cases bubble up in the last few days.
If it does rise up again, I'll bring it back into this issue and notify you.
@felder I had 5 more cases (likely more that didn't require my assistance). The ones that did reach out had lost their work. From my asking questions, these students weren't regularly saving their files when using datahub, which I suspect is why this additional issue occurred.
Just had another MIDS student raise this issue on slack.
Someone just reported this via DS-infrastructure email as well.
@ericvd-ucb it is likely that MIDS student. I directed them to reach out via email.
@felder I've had 3 more cases since then
@cdbeon any word on how to reproduce?
I'm really going to need some assistance with creating some sort of reliable set of steps to reproduce this issue in order to solve it.
@felder Not yet; I've been trying to ask students what they've been doing in their previous session(s), but I've had a mix of cases where students have been clearing the environment / using RStudios normally and those who have been overloading their global environment. All of them were doing the same thing -- working on the previous week's lab -- which is pretty uniform and shouldn't cause any problems (as evident by the 98% of the class who doesn't have the error).
In the meantime, I'll try to trigger the error myself and keep you updated!
@cdbeon ok thank you!
Ok so I've been playing around with this issue using docker on my workstation so that I have more debugging capability. Additionally I grabbed the entire home directory from a student who had previously experienced this issue.
Here's what I discovered so far:
Merely copying just the .rstudio directory of a person experiencing this is enough to reproduce the issue. None of the other student's files are necessary.
I was utterly unable to get any logging whatsoever to work out of rstudio-server. In fact it appears that the logging options are all largely restricted to rstudio server pro. However, I attached strace to the running rserver process and tried to launch a new rstudio session with the broken .rstudio in place. strace at least gives me a little idea of what the rserver process is trying to do.
Looking through the strace output, I can find the last file in .rstudio that rserver attempted to access. It seems to be the "options" file for whatever session it is attempting to resume. For example:
grep ".rstudio" strace.txt
...
openat(AT_FDCWD, "/home/rstudio/.rstudio/sessions/active/session-960e4455/suspended-session-data/options", O_RDONLY) = 7
If I remove that file, rstudio works and retains the info in the console.
The output in the console is whatever the student was doing last, followed by something that looks like this:
Error: C stack usage 7969504 is too close to the limit
Error saving session (options): R code execution error
So I suspect that the error above results in the data for the file "options" being corrupted and rstudio gets mad when it parses it. Why this happens? I have no idea.
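The finding above suggests a narrower workaround than resetting all of .rstudio: move aside only the suspended-session "options" file. Here is a minimal sketch, assuming the directory layout seen in the strace output (sessions/active/session-*/suspended-session-data/options). The function name quarantine_session_options is hypothetical, it renames rather than removes the file so the change can be undone, and it has not been tried against a session that is currently running.

```shell
#!/bin/sh
# Hypothetical helper: rename every suspended-session "options" file under
# an .rstudio directory so RStudio no longer tries to parse it on resume.
# The path layout is taken from the strace output above; nothing else about
# RStudio's on-disk format is assumed.
quarantine_session_options() {
    rstudio_dir="${1:-$HOME/.rstudio}"
    stamp="$(date +%s)"
    # No active sessions directory means there is nothing to do.
    [ -d "$rstudio_dir/sessions/active" ] || return 0
    find "$rstudio_dir/sessions/active" -type f \
        -path '*/suspended-session-data/options' |
    while IFS= read -r f; do
        # Keep the original contents under a timestamped name.
        mv -- "$f" "$f.broken.$stamp" && echo "moved: $f"
    done
}
```

Calling quarantine_session_options with no argument acts on ~/.rstudio; passing a path lets it be tried on a copy of a student's directory first.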
Googling C stack usage is too close to the limit does pull up this intriguing result: https://stackoverflow.com/questions/14719349/error-c-stack-usage-is-too-close-to-the-limit
Additionally, googling "error saving session" "r code execution error" also pulls up some stuff, but really nothing conclusive.
I'm hoping some instructors familiar with the course material may have some insight.
Relevant commands:
docker exec --privileged -it --user=root ${CONTAINERID} /bin/bash --login
strace -f -e 'trace=!clock_gettime,gettimeofday,futex,timerfd_settime,epoll_wait,epoll_ctl' -p ${PID}
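If the strace output is saved to a file, finding the last .rstudio file that rserver touched can be scripted. This is only a sketch: it assumes the log was written to strace.txt as in the grep example above and that the relevant lines are openat() calls with the path in double quotes. The function name last_rstudio_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the last path under .rstudio that appears in an
# openat() line of a saved strace log (strace.txt in the example above).
last_rstudio_path() {
    log="${1:-strace.txt}"
    grep 'openat(' "$log" |
        grep -F '.rstudio' |
        # Keep only the first double-quoted argument, i.e. the path.
        sed -n 's/.*openat([^"]*"\([^"]*\)".*/\1/p' |
        tail -n 1
}
```

Running last_rstudio_path strace.txt right after a failed session launch should point at the file that was being read when startup stalled.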
@felder wow, that's awesome debugging work! Thank you <3
In the meantime, I think we can just remove (or rename) all files in .rstudio that match this description on our NFS server. What do you think?
@yuvipanda i have no idea what the result of doing that would be especially on active sessions or others not currently experiencing the issue. However, would it be possible to rig a url that provides a button that deletes these files (or renames .rstudio) when pushed?
@felder looks like removing ~/.rstudio should be safe in our context - https://support.rstudio.com/hc/en-us/articles/218730228-Resetting-a-user-s-state-on-RStudio-Server.
I don't think it's quite possible to write a url for this, unfortunately. Users can visit classic notebook via https://r.datahub.berkeley.edu/hub/user-redirect/tree and maybe do it from there.
But, I think the right thing to do is to possibly remove the state files from inside ~/.rstudio for all users who aren't currently running. We should be able to get a list and do that. What do you think?
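For illustration, the proposal above could look roughly like the sketch below. It makes several assumptions that are not confirmed anywhere in this thread: that home directories sit side by side under one root with the directory name equal to the username, and that a plain-text file with one running username per line has already been produced somehow. The function name reset_inactive_users is hypothetical, and the sketch renames .rstudio instead of deleting state files so that nothing is lost.

```shell
#!/bin/sh
# Hypothetical helper: rename .rstudio for every user who is NOT listed in
# a file of currently running users.
#   $1: root directory holding one home directory per user (assumed layout)
#   $2: text file with one active username per line (assumed to exist)
reset_inactive_users() {
    homes_root="$1"
    active_list="$2"
    stamp="$(date +%s)"
    for home in "$homes_root"/*/; do
        home="${home%/}"
        user="$(basename "$home")"
        # Skip users with a running server; their state may be in use.
        if grep -Fxq -- "$user" "$active_list"; then
            continue
        fi
        [ -d "$home/.rstudio" ] || continue
        mv -- "$home/.rstudio" "$home/.rstudio.$stamp" && echo "reset: $user"
    done
    return 0
}
```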
@yuvipanda My belief at this time is that this may be caused by the fact that datahub and r hub both mount the same user filesystem but have completely different versions of R as well as different versions of various R libraries. For example in addition to using R 4.0.2, datahub uses the system packaged texlive libs. R hub uses texlive libs installed via tlmgr.
I do not believe simply deleting .rstudio for all users will provide a permanent fix for this. I think such a fix would only be temporary.
In my opinion we should remove rstudio from datahub and use a single hub for rstudio exclusively. Alternatively each hub needs its own config file location for rstudio if those hubs mount a shared filesystem. Another possibility would be to setup a new filesystem for r hub.
Based on this, I don't know if rstudio can actually be instructed to behave differently: https://community.rstudio.com/t/change-rstudio-from-rstudio-server/8248
Note also that our setup is quite similar to the setup that is being described as "not recommended." Basically we probably should do something to ensure that multiple instances/versions of rstudio are not competing with each other.
@felder This makes a lot of sense! Can any of the recent reporters confirm that they've used R on both hubs recently? Can you reproduce if you open and close RStudio on one hub then open it on the other? Or if you run RStudio simultaneously on both?
It's not obvious we can configure the path to ~/.rstudio/ in a given user environment. We could do something hacky^Wclever and bind mount it elsewhere.
We've only been using r.datahub.berkeley.edu for our class!
@ryanlovett I tried to reproduce by flipping back and forth between hubs with rstudio open, but could not. However, I don’t really have any code running and I did not do a lot of switching. Also, I did not try a combination of doing things like killing pods and starting new pods up.
@lytello @cdbeon do you know if any of your students that had this issue used rstudio in multiple hubs such as datahub/r hub ? Also any sense at this point if it’s a small or large percentage of students seeing it?
@cdbeon any idea if any of your students may have other classes that use rstudio via datahub?
@d-alex-hughes @blulightspecial @lytello @cdbeon I just wanted to comment on communications here - @Felder is working mighty hard to try to troubleshoot this one but it's taking some time. For now there is a workaround, mentioned above.
We would love to get your help to communicate on this - could you please communicate out to your classes ... and for now save the ds-infrastructure email for instructor level communications? Thanks
Workaround until a fix can be identified and implemented
This can be fixed by renaming or removing ~/.rstudio via the terminal.
To do so while bypassing the typical rstudio session startup:
1) Go to https://r.datahub.berkeley.edu/user-redirect/tree
2) Click New->Terminal
3) In the terminal, type: mv .rstudio .rstudio.$(date +%s) and press return
4) Try to launch rstudio as you normally would and it should now work.
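The rename step in the workaround above can be wrapped in a small function that does nothing when .rstudio is already missing, which helps when a student runs it twice. This is a sketch; reset_rstudio_state is a hypothetical name, and the optional directory argument exists only so it can be tried against a scratch directory instead of the real home.

```shell
#!/bin/sh
# Rename ~/.rstudio to a timestamped backup, as in the workaround above.
# reset_rstudio_state is a hypothetical name; $1 optionally overrides the
# home directory so the function can be exercised on a scratch copy.
reset_rstudio_state() {
    home_dir="${1:-$HOME}"
    src="$home_dir/.rstudio"
    if [ ! -d "$src" ]; then
        echo "nothing to do: $src does not exist" >&2
        return 0
    fi
    # Same naming scheme as the manual command: .rstudio.<epoch seconds>
    dest="$src.$(date +%s)"
    mv -- "$src" "$dest" && echo "moved $src to $dest"
}
```

Because each backup gets its own timestamp, repeated resets leave several .rstudio.* directories behind; old ones can be deleted once the student has confirmed nothing in them is needed.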
I have switched between r.datahub and datahub instances pretty frequently, invoking rstudio instances from the main datahub by editing url, and haven't ever generated the error that triggered this round.
Generally, wiping the state space of anyone's instance at logout/timeout/spindown is consistent with practices in the community.
https://mobile.twitter.com/hadleywickham/status/1032665959734108160?lang=en
(But suggested elsewhere too.)
@felder - Today I ran into the issue described here https://github.com/berkeley-dsep-infra/datahub/issues/1899#issuecomment-706386592
@ipietri Any chance you can reproduce this by doing the same thing you were doing before the error occurred?
Also do you ever use rstudio in both datahub and r hub, or do you just use one hub?
@d-alex-hughes yeah that's definitely a solution under consideration. However, if we go that route we need to do our best to communicate to students that they need to make sure they save their notebooks prior to logging off (we should encourage this anyway).
Additionally, if a student loses network connectivity and during that time their pod dies, they may also lose work.
Hi, I wasn't doing anything really. It just happened when I tried to open the datahub this morning. I implemented the suggested solution (below) and it is working now.
On Wed, Oct 21, 2020 at 11:06 AM felder wrote:
@ipietri Any chance you can reproduce this by doing the same thing you were doing before the error occurred?
@ipietri datahub or r hub, also do you ever switch back and forth while using rstudio?
When I say datahub I mean this link (https://r.datahub.berkeley.edu/user/isabelgarpietri/rstudio/), where I get access to RStudio. What do you mean by switching back and forth? Like if I stop working there and then come back? If that is your question, yes, I do that.
On Wed, Oct 21, 2020 at 11:48 AM felder wrote:
@ipietri datahub or r hub, also do you ever switch back and forth while using rstudio?
@ipietri there are multiple hubs... https://r.datahub.berkeley.edu, https://datahub.berkeley.edu, https://data100.datahub.berkeley.edu, etc
Each hub has different configurations to serve different classes and use cases. Some students may be in multiple classes that utilize different hubs. It's possible to run rstudio in more than one.
Hello! Sorry for the delay in my response, been a hectic week.
Eric reached out to me via email, and I thought pasting my responses to his email would help:
1) How common is this - like 1 in 100, 10 in 100 - how many students are facing this?
I believe this issue is starting to become more and more common, around 10 in 100 students. About 30 students have come up to me (so far) with this issue.
2) Are people using the URL datahub.berkeley.edu or r.datahub.berkeley.edu?
The class is using r.datahub.berkeley.edu, but I'm sure that some undergrad students are also using datahub.berkeley.edu for other classes. I'm not sure if they're using datahub.berkeley.edu to launch RStudios though; to my knowledge, not that many other classes use RStudios in the first place.
3) Does the fix proposed in 1899 work, or does the problem recur?
For most students, the fix in 1899 works; unfortunately, I just had one student pretty recently (i.e. yesterday) who brought up the issue a second time (despite using the fix). I managed to just delete the copy (rm -r .rstudio.bak) and make a new one (mv .rstudio .rstudio.$(date +%s)) and it seems to work again. According to the student, they only use r.datahub.berkeley.edu and only for this class as well.
4) The proposed next step would be to clear all user sessions at logout - could that work for your users - or could you communicate that to your users (e.g. save all work and log out at end of session)?
We've been pushing this to students after every lab/every announcement, but of course you'll always have those students who never heed the warning. We'll continue to tell our students to save and logout at the end of every session, though!
With #2035, we have separate .rstudio directories in home for datahub & r hub, but using the exact same image. This should help if two different R / rstudio versions sharing the same .rstudio directory was the cause.
The other option to explore is to see if RStudio is being given a proper opportunity to shut down cleanly by jupyter-rsession-proxy, or if it is being killed straight up - that could also cause corruption.
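To make the distinction concrete, a clean shutdown usually means sending SIGTERM, waiting for the process to exit, and only falling back to SIGKILL after a grace period. The sketch below shows that general pattern in shell. It is not taken from jupyter-rsession-proxy and makes no claim about what that proxy actually does; graceful_stop and its default timeout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: ask a process to exit with SIGTERM, wait up to
# $2 seconds (default 10), and only then fall back to SIGKILL.
# Returns 0 if the process exited on its own, 1 if it had to be killed.
graceful_stop() {
    pid="$1"
    timeout="${2:-10}"
    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "pid $pid ignored SIGTERM, sending SIGKILL" >&2
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

If sessions that end through the SIGKILL path turn out to be the ones with a corrupted options file, that would support the unclean-shutdown theory.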
Hopefully this is less of an issue this semester?
I see no reports of this since #2035, so am gonna close this for now \o/. Please re-open if you run into this again.
Hello all,
I've had a few students come up to me with the following error when trying to log into their RStudios:
RStudio Initialization Error: Error occurred during transmission.
The first report that I got of this issue came early today at around 12 AM.
Restarting their servers (going to the admin hub, clicking stop server, then start server) didn't seem to work either.
Any suggestions as to how to resolve this issue?
Thank you!