openwebwork / webwork2

Course management front end for WeBWorK
http://webwork.maa.org/wiki/Main_Page
Other
141 stars 164 forks source link

Session issue with problem editor #2394

Closed Alex-Jordan closed 19 hours ago

Alex-Jordan commented 3 months ago

In develop, if I clear cookies and then log in and go to the PG editor, I can:

By my count on the 17th time, my session is killed and I have to log back in. This was noticed when exploring https://github.com/openwebwork/pg/issues/1050.

Maybe this has to do with #2391.

drgrice1 commented 3 months ago

Just to clarify, are you saying that this is with #2391? Or are you using the develop branch. I haven't tested with #2391 yet, but I can not reproduce this on the develop branch.

drgrice1 commented 3 months ago

I can't reproduce this with #2391 either.

Alex-Jordan commented 3 months ago

I see it with develop. Other details are I turned off 2fa, the os is oracle 8.7, locally I'm on a Mac with Firefox. Can't do more testing until late tonight. Not sure now which session management is in play. But to be clear, you tried checking 20+ times?

On Sun, Apr 7, 2024, 2:04 PM Glenn Rice @.***> wrote:

I can't reproduce this with #2391 https://protect2.fireeye.com/v1/url?k=31323334-501cfaeb-3132feb7-454455535732-5b4eed059f258eb4&q=1&e=cc735f9d-9b64-4985-b8c5-6d35b2fe6bba&u=https%3A%2F%2Fgithub.com%2Fopenwebwork%2Fwebwork2%2Fpull%2F2391 either.

— Reply to this email directly, view it on GitHub https://protect2.fireeye.com/v1/url?k=31323334-501cfaeb-3132feb7-454455535732-61e055f9d82b9a42&q=1&e=cc735f9d-9b64-4985-b8c5-6d35b2fe6bba&u=https%3A%2F%2Fgithub.com%2Fopenwebwork%2Fwebwork2%2Fissues%2F2394%23issuecomment-2041608505, or unsubscribe https://protect2.fireeye.com/v1/url?k=31323334-501cfaeb-3132feb7-454455535732-be26b7b4a01035df&q=1&e=cc735f9d-9b64-4985-b8c5-6d35b2fe6bba&u=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABEDOAGINFQFUS6ZRIJZYZ3Y4GYENAVCNFSM6AAAAABF3TNYTOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBRGYYDQNJQGU . You are receiving this because you authored the thread.Message ID: @.***>

drgrice1 commented 3 months ago

Easily more than 20+ times. I just kept clicking.

Alex-Jordan commented 3 months ago

This is happening for me on develop. (Since I sort of have develop in production, I should clarify this is a "pure" develop branch. There are no uncommitted edits.) I've run npm ci in both repos just in the off chance that has anything to do with this. In localOverrides.conf, nothing relevant is changed from the default, except maybe $twoFA{enabled} = 0; which I don't think is relevant, but I mention it since it has to do with logging in.

The server is behind a firewall and I access it on PCC's internal network after activating a VPN client. It's name is something-internal.pcc.edu and I'm starting by deleting all cookies for .pcc.edu. To test this issue, I'm making a new course just to be sure there's nothing in a particular course causing trouble. I create the new course from the admin course, then enter it.

I go to the problem editor. I check in on cookies, and I have two at the moment. One of size 1460 (I think for the admin course) and one of size 1468. I type pi into the default problem's answer blank and click Check Answers. In the browser developer tools, the "1468" cookie flashes as if it was rewritten or otherwise accessed. Its size is still 1468. On the 17th click of "Check Answers", the problem no longer renders and I get this pink error:

Rendering error: NetworkError when attempting to fetch resource.

The "1468" cookie is now gone.

Now if I navigate anywhere, I find that I'm logged out. So I log in again. I try the same steps and this time I get bumped off after just 10 tries.

I try pretty much the same thing on the production server (where it's not behind a firewall) and the same thing happens. Although in fewer tries than 17.

Now I'm in the developer tools, Network tab. Corresponding with each successful click of "Check Answers" there are two POST method entries:

Screenshot 2024-04-07 at 8 48 42 PM

They are always like that with the same numbers in the last two columns ("Transferred" and "Size").

But then on the click of "Check Answers" where it fails, it's different:

Screenshot 2024-04-07 at 8 49 05 PM
drgrice1 commented 3 months ago

I did some testing, and although I can not reproduce this still, I think that this is not the same issue as the cookie size issue causing the header size limit to be exceeded. The message that you are getting is different. The issue may still be related to the header size limit issue, but something is clearly different.

I logged into five courses which puts the combined cookie size for the domain high enough to make the header exceed the limit. Then I clicked check answers in the PG editor. Rendering fails as expected because the header size is now too large, but the message that is shown is Rendering error: JSON.parse: unexpected character at line 1 column 1 of the JSON data, not Rendering error: NetworkError when attempting to fetch resource.

The request size shown in the Network tab of the developer tools is consistent with the request getting truncated because of a request exceeding the size limit.

This baffles me. There is no reason that the size limit should affect this. The cookies don't grow in size as more requests are made. The timestamp and cookie expiration are updated with each request, so you will see the cookie change, but the size should remain rather close to the same.

I guess, you should try with #2391, and see if that changes anything.

pstaabp commented 3 months ago

I also can't reproduce this on develop on my development machine. I'm guessing this was with session_cookies setting for session management. I check both this and the "key" method.

Alex-Jordan commented 3 months ago

So I tried with #2391. And believe it or not, that makes things worse! Without #2391, it consistently takes 17 clicks for me to get booted. With #2391, it only takes 10 clicks.

With #2391, I see something that I did not see before. Now when the I get to the last two POSTs when I am booted off, I see this in the Network developer tools:

Screenshot 2024-04-10 at 9 25 01 PM

You can compare that to what I posted before. The successful ones now look like:

Screenshot 2024-04-10 at 9 27 05 PM

So now those successful ones are a lot smaller in the Transferred column, and about the same size (one a tad up, one a tad down) in the Size column. I'm not in the exact same course I was before (which was now deleted) so a small difference might just be because of the course name being different.

FWIW, I also get this behavior with Chrome. The above was with Firefox. (I saw some postings about a FF bug involving NS_ERROR_NET_RESET, so I decided to try Chrome too.)

It seems worth pursuing this as an issue with our local load balancer setup. I'll bring this up with our team here and see if they have thoughts about it, or even better, know exactly what would do this and can toggle something with our setup.

drgrice1 commented 3 months ago

Interesting.

Note that the "Transferred" column is the only one that should have an essentially different size with develop versus #2391. That size includes the cookies. The "size" column is only the response size, which should be essentially the same regardless of if you are using develop or #2391. In fact, in my testing using the same course for both, the "size" column is identical, but the "Transferred" column is considerably smaller with #2391. Here is what I see for successful requests. The first is the padded session from develop, and the second the unpadded session from #2391.

padded unpadded

Now when I previously tested this I tested it locally with morbo, and on a development server (with a production setup) that is serving directly via hypnotoad. So now, I tested it on my local server proxied via apache2 via my ddns. And hello, we have a problem. With the develop branch it happens after 17 "Check Answers" clicks, and with #2391 it happens with 10 "Check Answers" clicks. So this is not your load balancer. It is a more general issue when hypnotoad is proxied (which the load balancer is a form of).

Now that I can reproduce it I will see what I can do to fix it.

drgrice1 commented 3 months ago

By the way, I don't see the NS_ERROR_NET_RESET thing with either branch, but that is probably not significant. It is just that it fails that is significant.

drgrice1 commented 3 months ago

Here is another thing to test. Go to one of the other places that a problem is rendered via the render_rpc endpoint. For example, add a blank problem in a set in the "Sets Manager" (this will give you the same problem of course). Then render the problem, and start clicking "Check Answers". So far, I can't reproduce this issue there. So this seems to be specific to the javascript for the PG problem editor.