Amebis / eduVPN

Windows eduVPN Client
GNU General Public License v3.0

Needlessly asked for authorization, after restarting Let's connect client it works #211

Closed JsBergbau closed 2 years ago

JsBergbau commented 2 years ago

I'm using a VM in which the Let's Connect! VPN client is running. The VM is suspended by saving its state. Before the VM is suspended, the VPN connection is disconnected, but Let's Connect! stays open and running. The next day, when I click the slider to establish the connection, the browser often opens and I'm asked to enter my credentials. I can do this, and the connection is then established. Alternatively, I can close Let's Connect! completely (via right-click in the taskbar) and open it again. After that, I can connect without entering credentials.

This often happens when I am too fast after the VM has been restored; Let's Connect! then shows this message: [screenshot]

After closing it and retrying, I am often (but I think not always) asked for credentials. But even when I wait a bit longer, so that the message in the screenshot doesn't appear, I am still asked for credentials. I think this might be a timing problem: Let's Connect! still has the old time and therefore thinks new authentication is required. On the other hand, I have also managed to connect without entering credentials after the VM had been suspended for a whole weekend with Let's Connect! open.

rozmansi commented 2 years ago

Thank you for reporting this. Will research and get back.

rozmansi commented 2 years ago

I believe the main source of this problem is that saving/resuming the VM state is not identical to an ACPI sleep/resume. Saving the VM state and resuming it later is too transparent to the guest Windows: it doesn't notice the gap, it doesn't broadcast any sleep/resume message... Everything - including RTC and NIC - just pauses.

After resuming, the VM's clock is wrong, the network is down... On my VirtualBox Windows guest, it takes a few seconds to fix the clock and about ten seconds to restore connectivity.

Correct computer clock and working network are two basic requirements for the eduVPN and Let's Connect! clients to work correctly.

> This often happens when I am too fast after the VM has been restored; Let's Connect! then shows this message

I was able to reproduce this when I attempted to make a connection immediately after resuming the VM and before the VM's clock was fixed. Fixing the clock probably fires stale timers, making the .NET WebRequest stack invalidate its sockets.

> The next day, when I click the slider to establish the connection, the browser often opens and I'm asked to enter my credentials. I can do this, and the connection is then established. Alternatively, I can close Let's Connect! completely (via right-click in the taskbar) and open it again. After that, I can connect without entering credentials.

I managed to reproduce this once too. I had to hit the right spot... The eduVPN/Let's Connect! server issues OAuth tokens that expire in 1 hour. If the clock is correct enough for the client to realize it needs to refresh the OAuth token, but the token-refresh HTTPS request fails because the network is not working yet, the OAuth token is dropped, requiring fresh authentication. The OAuth token refresh is unfortunately the first victim of the not-yet-working network stack.

Taking the longer route (closing the client and reopening it) might simply delay your connection attempt long enough for the clock and network connectivity to recover.

> I think this might be a timing problem: Let's Connect! still has the old time and therefore thinks new authentication is required. On the other hand, I have also managed to connect without entering credentials after the VM had been suspended for a whole weekend with Let's Connect! open.

After resuming from a saved VM state, can you afford to wait for the guest VM to fix its clock and networking before attempting to connect?

JsBergbau commented 2 years ago

Thanks for looking into this and I'm really glad that you managed to reproduce it.

Usually I wait a few seconds until the network is available. Previously I was too fast, and the Let's Connect! client showed an error message saying the network was not available. Today, if I remember correctly, I waited about 10 seconds after resume and was still prompted for credentials. Since the browser opens almost immediately and can prompt for credentials, I can't imagine that I so often click the connect slider at the exact instant when the network isn't available yet, only for it to be available a few instants later when the browser opens the login page.

JsBergbau commented 2 years ago

I've opened a command prompt running ping 8.8.8.8 -T. I'll wait to press the connect slider until there is a successful reply, and will report whether this works better.

rozmansi commented 2 years ago

> Since the browser opens almost immediately and can prompt for credentials, I can't imagine that I so often click the connect slider at the exact instant when the network isn't available yet, only for it to be available a few instants later when the browser opens the login page.

Browsers have a much more resilient HTTP client than this silly .NET Framework WebRequest thing we are using will ever be. 😒 Some browsers even use their own lookup services, so they don't rely on your local DNS.

rozmansi commented 2 years ago

> I've opened a command prompt running ping 8.8.8.8 -T. I'll wait to press the connect slider until there is a successful reply, and will report whether this works better.

Perhaps for /l %i in (0,0,1) do ping www.google.com would be a better network connectivity test, because:

  1. It uses a hostname rather than an IP address, so it implicitly checks that DNS resolution works.
  2. ping -t would resolve the name only once. A looping ping resolves it once every 4 pings.

Nevertheless, I am thinking about adding a small retry (e.g. 3 times) to every failed web request the client makes.
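The connectivity test suggested above (resolve a hostname afresh on every attempt, then check that it is reachable) could be sketched roughly as follows. This is an illustration in Python, not part of the eduVPN client: the function names `network_ready` and `wait_for_network`, the TCP probe on port 443 (used here in place of ICMP ping), and the default attempt count are all assumptions made for the example.

```python
import socket
import time


def network_ready(host="www.google.com", port=443, timeout=2.0,
                  resolve=socket.getaddrinfo,
                  connect=socket.create_connection):
    """Return True if `host` resolves and accepts a TCP connection.

    The name is resolved on every call, so a DNS resolver that is still
    down after a VM restore is detected as "not ready".
    """
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    for _family, _type, _proto, _canon, sockaddr in infos:
        try:
            conn = connect(sockaddr[:2], timeout=timeout)
        except OSError:
            continue  # try the next resolved address
        conn.close()
        return True
    return False


def wait_for_network(attempts=30, interval=1.0,
                     probe=network_ready, sleep=time.sleep):
    """Probe until the network is ready.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(interval)
    return None
```

The `resolve`, `connect`, `probe` and `sleep` parameters exist only so the sketch can be exercised without a real network; calling `wait_for_network()` with the defaults performs real DNS lookups and TCP connections.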

JsBergbau commented 2 years ago

Very strange. Since I started keeping the cmd window pinging 8.8.8.8 running, the error hasn't occurred again. After resuming the VM, only 3-5 pings are dropped. When I click the connect slider as soon as there is a successful ping, it connects flawlessly. Before I used the ping window, I waited even longer before clicking connect to avoid the authorization window.

So in summary: with cmd open and pinging 8.8.8.8, the problem doesn't really exist. When the ping isn't running, the authorization issue occurs.

rozmansi commented 2 years ago

I have a theory here... Pings wake up the networking stack and make Windows regain network consciousness faster than just idling after VM restore. In eduVPN, it is the first HTTP request that does this - namely the OAuth token refresh attempt. If it is the first network connection attempt after VM restore, it will likely fail, but trigger network stack re-initialization.

I have introduced 5 retries at 1-second intervals (a rough equivalent of 3-5 pings) should WebRequest fail, and am testing to see whether this makes the problem go away without the need for ping 8.8.8.8. Stay tuned...
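The retry idea described above can be sketched as follows. This is a Python illustration, not the client's actual .NET code: the helper name `fetch_with_retries` is hypothetical, "5 retries" is read here as one initial attempt plus up to five more, and treating every `OSError` as a transient network failure is an assumption made for the example.

```python
import time


def fetch_with_retries(fetch, retries=5, delay=1.0, sleep=time.sleep,
                       transient=(OSError,)):
    """Call `fetch()` and retry it after transient network errors.

    Makes one initial attempt plus up to `retries` further attempts,
    pausing `delay` seconds between them. In Python, DNS failures,
    refused connections and timeouts are all OSError subclasses.
    Any other exception propagates immediately without a retry.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch()
        except transient as exc:
            last_error = exc
            if attempt < retries:
                sleep(delay)  # give the network stack time to recover
    raise last_error
```

With these defaults a request that keeps failing gives up after roughly five seconds, which matches the 3-5 dropped pings reported earlier in the thread. Only network-level failures are retried here; how the real client classifies errors is not stated in the thread.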

rozmansi commented 2 years ago

I have released 3.3.2 which should now work after VM resume. (No ping required.)

Note there is only one scenario when it will still trigger a needless reauthorization:

  1. Connect to your server and disconnect. Leave client open.
  2. Save VM state and turn it off.
  3. Wait for >1 hour.
  4. Restore the VM state and attempt to reconnect the eduVPN client before the time is corrected. The client then believes that its OAuth token is still valid and proceeds to connect to the server. The server refuses the client with Access Denied, which the client interprets as a bad OAuth token, and so it triggers a new authorization.

Therefore, the clock needs to be correct at least to the extent that the client realizes its OAuth token needs a refresh.
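The remaining scenario comes down to a comparison between the local clock and the token's expiry time. A minimal Python sketch of that decision is shown below; the names `AccessToken` and `next_action` are hypothetical, and the real client's token handling may differ. Only the 1-hour lifetime is taken from this thread.

```python
from dataclasses import dataclass

TOKEN_LIFETIME = 3600.0  # seconds; the thread states tokens expire in 1 hour


@dataclass
class AccessToken:
    issued_at: float              # client clock (seconds) when the token was issued
    lifetime: float = TOKEN_LIFETIME

    def needs_refresh(self, now: float) -> bool:
        """True once the local clock has reached the expiry time."""
        return now >= self.issued_at + self.lifetime


def next_action(token: AccessToken, now: float) -> str:
    """Decide what the client does when the user clicks connect."""
    return "refresh" if token.needs_refresh(now) else "connect"


token = AccessToken(issued_at=0.0)

# VM saved 60 s after the token was issued and restored 2 hours later.
stale_clock = 60.0      # what the guest still believes right after restore
correct_clock = 7260.0  # the real elapsed time once the clock is fixed

print(next_action(token, stale_clock))    # connect (server will refuse it)
print(next_action(token, correct_clock))  # refresh
```

With the stale clock the client skips the refresh and presents an expired token, which is the Access Denied path in step 4. Once the clock is corrected the same check requests a refresh instead.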

JsBergbau commented 2 years ago

Thank you very much for fixing this so fast.

I've just installed the new version and will give feedback in about a week.

JsBergbau commented 2 years ago

Thank you very much for the update. It works now as expected.