microsoft / Quantum

Microsoft Quantum Development Kit Samples
https://docs.microsoft.com/quantum
MIT License

`TimeoutError` but azure still connects? #762

Closed ryanhill1 closed 1 year ago

ryanhill1 commented 1 year ago

Describe the bug

Running the code samples from Quantum computing with Q# and Python raises multiple TimeoutErrors, but if you ignore them and/or run the commands multiple times, the program works.

In my example below, the first TimeoutError occurs during qsharp.azure.connect(). If you ignore that error and then run qsharp.azure.target(), you will get another TimeoutError. But if you ignore that again and re-run qsharp.azure.target(), it will go through, and in the end you will be able to execute your program.
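The manual "ignore the error and run it again" pattern described above can be written as a small retry helper. This is only a sketch: `call_with_retries` and `FlakyCall` are hypothetical names introduced for illustration and are not part of the `qsharp` package, and `FlakyCall` merely stands in for a call such as `qsharp.azure.target("ionq.simulator")`.

```python
import time


def call_with_retries(fn, attempts=3, delay_s=0.0):
    """Call fn(); on TimeoutError, wait delay_s and retry, up to `attempts` calls."""
    last_err = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except TimeoutError as err:
            last_err = err
            print(f"attempt {attempt} timed out: {err}")
            if attempt < attempts:
                time.sleep(delay_s)
    # Every attempt timed out: surface the last error to the caller.
    raise last_err


class FlakyCall:
    """Hypothetical stand-in for a call that times out a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("simulated timeout")
        return "ok"


if __name__ == "__main__":
    print(call_with_retries(FlakyCall(failures=1), attempts=3))
```

With the real package, the same helper could presumably be used as `call_with_retries(lambda: qsharp.azure.target("ionq.simulator"))`, though retrying only hides the delay rather than removing it.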

To Reproduce

  1. Open Binder linked in README.
  2. Create Test.qs

    namespace Test {
    open Microsoft.Quantum.Intrinsic;
    open Microsoft.Quantum.Measurement;
    open Microsoft.Quantum.Canon;
    
    operation GenerateRandomBits(n : Int) : Result[] {
        use qubits = Qubit[n];
        ApplyToEach(H, qubits);
        return MultiM(qubits);
    }
    }
  3. Create test.py in same directory
    
    import qsharp
    import qsharp.azure
    from Test import GenerateRandomBits
    from time import time

    start = time()

    try:
        qsharp.azure.connect(
            resourceId=f"/subscriptions/.../resourceGroups/AzureQuantum/providers/Microsoft.Quantum/Workspaces/WORKSPACE_NAME",
            location="West US")
    except TimeoutError as err:
        print(err)
        runtime = round(time() - start, 2)
        print(f"runtime {runtime}s")
        print("Ignoring timeout error...\n")

    try:
        qsharp.azure.target("ionq.simulator")
    except TimeoutError as err:
        print(err)
        runtime = round(time() - start, 2)
        print(f"runtime {runtime}s")
        print("Ignoring timeout error, trying again...\n")
        qsharp.azure.target("ionq.simulator")

    result = qsharp.azure.execute(GenerateRandomBits, n=3, shots=1000, jobName="Generate three random bits")
    print(result)

    runtime = round(time() - start, 2)
    print(f"runtime {runtime}s")

  4. Run program

```console
$ python test.py
```

Expected behavior

Expect the program to run smoothly without any TimeoutError.

Screenshots

Program output from Binder terminal:

[Image: qsharp-binder]

System information

OS from the Binder linked in README.

Additional context

TimeoutError does not seem to occur when connecting to Azure from the Q# kernel, only from Python.

xinyi-joffre commented 1 year ago

I repro'd in MyBinder, but I couldn't repro in a local container when I pulled down the exact image and ran it locally. There seems to be something in the MyBinder environment that is blocking Azure.Identity from finishing for nearly 5 mins.

If you are using Azure CLI for the initial login on the terminal:

    az login
    az account set -s ""

Then I think bypassing all the other auth methods that the Azure.Identity package searches for, and just setting CLI, seemed to keep it from getting stuck for 5 mins:

    import qsharp.azure
    targets = qsharp.azure.connect(
        resourceId="",
        location="",
        credential="CLI")

@ryanhill1, can you see if setting credential="CLI" helps in your case? If it helps you as well, I can open a ticket with Azure.Identity to see if they can help get to the bottom of it.

As for why the TimeoutErrors appeared to be transient: they were thrown by the Python interop's own timeouts (120 seconds and 240 seconds, respectively), while the original auth call was still ongoing in the kernel, so the underlying IQ# kernel was busy. Once auth finished behind the scenes, qsharp.azure.target went through on the second try.
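To make that explanation concrete, here is a toy simulation of a client-side timeout firing while a single-threaded backend is still busy. It does not use IQ# or `qsharp` at all: `BusyKernel`, `run_demo`, and the sub-second timings are assumptions chosen for illustration, not the real 120 s / 240 s values or the actual interop implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


class BusyKernel:
    """Toy stand-in for a kernel that handles one request at a time."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)

    def request(self, label, work_s, client_timeout_s):
        # The work is queued on the kernel; the client only waits client_timeout_s.
        future = self._pool.submit(self._work, label, work_s)
        try:
            return future.result(timeout=client_timeout_s)
        except FutureTimeout:
            # The client gives up, but the queued work keeps running.
            raise TimeoutError(f"{label}: client gave up after {client_timeout_s}s") from None

    @staticmethod
    def _work(label, work_s):
        time.sleep(work_s)
        return f"{label}: done"

    def shutdown(self):
        self._pool.shutdown(wait=True)


def run_demo():
    kernel = BusyKernel()
    outcomes = []
    # "connect" keeps the kernel busy for 1.0s; each client call waits only 0.4s.
    steps = [("connect", 1.0), ("target", 0.0), ("target (retry)", 0.0)]
    for label, work_s in steps:
        try:
            outcomes.append(kernel.request(label, work_s, client_timeout_s=0.4))
        except TimeoutError:
            outcomes.append(f"timeout: {label}")
    kernel.shutdown()
    return outcomes


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

In this sketch the first two calls time out on the client side even though nothing failed in the backend, and the retry succeeds simply because the slow first request has finished by then.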

rryoung98 commented 1 year ago

I can confirm that this works in less than 5 seconds on qBraid. Thank you for the workaround, @xinyi-joffre.

xinyi-joffre commented 1 year ago

We discovered the issue has to do with the ManagedIdentity credential waiting out a long timeout when the ManagedIdentity endpoint doesn't exist in certain environments. This bug is tracked here for Azure.Identity: Azure/azure-sdk-for-net#24767 and Azure/azure-sdk-for-net#29471.
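As a rough illustration of why naming a single credential type sidesteps the delay, the sketch below models a credential chain that tries each source in order. Everything here is hypothetical: `FakeCredential`, `first_available_token`, the chain order, and the 0.3 s probe delay are stand-ins for illustration and do not reflect how Azure.Identity is actually implemented.

```python
import time


class FakeCredential:
    """Hypothetical credential source with a probe delay and fixed availability."""

    def __init__(self, name, available, probe_s):
        self.name = name
        self.available = available
        self.probe_s = probe_s

    def get_token(self):
        time.sleep(self.probe_s)  # time spent probing for this credential source
        if not self.available:
            raise RuntimeError(f"{self.name} unavailable")
        return f"token-from-{self.name}"


def first_available_token(credentials):
    """Try each credential in order and return the first token obtained."""
    errors = []
    for cred in credentials:
        try:
            return cred.get_token()
        except RuntimeError as err:
            errors.append(str(err))
    raise RuntimeError("no credential available: " + "; ".join(errors))


def build_full_chain():
    # The slow, unavailable "managed-identity" entry delays the whole chain.
    return [
        FakeCredential("environment", available=False, probe_s=0.0),
        FakeCredential("managed-identity", available=False, probe_s=0.3),
        FakeCredential("cli", available=True, probe_s=0.0),
    ]


def timed_token(credentials):
    start = time.perf_counter()
    token = first_available_token(credentials)
    return token, time.perf_counter() - start


if __name__ == "__main__":
    print(timed_token(build_full_chain()))
    print(timed_token([FakeCredential("cli", available=True, probe_s=0.0)]))
```

Both calls end up with the same token; the only difference is that the full chain first pays for the slow probe of a source that was never going to work.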

Besides the workaround above, you can also pass the same credential type argument to %azure.connect commands (which probably weren't timing out for you, but would probably also have taken a long time due to this issue):

    %azure.connect "<resourceId>" location="" credential="CLI"

We are working on applying a fix for this in QDK, so no special credential type needs to be passed in the next version of QDK (either this month's release or next month's release)! We will close this issue once it is released.

Thanks for raising this issue!

ryanhill1 commented 1 year ago

@xinyi-joffre Thanks for your help and for your quick response!