mpharrigan opened this issue 4 years ago
Yes!
Can't we achieve the same thing using `concurrent.futures.ThreadPoolExecutor`? I.e.,
```python
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=num_workers) as executor:
    for param in params:
        executor.submit(func, *param)
```
I swear I scoured the python async docs looking for something like this!
I didn't know about this until @mrwojtek showed me the other day :stuck_out_tongue: .
`concurrent.futures.ThreadPoolExecutor` is conceptually orthogonal to the `asyncio` library, and the two aim to solve slightly different problems. By design, `asyncio` is a single-threaded library that allows concurrent execution of Python code. `ThreadPoolExecutor` allows actual parallel execution, where different functions run on different threads. This doesn't matter much for Python code, since it's protected by a global lock anyway, but it matters a lot in our case, where network I/O happens.
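To make the distinction concrete, here is a minimal, generic sketch of the asyncio side (placeholder code, not Engine APIs; `fetch_result` and its simulated I/O are made up for illustration):

```python
import asyncio

async def fetch_result(job_id: str) -> str:
    # Stand-in for waiting on a remote service's response.
    await asyncio.sleep(1.0)
    return f"result-for-{job_id}"

async def main():
    # All ten "jobs" wait on I/O concurrently, on a single thread.
    results = await asyncio.gather(*(fetch_result(f"job-{i}") for i in range(10)))
    print(results)  # takes ~1 second total, not ~10

asyncio.run(main())
```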
Matthew, your code looks nice, but it depends on how the `func` callback is implemented. If all `func` instances run on the same thread, they'll block on the same I/O operation. Could you give an example of what `func` looks like in your case?
I'd be curious to see what you put in `func` as well, @mpharrigan - the parallel execution will help only when we are network-bound / waiting on the remote service's response. Do you have stats on this?
Before introducing any parallel/concurrent processing primitive, we should figure out under what circumstances it helps and whether we can leverage existing features. It would be interesting to study the latency/throughput of our jobs depending on job size (circuit depth, parameters, measurements, etc.).
`func` is a call to the quantum engine. You need a variant of `EngineSampler` that has an async `run` method where, instead of blocking while polling for a job to complete, you yield. When I had a bunch of 50k-shot jobs to run, it gave almost a 2x speedup (heuristic) through a combination of keeping the engine-side queue warm and pipelining client-side classical processing. Now, batching might give you a bigger performance boost, but it still puts the onus on the developer to batch circuits into appropriately sized chunks and doesn't pipeline the client-side processing.
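The async `EngineSampler` variant isn't spelled out in the thread; a rough sketch of that polling structure, with hypothetical `submit_job` / `job_is_done` / `job_results` calls standing in for the real Engine client, could look like:

```python
import asyncio
import cirq

class AsyncEngineSampler:
    """Sketch only: `submit_job`, `job_is_done`, and `job_results` are
    hypothetical stand-ins for the real Engine client; the point is the
    yield-instead-of-block polling structure."""

    def __init__(self, engine_client, poll_interval_s: float = 1.0):
        self._client = engine_client
        self._poll_interval_s = poll_interval_s

    async def run_async(self, circuit: cirq.Circuit, repetitions: int) -> cirq.Result:
        job = self._client.submit_job(circuit, repetitions)  # hypothetical call
        while not self._client.job_is_done(job):             # hypothetical call
            # Yield to the event loop instead of blocking the thread, so other
            # jobs can be submitted or post-processed while this one waits.
            await asyncio.sleep(self._poll_interval_s)
        return self._client.job_results(job)                  # hypothetical call
```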
Really, it would be sweet if we had an auto-batcher that uses async trickery to build up a local queue until it collects enough jobs to batch up and send.
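A very rough sketch of what such an auto-batcher might look like (nothing below is an existing cirq API; `run_batch` is a hypothetical callable that submits a list of circuits in one request):

```python
import asyncio

class AutoBatcher:
    """Collects individually submitted circuits into batches before sending.

    `run_batch` is a hypothetical callable taking a list of circuits and
    returning a list of results, one per circuit, in the same order.
    """

    def __init__(self, run_batch, batch_size: int = 10, max_wait_s: float = 0.5):
        self._run_batch = run_batch
        self._batch_size = batch_size
        self._max_wait_s = max_wait_s
        self._queue: asyncio.Queue = asyncio.Queue()

    async def run(self, circuit):
        """Enqueue one circuit and wait for its result."""
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((circuit, fut))
        return await fut

    async def worker(self):
        """Background task: drain the queue into batches and dispatch them."""
        while True:
            batch = [await self._queue.get()]
            deadline = asyncio.get_running_loop().time() + self._max_wait_s
            # Keep collecting until the batch is full or max_wait_s elapses.
            while len(batch) < self._batch_size:
                timeout = deadline - asyncio.get_running_loop().time()
                if timeout <= 0:
                    break
                try:
                    batch.append(await asyncio.wait_for(self._queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            # For simplicity this calls run_batch synchronously; a real version
            # might hand it off to a thread or an async client.
            results = self._run_batch([circuit for circuit, _ in batch])
            for (_, fut), result in zip(batch, results):
                fut.set_result(result)

# Usage (inside a running event loop), with your own run_batch and circuits:
#   batcher = AutoBatcher(run_batch=my_run_batch)
#   worker_task = asyncio.create_task(batcher.worker())
#   result = await batcher.run(my_circuit)
```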
Has this been superseded by the feature testbed stuff, @mpharrigan?
That would be a natural place for it, although it doesn't exist yet. Probably blocked by #5023
@verult Do we think this feature is obsolete now that we have streaming?
Sometimes you have a whole host of jobs to run. Instead of submitting lots of them and filling up the queue, you could do one at a time. But you can save on latency / classical processing overhead by keeping a respectful queue. I've been using this function
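The function itself isn't included here; a minimal sketch of the same idea, assuming a hypothetical async `run_async(circuit)` coroutine and capping in-flight jobs with an `asyncio.Semaphore`, might look like:

```python
import asyncio

async def run_many(circuits, run_async, max_in_flight: int = 4):
    """Run every circuit while keeping at most `max_in_flight` jobs submitted.

    `run_async` is a hypothetical coroutine (e.g. the async EngineSampler
    variant discussed above) that submits one circuit and awaits its result.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def run_one(circuit):
        async with semaphore:  # waits (by yielding) once the cap is reached
            return await run_async(circuit)

    # Results come back in the same order as `circuits`.
    return await asyncio.gather(*(run_one(c) for c in circuits))
```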
Would something like this be welcome inside cirq.google?