Yes, definitely. I'm still trying to figure out the best way to manage the process.
I want to set a time limit too so that expensive queries don't cause performance problems.
Useful demo (in ipython):
import asyncio

proc = await asyncio.create_subprocess_shell(
    "sleep 5; echo 'hello'; " * 5,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
)
while True:
    print(await proc.stdout.readline())
This works as expected for the first 25 seconds, then goes into an infinite loop outputting b'' - once the process has exited, readline() just returns an empty bytestring forever. Running proc.kill() seems to do the right thing against this too.
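For completeness, the loop can also stop on its own by treating b'' as EOF - a minimal sketch of the same demo with that check added:

import asyncio

proc = await asyncio.create_subprocess_shell(
    "sleep 5; echo 'hello'; " * 5,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
)
while True:
    line = await proc.stdout.readline()
    if line == b"":
        # readline() returns an empty bytestring at EOF - stop instead of spinning
        break
    print(line)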
I need asyncio.wait_for() for a time limit:
https://docs.python.org/3/library/asyncio-task.html#asyncio.wait_for
try:
    await asyncio.wait_for(eternity(), timeout=1.0)
except asyncio.TimeoutError:
    print('timeout!')
import asyncio

proc = await asyncio.create_subprocess_shell(
    "sleep 5; echo 'hello'; " * 5,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
)
try:
    await asyncio.wait_for(proc.stdout.readline(), timeout=1.0)
except asyncio.TimeoutError:
    print('timeout!')
    proc.kill()
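One extra detail worth flagging (an aside, not something I've measured here): after proc.kill() it's good practice to await proc.wait() so the event loop reaps the child and it doesn't linger as a zombie. A sketch of the same demo with that added:

import asyncio

proc = await asyncio.create_subprocess_shell(
    "sleep 5; echo 'hello'; " * 5,
    stdout=asyncio.subprocess.PIPE,
    stderr=asyncio.subprocess.PIPE
)
try:
    await asyncio.wait_for(proc.stdout.readline(), timeout=1.0)
except asyncio.TimeoutError:
    print('timeout!')
    proc.kill()
    # Reap the killed process so it doesn't stick around as a zombie
    await proc.wait()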
This seems to do the job in local testing:
import asyncio
import json

async def run_ripgrep(pattern, path, time_limit=3.0, max_lines=1000):
    proc = await asyncio.create_subprocess_exec(
        "rg",
        pattern,
        path,
        "--json",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )

    async def inner(results):
        # Read JSON lines until EOF or until we have collected enough
        while True:
            line = await proc.stdout.readline()
            if line == b"":
                break
            results.append(json.loads(line))
            if len(results) > max_lines:
                break

    results = []
    time_limit_hit = False
    try:
        await asyncio.wait_for(inner(results), timeout=time_limit)
    except asyncio.TimeoutError:
        time_limit_hit = True
        proc.kill()
    # We should have accumulated some results anyway
    return results, time_limit_hit
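To try it out in ipython (hypothetical pattern and path, purely for illustration):

results, time_limit_hit = await run_ripgrep("hello", "/tmp", time_limit=3.0)
print(len(results), time_limit_hit)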
You stop reading after max_lines, but you let the process keep running and wait for it to finish. I haven't run it, or Datasette for that matter; I was just curious and thought I would get this clarified.
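If that's a concern, one possible adjustment (a sketch, not what the code above does) is to move the proc.kill() out of the except branch so the process is terminated however inner() exits:

    try:
        await asyncio.wait_for(inner(results), timeout=time_limit)
    except asyncio.TimeoutError:
        time_limit_hit = True
    # Kill unconditionally: covers the timeout, the max_lines break and
    # normal EOF. If the process has already exited this can raise
    # ProcessLookupError, which is safe to swallow here.
    try:
        proc.kill()
    except ProcessLookupError:
        pass
    return results, time_limit_hit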