civisanalytics / civis-python

Civis API Python Client
BSD 3-Clause "New" or "Revised" License
34 stars 26 forks source link

Reference SQL script from io.read___ #335

Closed ghost closed 4 years ago

ghost commented 4 years ago

Hey there,

Trying to keep my code redundancy in check.

Currently, I'm copy/pasting SQL code into Jupyter notebooks and running it with civis.io.read_civis_sql(). I assign that to a variable sql which gets passed into the first argument (incorrectly marked 'optional' by the way).

https://github.com/civisanalytics/civis-python/blob/945ca4c94aee5200ba7f28145556bec30e0e46fa/civis/io/_tables.py#L221-L223

Really expected to see some sort of functionality that would let me pass either the name of a saved script that I've got access to, or the ID/Slug from the URL that I'd use in, say, the Job Scheduler

image

Am I missing this somewhere? Or is this functionality not yet available?

Thanks!

elsander commented 4 years ago

Hi! There is functionality to kick off SQL jobs. If you have an existing SQL script, you can run something like this:

from civis import APIClient

my_sql_job_id = '123456'
client = APIClient()
client.scripts.post_sql_runs(my_sql_job_id)

Here's the relevant documentation: https://civis-python.readthedocs.io/en/latest/api_resources.html#civis.resources._resources.Scripts.post_sql_runs

Does that address what you're trying to do?

elsander commented 4 years ago

I'll also make a fix for that incorrect "optional" argument in the documentation that you pointed out. Thanks for reporting!

elsander commented 4 years ago

There isn't a single convenience function to do this, but you can do it without interacting with the path directly. Here's a code snippet:

import civis
client = civis.APIClient()
script_id = 123456

response = client.scripts.post_sql_runs(id=script_id)
# create a future to check when the sql script finishes
future = civis.futures.CivisFuture(client.scripts.get_sql_runs, (script_id, response.id))
# wait for the sql script to finish, then get the file id for the output file
output_file_id = future.result()['output'][0]['file_id']
df = civis.io.file_to_dataframe(output_file_id)
elsander commented 4 years ago

Happy to help! I'll close this ticket.

elsander commented 4 years ago

In case it's helpful, the documentation on futures has more details: https://civis-python.readthedocs.io/en/stable/user_guide.html#civis-futures