tmthyjames / SQLCell

SQLCell is a magic function for the Jupyter Notebook that executes raw, parallel, parameterized SQL queries with the ability to accept Python values as parameters and assign output data to Python variables while concurrently running Python code. And *much* more.
MIT License

Pandas Integration #94

Closed Tooblippe closed 6 years ago

Tooblippe commented 6 years ago

Hi, great work. This can become something great.

Is pandas integration done? How would one obtain the results in a dataframe?

tmthyjames commented 6 years ago

Hey thanks!

Yes, pandas integration is done: you can return the query results as a dataframe and call any pandas method on it.

For example, to get a dataframe you'd do:

%%sql MAKE_GLOBAL=my_results
SELECT * FROM my_table

And now my_results should be a pandas dataframe.

Let me know if something isn't working for you and I'll try to resolve the issue.

To get the raw data (that is, the SQLAlchemy result set as a list of named tuples), you'd do:

%%sql MAKE_GLOBAL=my_results RAW=True
SELECT * FROM my_table

However, with #93 I'm thinking of changing this and removing the RAW parameter altogether. MAKE_GLOBAL would then always return the raw data, and if you wanted a dataframe you'd just do

my_results.df

where .df is a property of a class I created that inherits from list. It converts the raw data to a dataframe on the spot, so you don't have to run the query again to get a dataframe, or import pandas and convert the data manually.
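To make the idea concrete, here is a minimal sketch of what such a list subclass could look like. It is not SQLCell's actual code: the names ResultList and Row are hypothetical, and the rows are assumed to be named tuples, as in the raw result set described above.

```python
from collections import namedtuple


class ResultList(list):
    """Hypothetical stand-in for a list subclass with a .df property."""

    @property
    def df(self):
        # Imported lazily so the raw list still works without pandas installed.
        import pandas as pd

        # Assumption: rows are named tuples, so the first row supplies
        # the column names. An empty result gives an empty dataframe.
        columns = list(self[0]._fields) if self else None
        return pd.DataFrame.from_records(list(self), columns=columns)


# Stand-in for what MAKE_GLOBAL=my_results might hand back.
Row = namedtuple("Row", ["id", "name"])
my_results = ResultList([Row(1, "a"), Row(2, "b")])

# The object still behaves like a plain list of named tuples.
first_name = my_results[0].name  # "a"
```

With this in place, my_results.df builds the dataframe from rows already in memory, so the query is not executed a second time.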

tmthyjames commented 6 years ago

@Tooblippe, if you don't have any other concerns about this feature, I'll close this for now.