kylebarron / stata_kernel

A Jupyter kernel for Stata. Works with Windows, macOS, and Linux.
https://kylebarron.dev/stata_kernel/
GNU General Public License v3.0

Calling Python functions #325

Open arnold-c opened 4 years ago

arnold-c commented 4 years ago

Hi,

This isn't so much a feature request as a question. Is there currently a way to use Python in select instances whilst using stata_kernel? For example, when using ipystata in a Python 3 kernel, it is possible to use the %%stata magic to run a Stata command in a particular cell. I know SoS Notebook exists for multi-language notebooks, but if there were a built-in magic in the kernel so that graphs could be built with seaborn in Python, for example, that would be a much better workflow when the majority of the analysis is done in Stata.

Thanks, Callum

kylebarron commented 4 years ago

That was never a goal with this project. This project keeps all the data in Stata, enabling you to work with large amounts of data without having to move data back and forth between Stata and Python after every cell as ipystata does.

All of the magics are essentially custom Python commands that bring data into Python, reformat it in some way, and then display it to the user. Since the kernel itself is written in Python, the two ways to work with data are either to 1) run a Stata command, record the text output, and convey that text to the user, or 2) move the data to Python, work with the data in Python, and generate output text from Python to send to the user.

Magics like %%head stay fast by moving only a very small amount of data by default. Otherwise, if you had a 100GB dataset in Stata and you tried to %%head the whole thing, it would have to write the entire dataset to disk, read it all into Python, and print it in the kernel.
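To illustrate why capping the row count matters, here is a minimal Python sketch. It is not how stata_kernel implements its magics; `head_rows` and `fake_export` are hypothetical names, and the generator only stands in for a large exported dataset. The point is that the reader stops after the first few rows, so the cost does not grow with the size of the data.

```python
import csv
from itertools import islice


def head_rows(lines, n=10):
    """Return the header and at most n data rows from an iterable of CSV lines.

    Hypothetical helper, not part of stata_kernel: it only illustrates that
    capping the row count keeps the transfer small regardless of dataset size.
    """
    reader = csv.reader(lines)
    header = next(reader)
    # islice stops pulling from the reader after n rows, so the rest of the
    # input is never generated or parsed.
    return header, list(islice(reader, n))


def fake_export(total_rows):
    """Stand-in for a large exported dataset; yields CSV lines lazily."""
    yield "id,price\n"
    for i in range(total_rows):
        yield f"{i},{i * 10}\n"


header, rows = head_rows(fake_export(1_000_000), n=3)
print(header)
print(rows)
```

Only four lines are ever produced by the generator here, even though it could yield a million rows.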

So because there's no good way to do general data work in Python without moving all the data to Python, this functionality was never built into the kernel. It is best handled by a separate script that works with data exported from Stata.
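A minimal sketch of such a separate script, assuming the data has already been written to a CSV file from Stata (for instance with `export delimited`). The file name, the column names `foreign` and `price`, and the sample rows are made up for illustration, and `mean_by_group` is a hypothetical helper. Only the standard library is used so the example runs anywhere.

```python
import csv
import os
import tempfile
from collections import defaultdict
from statistics import mean


def mean_by_group(csv_path, group_col, value_col):
    """Average value_col within each level of group_col in a CSV file."""
    groups = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            groups[row[group_col]].append(float(row[value_col]))
    return {name: mean(values) for name, values in groups.items()}


# Stand-in for a file written by Stata; these rows are invented, not real data.
sample = "foreign,price\nDomestic,4000\nDomestic,6000\nForeign,7000\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "exported.csv")
    with open(path, "w", newline="") as f:
        f.write(sample)
    result = mean_by_group(path, "foreign", "price")

print(result)
```

The aggregated values, or the raw rows, could then be passed to a plotting library such as seaborn, which is the use case raised in the original question.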