kylebarron / stata_kernel

A Jupyter kernel for Stata. Works with Windows, macOS, and Linux.
https://kylebarron.dev/stata_kernel/
GNU General Public License v3.0
266 stars, 57 forks

Shout-out from Stata Corp. #387

Open amichuda opened 3 years ago

amichuda commented 3 years ago

Stata 17 has the pystata package which lets users run Stata from python. Guess who they acknowledged?!

https://www.stata.com/python/pystata/ack.html

I think the package is closed source, so they didn't really follow the spirit of your package, but still pretty cool!

Once again, really great job on this package. From what I've seen in my research and at other institutions (at least in economic development work), the stata kernel has made a splash!

mcaceresb commented 3 years ago

Well, if it actually used any of the code here they'd have to publish it, right? I'm assuming they must have re-done it from scratch.

amichuda commented 3 years ago

Yes, I think that's why they used the word "inspired," because otherwise I think you'd have a lawsuit against them if they used your code in closed-source software, right? (Not a lawyer, so I have no idea.)

But not even sure how the software is being handled.

kylebarron commented 3 years ago

This is interesting. Some thoughts:

mcaceresb commented 3 years ago

@kylebarron I would assume it's using the existing python interface they introduced in Stata 16? My assumption is that the python data and Stata data are separate; I don't see how else it could work, at least from skimming the docs. My assumption is:

They might use frames to cache some data, but I can't imagine that by default they copy every dataset created in Python into Stata and vice versa (i.e. without the user telling the kernel to do it).

mcaceresb commented 3 years ago

  • It's curious that they're integrating so much with Python... I imagine (hope) there will be some users whom this introduces to Python's huge data science ecosystem, and who then say "why am I paying so much for Stata, when I see I can do everything I need here in Python". But clearly StataCorp thinks this integration will be positive for them :man_shrugging:

For their core demographic, which is social scientists with high switching costs, I don't know that this will make much of a difference either way. I assume Stata is betting that this will encourage enough newcomers to stick around. At least the ones that don't might, as you say, get exposed to Python instead of unhappily languishing in Stata.

roblem commented 3 years ago

I have this on order and will report back on how things are happening. Two comments unrelated to the inner workings of the Stata Corp python module:

  1. The new pricing model which requires annual subscriptions is ridiculously expensive
  2. In my own research I never use Stata (I use tensorflow and jax), but for the models I teach in an upper-level econometrics class, statsmodels isn't there yet and R is too clunky, as every model we cover has a different API, which isn't going to work for my students. Since all of my colleagues use and teach with Stata, it would be unfair to students for me to force another workflow on them, although with the new prices I am having a difficult time seeing how my university can pay for all of these subscriptions.

roblem commented 3 years ago

Have been testing this out this morning (on linux) having just upgraded to Stata 17. Observations:

  1. No syntax highlighting (although fenced stata codeblocks in markdown cells are highlighted)
  2. No completions
  3. Stata must be running as a background process since variables and the dataset exist across codeblocks, although a ps -ef | grep stata doesn't show anything.
  4. Copying python data into stata using %%stata -d some_dataframe_from_python creates a static copy of the python object/data that is not updated if the underlying python data changes.
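
The static-copy behaviour described in item 4 can be mimicked in plain Python. This is only a sketch of the semantics, not how pystata works internally: `SnapshotSession` and its `load` method are hypothetical stand-ins for the Stata side, and the rows are made-up illustration data.

```python
import copy


class SnapshotSession:
    """Hypothetical stand-in for a Stata session; not part of pystata."""

    def __init__(self):
        self.data = None

    def load(self, rows):
        # Deep-copy at load time so the session owns an independent snapshot.
        self.data = copy.deepcopy(rows)


# Made-up rows standing in for a dataframe on the Python side.
python_rows = [{"price": 4099, "mpg": 22}, {"price": 4749, "mpg": 17}]

session = SnapshotSession()
session.load(python_rows)  # analogous to passing the data across once

# Later changes on the Python side do not reach the snapshot.
python_rows[0]["price"] = 0
python_rows.append({"price": 3799, "mpg": 22})
snapshot_unchanged = (
    session.data[0]["price"] == 4099 and len(session.data) == 2
)

# An explicit re-load is needed to pick up the changes.
session.load(python_rows)
reloaded_matches = session.data == python_rows
```

Under this model, anything that modifies the Python data after the hand-off has to be followed by another explicit load before the other side sees it.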

The only advantage of the stata corp way is the mixing of stata and python in a single notebook, which I don't believe is possible with stata_kernel.