Open amichuda opened 3 years ago
Well, if it actually used any of the code here they'd have to publish it, right? I'm assuming they must have re-done it from scratch.
Yes, I think that's why they used the word "inspired," because otherwise I think you'd have a lawsuit against them if they used your code in closed-source software, right? (Not a lawyer, so I have no idea.)
But not even sure how the software is being handled.
This is interesting. Some thoughts:
Yeah probably not using any of our code. Don't think they'd be dumb enough to include a GPL3 dependency in their code...
It's hard to know much about their code because only their stata_setup module is public.
Because it's not pip-installable, you need to tell Python users to change their PYTHONPATH every time so they can import the package. I'm sure there'll be a ton of support requests about Python not finding the pystata import.
It doesn't define a Jupyter kernel. Instead it uses plain Python and just defines a few IPython magics. But the code is all running inside the regular Python kernel.
Since it doesn't define its own kernel, I'm curious if they're able to maintain data state on the Stata side. From stata_setup and their example, it looks like they maintain a running Stata session. Is it in a subprocess? How does Stata keep data in sync on the Python and Stata sides? Seems like it would be a pain in the butt for users to have to keep track of "is my Python data the same as my Stata data".
sys.path.append(os.path.join(path, 'utilities'))
from pystata import config
config.init(edition)
I assume data isn't persisted on the Python side and sent to Stata every time a Stata command is called... That would be a horribly slow experience for large data.
It's curious that they're integrating so much with Python... I imagine (hope) there will be some users for whom this introduces them more to Python's huge data science ecosystem, and then say "why am I paying so much for Stata, when I see I can do everything I need here in Python". But clearly StataCorp thinks this integration will be positive for them 🤷‍♂️
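To make the PYTHONPATH point above concrete, here is a minimal sketch of what a user would have to do by hand before importing pystata. It mirrors the sys.path.append line from the stata_setup excerpt; the helper name add_stata_utilities_to_path and the install location in the usage note are hypothetical, and nothing here is taken from StataCorp's actual code beyond that excerpt.

```python
import os
import sys


def add_stata_utilities_to_path(stata_root):
    """Append <stata_root>/utilities to sys.path so that pystata can be imported.

    Hypothetical helper: it only mirrors the sys.path.append call quoted
    from stata_setup. Returns the directory that is now on sys.path.
    """
    utilities = os.path.join(stata_root, "utilities")
    if not os.path.isdir(utilities):
        # Fail early with a clearer message than a later ImportError.
        raise FileNotFoundError(f"No 'utilities' directory under {stata_root!r}")
    if utilities not in sys.path:
        sys.path.append(utilities)
    return utilities
```

With a real installation, something like add_stata_utilities_to_path("/usr/local/stata17") (path assumed for illustration) would have to run in every new Python session before from pystata import config can succeed, which is the friction a pip-installable package would avoid.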
@kylebarron I would assume it's using the existing python interface they introduced in Stata 16? My assumption is that the python data and Stata data are separate; I don't see how else it could work, at least from skimming the docs. My assumption is:
They might use frames to cache some data, but I can't imagine that by default they copy every dataset created in Python into Stata and vice versa (i.e. without the user telling the kernel to do it).
- It's curious that they're integrating so much with Python... I imagine (hope) there will be some users for whom this introduces them more to Python's huge data science ecosystem, and then say "why am I paying so much for Stata, when I see I can do everything I need here in Python". But clearly StataCorp thinks this integration will be positive for them 🤷‍♂️
For their core demo, which is social scientists with high switching costs, I don't know that this will make such a big difference either way. I assume Stata is betting that this will encourage enough newcomers to stick around. At least the ones that don't might, as you say, get exposed to Python instead of unhappily languishing in Stata.
I have this on order and will report back on how things are happening. Two comments unrelated to the inner workings of the Stata Corp python module:
Have been testing this out this morning (on linux) having just upgraded to Stata 17. Observations:
ps -ef | grep stata doesn't show anything.
%%stata -d some_dataframe_from_python creates a static copy of the Python object/data that is not updated if the underlying Python data changes.
The only advantage of the StataCorp way is the mixing of Stata and Python in a single notebook, which I don't believe is possible with stata_kernel.
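The "static copy" behaviour described above can be illustrated in plain Python. This is only an analogy for the observed semantics, assuming that %%stata -d takes a one-time snapshot; it does not use pystata, and the variable names are made up for the example.

```python
import copy

# Stand-in for a Python-side dataset (a real session would more likely use a pandas DataFrame).
python_data = [{"id": 1, "wage": 10.0}, {"id": 2, "wage": 12.5}]

# Assumed analogue of `%%stata -d some_dataframe_from_python`: a one-time deep copy.
stata_side = copy.deepcopy(python_data)

# A later change on the Python side does not reach the copy.
python_data[0]["wage"] = 99.0

print(python_data[0]["wage"])  # 99.0
print(stata_side[0]["wage"])   # 10.0
```

If the copy works this way, the data would need to be passed to Stata again after each change on the Python side to keep the two in sync.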
Stata 17 has the pystata package, which lets users run Stata from Python. Guess who they acknowledged?! https://www.stata.com/python/pystata/ack.html
I think the package is closed source, so they didn't really follow the spirit of your package, but still pretty cool!
Once again, really great job on this package. From what I've seen in my research and at other institutions (at least in economic development work), the Stata kernel has made a splash!