rstudio / reticulate

R Interface to Python
https://rstudio.github.io/reticulate
Apache License 2.0

Questions about use_condaenv() and whether or not it's safe to use the full conda environment #1015

Open mrjc42 opened 3 years ago

mrjc42 commented 3 years ago

Hi,

So a number of us have noticed that after we call use_condaenv(), we get access not only to the Python that is present in that environment, but ALSO to any other command-line tools installed in it (provided that we first run something in that Python, so that it gets lazily bound to our session).

And this is actually extremely powerful, because it means that we can install CLI tools into conda environments and then get access to them from RStudio notebooks. We can run some R code, then process a file using anything that runs on Linux, and then have another step that continues from that point on, etc.

BUT: there are some things that lead me to think that this awesomeness is actually an accidental 'side effect' of how reticulate binds Python. For one thing, I cannot ever let go of an environment once I have bound it to a specific Python. After I have called both use_condaenv() and anything from Python, there does not seem to be any way to tell reticulate to 'let go' so that I could attach a different environment later on in my notebook.

The other issue is (of course) that I have no way to sidestep the lazy binding mechanic for Python. IOW, when using this "feature", I normally want to tell the session, in one command, both that I am keen to use the environment and that I mean it. Currently, I have to indicate that I want the environment using use_condaenv() and then separately call some Python code before I can get access to the other stuff in the environment. The fact that this is so inconvenient seems like a big hint that only Python was intended to be supported in this way. It also suggests that by using this "feature" I might be asking for trouble down the road...

But if I am right and this was not the intention, then I have to say that this more limited design seems unambitious. The real world of data science is really MUCH bigger than just R and Python. So here are my questions:

1) Am I right about this "feature" being just an accidental side effect?
2) If I am right about that, is there any chance you guys might embrace making this a real feature?
3) What should I tell all the people I know who want to use this "feature"? Should I just tell them to stop using RStudio and instead look more closely at toolchains like SoS? Or are you guys already thinking about supporting it?

I know that you guys are probably always busy. But it seems like this might be an opportunity to reach a much wider audience. You currently have a MUCH nicer IDE than tools like Jupyter are providing. So I don't want to migrate/train people away from it if we don't have to. But people also need to be able to use a greater diversity of tools.

Curious to hear your thoughts.

Cheers,

Marc

kevinushey commented 3 years ago

This is intentional: after reticulate has bound itself to a particular Python instance, it is unable to unload and reload a separate Python instance within the same session -- it is necessary to restart the R session if you'd like to use a different copy of Python.

If you want to use multiple versions of Python within the same R instance, you could do so by (for example) running the code you'd like to execute within an R sub-process; e.g. using the callr package. (You could have each separate R sub-process bind to their own separate copies of Python.)
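The sub-process idea can be illustrated from the Python side as well. The following is only a sketch of the general pattern (one fresh process per interpreter), not the callr API and not anything reticulate does internally. The helper name `run_in_interpreter` and the environment labels are hypothetical, and `sys.executable` stands in for the paths of two different environments' Python executables so that the example runs anywhere.

```python
import subprocess
import sys


def run_in_interpreter(python_exe, code):
    """Run a code snippet in a fresh process using the given Python executable."""
    result = subprocess.run(
        [python_exe, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the snippet fails
    )
    return result.stdout.strip()


# Hypothetical mapping of labels to interpreters. In practice each value would
# be the python executable of a different environment; sys.executable is used
# here only as a placeholder.
interpreters = {"env_a": sys.executable, "env_b": sys.executable}

snippet = "import sys; print(sys.version_info.major)"
versions = {
    name: run_in_interpreter(exe, snippet) for name, exe in interpreters.items()
}
```

Because every call starts a new process, no interpreter is ever loaded into the parent session, so nothing needs to be unbound afterwards. The cost is that data has to cross the process boundary as text or files.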

mrjc42 commented 3 years ago

Thanks Kevin,

So when you say it's "intentional", are you referring only to the unbinding/binding of Python, or do you also mean the way that use_condaenv() gives me awesome access to other CLI tools (ones that have nothing to do with Python) in that conda environment? Here I am looking to discern your intentions for how we use the software with respect to "other" CLI tools...

Thanks again,

Marc

kevinushey commented 3 years ago

In this case, both -- the intention is that binding reticulate to an Anaconda environment should behave similarly to what a user might expect via calling conda activate in a terminal session to use a particular Anaconda Python instance.
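For readers wondering why non-Python CLI tools become visible, the sketch below models the most relevant effect of conda activate in a terminal: the environment's executable directory is put at the front of PATH. This is a simplified model under stated assumptions, not reticulate's implementation. It assumes a POSIX layout (`<prefix>/bin`), ignores the other things conda activate does (such as running activation scripts), and the helper names `activated_environ` and `find_tool` are hypothetical.

```python
import os
import shutil


def activated_environ(env_prefix, base_environ=None):
    """Return a copy of the environment with env_prefix's bin dir first on PATH."""
    environ = dict(os.environ if base_environ is None else base_environ)
    # Assumption: POSIX layout, where executables live in <prefix>/bin.
    bin_dir = os.path.join(env_prefix, "bin")
    old_path = environ.get("PATH", "")
    environ["PATH"] = bin_dir + (os.pathsep + old_path if old_path else "")
    # conda activate also records the active prefix in CONDA_PREFIX.
    environ["CONDA_PREFIX"] = env_prefix
    return environ


def find_tool(name, environ):
    """Resolve a command name against the PATH of the given environment."""
    return shutil.which(name, path=environ["PATH"])
```

With an environment dictionary built this way, any executable installed in the environment resolves by name, whether or not it has anything to do with Python. The same dictionary could be passed as `env=` to `subprocess.run` to launch such a tool.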

mrjc42 commented 3 years ago

Awesome! Thank you for clarifying that.

mrjc42 commented 3 years ago

One more question: is there any way to activate one of these environments WITHOUT binding to Python?