e-kotov / rJavaEnv

Java Environments for R Projects
http://www.ekotov.pro/rJavaEnv/
Other
12 stars 1 forks source link

Interface that doesn't affect `.Rprofile` #44

Closed hadley closed 3 months ago

hadley commented 3 months ago

I think it would be useful to have an interface that doesn't affect the .Rprofile, since as soon as you put anything in that file your user ~/.Rprofile will not be run. I'm imagining something like rJavaEnv::use_java(21) which would download java 21 and then set the appropriate environment variables.

Since, as you say, it's not possible to switch java versions within a session, this function might need to error if JAVA_HOME is already set.

e-kotov commented 3 months ago

Hi @hadley , indeed, I did not think of that. But that's also how renv operates, doesn't it? It creates .Rprofile in the current project and therefore also prevents ~/.Rprofile from being sourced. Not blaming renv, but I was going for replicating the behaviour and did not think this through myself.

I guess one way to easily fix this, is to check if ~/.Rprofile exists and add

source("~/.Rprofile")

to the .Rprofile in the current project/working directory. Seems to me that should fix things. Though if a user creates ~/.Rprofile later, that would fail, but then we can add the file.exists() check in the .Rprofile. Problem solved, I guess. Though still not clear if we guessed what the user wanted.

As for the functions, There is a function rJavaEnv::java_env_set() that does not affect .Rprofile if we run it as:

rJavaEnv::java_env_set("session", java_home = "path/to/java")

And we acquire this "path/to/java" from java_install(), which itself gets the distribution path from java_download(). The steps are detailed here: https://www.ekotov.pro/rJavaEnv/articles/rJavaEnv-step-by-step.html . And this is intended for advanced users.

Also, java_install() has autoset_java_env argument that sets the installed Java in both current session and .Rprofile if TRUE. If FALSE a user can use java_env_set() as mentioned above. Again, this is intended for advanced users who want to control everything.

Your suggestion rJavaEnv::use_java(21) is close to what i have now in rJavaEnv::java_quick_install(21). It calls java_install() with autoset_java_env=TRUE and therefore sets the env vars in current session and in .Rprofile. And that last bit is perhaps undesirable...

What I did not get from your suggestion, is where do you expect rJavaEnv::use_java(21) to set the env vars, if not in .Rprofile (or perhaps in .Renviron..?)? Just in the current session? Then I would see rJavaEnv::use_java(21) to act like that:

  1. Check if requested java distrib is downloaded/cached
  2. Download disturb if missing
  3. Extract and link to current/project if not linked yet
  4. Set env vars for this java in current session only

In that case, we only go through steps 1-3 once on the first run and on the next runs of the script with rJavaEnv::use_java(21) we only really set the env vars in the current session, which should be fast. So that would be a less awkward alternative to manually setting paths for java_env_set above. Is that something you had in mind?

e-kotov commented 3 months ago

Since, as you say, it's not possible to switch java versions within a session, this function might need to error if JAVA_HOME is already set.

This is only partially true. Until JAVA is initialised for the first time in the session with rJava::.jinit(), either directly by the user, or by some function in rJava-dependent package, we can change JAVA_HOME as much as we like. But indeed, I should consider this scenario for use cases with targets and/or callr or other cases, when the script starts a child R process and will first pick up the .Rprofile from the working dir.

I think we should be fine, as long as this pro-user (as I suspect someone using targets and/or callr is rather advanced) uses rJavaEnv::java_env_set("session", java_home = "path/to/java") in the beginning of the script before they run anything that triggers Java initialisation. But I will think more about this... Sounds like your suggestion rJavaEnv::use_java(21) would fit well with this use case, if you did mean what I described in the end of the previous comment.

I was actually planning to write a new vignette on how to use rJavaEnv with targets, so when I do, I will think more about better design.

hadley commented 3 months ago

Yes, you've correctly imputed exactly how I want use_java() to work 😄.

For a little more context, I'm not convinced that renv is the model that you want to follow because I think there are two important cases for java config:

(Instead of looking at the JAVA_HOME env var you could instead look at rJava:::.jniInitialized, then you could generate a nice error message if the user attempts to reset. You are right that this is something that only advanced users are likely to try deliberately, but it's easy to do by accident and it's always nice to be able to provide an clear and actionable error message).

e-kotov commented 3 months ago

Yes, you've correctly imputed exactly how I want use_java() to work 😄.

  • Give me a snippet of code I can put at the top of a script so that it works when anyone runs it.

👍 nice, I like this idea and feature request, putting it on my to-do list.

@hadley out of curiosity, how did you find about the package? 😳 It only went live yesterday and I did not even have the time to do any promotion.

hadley commented 3 months ago

@topepo mentioned it on our internal slack.

e-kotov commented 3 months ago

Instead of looking at the JAVA_HOME env var you could instead look at rJava:::.jniInitialized

great tip, thanks!

hadley commented 3 months ago

To avoid CRAN check notes about :::, you can use getFromNamespace(".jniInitialized", "rJava"), but you should write the code defensively so it doesn't error if that variable is no longer defined. I'm happy to review a PR if you want.

e-kotov commented 3 months ago

I'm happy to review a PR if you want.

Sure, here's one https://github.com/e-kotov/rJavaEnv/pull/46

e-kotov commented 3 months ago

@hadley latest github version from main branch should work as expected, including the warning if {rJava} is already initialised. But not an error, as we don't know if the user is going to use {rJava}. Maybe it was initialised, but in fact the user will use a package that uses Java from command line with system/system2 calls, like {opentripplanner} does, which is not affected by {rJava} init.

e-kotov commented 3 months ago

@hadley the updated version with no-consent use_java() is on CRAN now.

hadley commented 3 months ago

Great, thanks!

e-kotov commented 2 months ago

(Instead of looking at the JAVA_HOME env var you could instead look at rJava:::.jniInitialized, then you could generate a nice error message if the user attempts to reset. You are right that this is something that only advanced users are likely to try deliberately, but it's easy to do by accident and it's always nice to be able to provide an clear and actionable error message).

Apparently, using rJava:::.jniInitialized turned out to be a bad idea. Unfortunately, when calling rJava:::.jniInitialized (or also with getFromNamespace(".jniInitialized", "rJava")), that triggered some kind of pre-initialization of Java paths without initialising {rJava}... So any consequent initialisation of {rJava} ignored any new JAVA_HOME paths. rJavaEnv::java_check_version_rjava() passed and provided the correct Java version after use_java(21), but this was because rJavaEnv::java_check_version_rjava() ran the check in a clean sub-process, not in the current R session. Meanwhile, after setting Java in the current session with use_java(21) or withjava_quick_install(21)did not allow any actual{rJava}`-dependent package to use this version 21 or any other, except for the one that was detected at R session start.

@hadley so FYI, there's a fix on CRAN v0.2.2, highly recommended update. That new version only checks if {rJava} is installed at all and if it is loaded. If {rJava} is loaded, there is a chance that the user already tried to initialise it and they get a warning about it. If {rJava} is installed but is not attached, then the user has definitely NOT initialised it and therefore no warning is provided.

hadley commented 2 months ago

Oooh I forgot that would trigger rJava:::.onLoad(). I think as you figured out you could avoid that problem by first calling isNamespaceLoaded("rjava").

e-kotov commented 2 months ago

Oooh I forgot that would trigger rJava:::.onLoad(). I think as you figured out you could avoid that problem by first calling isNamespaceLoaded("rjava").

@hadley well, I'm using "rJava" %in% loadedNamespaces() == TRUE

https://github.com/e-kotov/rJavaEnv/blob/5bc3feaad84155964037fca7797ccef564fd319f/R/java_env.R#L87-L92

but according to my tests, there is no way to avoid this lock on paths after touching the rJava:::.jniInitialized by any means. Unless, I somehow force-detach {rJava} after rJava:::.jniInitialized... I will have a look and maybe implement the init-check in the next release if this works out, but for now it seemed safer for me not to touch it at all and only check if {rJava} was attached/loaded.

hadley commented 2 months ago

Yeah, you can simplify both if conditions to just isNamespaceLoaded("rjava").