allenai / ScienceWorld

ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.
https://sciworld.apps.allenai.org/
Apache License 2.0
199 stars 24 forks source link

Use py4j launch_gateway to launch Java server. #30

Closed MarcCote closed 1 year ago

MarcCote commented 1 year ago

Right now, when creating a ScienceWorldEnv object, a Java server needs to be started via a call to subprocess.Popen .

This PR proposes to use py4j's helper function to launch and manage the java server on its own. The benefits are two folds:

  1. There's no need to manually assign a port. The proposed code will automatically find one available and use it.
  2. When the main Python process terminates, crashes, or gets interrupted, the associated java process is automatically killed hence avoiding creating zombie processes.

This PR contains breaking changes. Instantiating a ScienceWorldEnv object no longer needs threadNum, nor launchServer.
Also, this PR makes issue #21 irrelevant.

I based my code on https://www.py4j.org/advanced_topics.html#using-py4j-without-pre-determined-ports-dynamic-port-number

aphedges commented 1 year ago

Thank you very much for this work! I've been running off this branch for the past week or so, and it's been working well. I've been able to remove heuristics used to choose ports, and there are no longer any zombie Java processes. The change to package.sh will be very useful as well and you got to it before I could do something similar. I'm looking forward to this being merged.

This is a minor problem, but I wonder if you know how to use a debugger with this. I was previously using the following patch, but it doesn't work anymore with the new architecture:

diff --git a/scienceworld/scienceworld.py b/scienceworld/scienceworld.py
index 4291c39..20655ba 100644
--- a/scienceworld/scienceworld.py
+++ b/scienceworld/scienceworld.py
@@ -67,7 +67,7 @@ class ScienceWorldEnv:
     # Launches the PY4J server
     def launchServer(self, serverPath):
         print("Launching ScienceWorld Server (Port " + str(self.portNum) + ") -- this may take a moment.")
-        cmd = "nohup java -cp " + serverPath + " scienceworld.runtime.pythonapi.PythonInterface " + str(self.portNum) + " >/dev/null 2>&1 &"
+        cmd = "nohup java -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005 -cp " + serverPath + " scienceworld.runtime.pythonapi.PythonInterface " + str(self.portNum) + " >/dev/null 2>&1 &"

         subprocess.Popen(cmd, cwd=BASEPATH, shell=True)
         # The sleep command here is to give time for the server process to spawn.
MarcCote commented 1 year ago

I got the green light from Peter and I was planning to merge it today :). I'm afraid you will have to rebase your PRs though.

That's a debugger for the JVM? I'll test it out later today, to see what we can do about it.

aphedges commented 1 year ago

I got the green light from Peter and I was planning to merge it today :).

Good to hear!

I'm afraid you will have to rebase your PRs though.

I'm not surprised. I can do it easily, though.

That's a debugger for the JVM? I'll test it out later today, to see what we can do about it.

I don't entirely understand it because I'm not very familiar with the Java and Scala ecosystems, but I believe it sets up a built-in JDK debug server that I can attach an external debugger (e.g., IntelliJ) to. It's better than print-line debugging, but if you don't know how to handle it, don't worry about getting it to work. I was just hoping it's something you had figured out already.