Closed OliviaViessmann closed 1 year ago
Hi @OliviaViessmann I'm trying to take a look at this but I'm struggling to get Pytorch3d working on my dev machine.
Did you run into any issues?
Nope, I didn't. It runs fine for me. No issues with Pytorch3d on my end.
If you check out the PR, the ports should now be configurable via ProteinMeshConfig. I suppose you could zip the configs with the relevant ports with the PDBs and pass them both as args to create_mesh.
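A minimal sketch of the pairing suggested above, assuming each worker simply gets the next port counting up from a base. The helper name pair_pdbs_with_ports, the base port 9200 and the file names are made up for illustration; only pymol_port, ProteinMeshConfig and create_mesh come from this thread, and they appear in comments so the sketch runs without graphein installed.

```python
from typing import List, Tuple


def pair_pdbs_with_ports(
    pdb_files: List[str], base_port: int = 9200
) -> List[Tuple[str, int]]:
    """Give every PDB file its own port, counting up from base_port."""
    ports = range(base_port, base_port + len(pdb_files))
    return list(zip(pdb_files, ports))


# Hypothetical file names, used only to show the pairing.
jobs = pair_pdbs_with_ports(["1abc.pdb", "2xyz.pdb", "3def.pdb"])

for pdb_file, port in jobs:
    # With graphein installed, each pair would become something like:
    #   config = ProteinMeshConfig(pymol_port=port)
    #   create_mesh(pdb_file=pdb_file, config=config)
    print(pdb_file, port)
```

Each (pdb_file, port) tuple can then be handed to a worker as a single argument, so no two workers share a port.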
Hi a-r-j,
thanks a ton for looking into this and making the adaptations. I am running on the new PR and configured a port, but I think PORT=9123 is still hard-coded somewhere. Did it work for you? Am I missing anything?
Here is the code snippet I use:
pymol_commands = {"pymol_commands": ["set surface_quality, 2", "show surface"]}
pymol_config = ProteinMeshConfig(**pymol_commands, pymol_port=9999)
verts_x, faces_x, aux = create_mesh(pdb_file=pdb_file_x, config=pymol_config)
I put a print statement in get_obj_file() to double-check that the port is set, but somewhere it spins up a pymol session with the default setting, because I get:
xml-rpc server running on host localhost, port 9123
A PyMOL RPC server is already running.
xml-rpc server running on host localhost, port 9123
Yep, I missed a spot!
Thanks!!
Has this resolved the issue @OliviaViessmann? If so, I will merge the PR shortly.
Also, if you could share a short snippet I can turn into a test that would be super helpful :)
It is half working. It now runs in parallel, but not on the ports specified; instead the ports increment from 9123 up. Here is a minimal snippet with port printouts:
import random
import socket


def is_port_in_use(port: int) -> bool:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex(("localhost", port)) == 0


def func(pdb_file: str):
    pymol_commands = {
        "pymol_commands": [
            "show surface",
        ]
    }
    # Draw random ports until one is found that nothing is listening on.
    port = random.randint(1025, 65535)
    while is_port_in_use(port=port):
        port = random.randint(1025, 65535)
    print(port)
    pymol_config = ProteinMeshConfig(**pymol_commands, port=port)
    verts, faces, aux = create_mesh(pdb_file=pdb_file, config=pymol_config)
    return verts


def main():
    parallel_iter = Parallel(n_jobs=8).it(
        delayed(func)(pdb_file) for pdb_file in pdb_files
    )
Here is an example printout of ports and pymol outputs:
6379
xml-rpc server running on host localhost, port 9124
xml-rpc server running on host localhost, port 9125
xml-rpc server running on host localhost, port 9126
xml-rpc server running on host localhost, port 9127
xml-rpc server could not be started
xml-rpc server could not be started
xml-rpc server could not be started
xml-rpc server could not be started
9004
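As an alternative to probing random ports as in the snippet above, a common approach is to bind to port 0 and let the operating system pick a free port. The sketch below is a general technique, not something graphein or PyMOL provides, and find_free_port is a name chosen here. Another process could still grab the port between this call and the moment the server starts, so the result is a good candidate rather than a guarantee.

```python
import socket


def find_free_port(host: str = "localhost") -> int:
    """Return a port that was free at the moment of the call."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Port 0 asks the OS to choose any unused port.
        s.bind((host, 0))
        return s.getsockname()[1]


port = find_free_port()
print(port)
```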
Thanks!! Hmm, I'll try to check it out this week. A quick heads-up though: the config param added in #262 is pymol_port rather than port.
Sorry, yes, mistake on my end. I have the correct version running with pymol_port = port -- I just did a crappy job of copy/pasting with a manual edit...
I am also printing the port inside the graphein create_mesh() function with print("pymol port: ", config.pymol_port), and it is correctly set in there, but it still spins up servers on the 912x ports:
pymol port: 34873
xml-rpc server could not be started
pymol port: 9007
xml-rpc server running on host localhost, port 9125
pymol port: 9124
xml-rpc server running on host localhost, port 9126
xml-rpc server running on host localhost, port 9125
Thanks for looking into it!
I did some digging and this looks like a pymol limitation, rather than a graphein limitation:
We need to be able to set the port on the pymol listener and, sadly, we don't have easy access to it. Also, the max retries limits the number of servers you can run.
I suppose one way to go is to patch your local pymol install. You could, for example, set the port via an env var that pymol would read instead of the hardcoded 9123, and make the following modification to the Graphein viewer class:
class MolViewer(object):
    def __init__(self, host=HOST, port=PORT):
        self.host = host
        self.port = int(port)
        self._process = None

    def __del__(self):
        self.stop()

    def __getattr__(self, key):
        if not self._process_is_running():
            self.start(["-cKQ"])
        return getattr(self._server, key)

    def _process_is_running(self):
        return self._process is not None and self._process.poll() is None

    def start(self, args=("-Q",), exe="pymol"):
        """Start the PyMOL RPC server and connect to it

        Start simple GUI (-xi), suppress all output (-Q):
            >>> viewer.start(["-xiQ"])

        Start headless (-cK), with some output (-q):
            >>> viewer.start(["-cKq"])
        """
        if self._process_is_running():
            print("A PyMOL RPC server is already running.")
            return
        assert isinstance(args, (list, tuple))
        ########################## CHANGE HERE
        env = os.environ.copy()
        env["PYMOL_XMLRPC_PORT"] = str(self.port)
        self._process = subprocess.Popen([exe, "-R"] + list(args), env=env)
        ########################## END CHANGE
        self._server = Server(uri="http://%s:%d/RPC2" % (self.host, self.port))
        # wait for the server
        while True:
            try:
                self._server.bg_color("white")
                break
            except IOError:
                time.sleep(0.1)

    def stop(self):
        if self._process_is_running():
            self._process.terminate()

    def display(self, width=0, height=0, ray=False, timeout=120):
        """Display PyMol session

        :param width: width in pixels (0 uses current viewport)
        :param height: height in pixels (0 uses current viewport)
        :param ray: use ray tracing (if running PyMOL headless, this parameter
            has no effect and ray tracing is always used)
        :param timeout: timeout in seconds

        Returns
        -------
        fig : IPython.display.Image
        """
        from IPython.display import Image, display
        from ipywidgets import IntProgress

        progress_max = int((timeout * 20) ** 0.5)
        progress = None
        filename = tempfile.mktemp(".png")
        try:
            self._server.png(filename, width, height, -1, int(ray))
            for i in range(1, progress_max):
                if os.path.exists(filename):
                    break
                if progress is None:
                    progress = IntProgress(min=0, max=progress_max)
                    display(progress)
                progress.value += 1
                time.sleep(i / 10.0)
            if not os.path.exists(filename):
                raise RuntimeError("timeout exceeded")
            return Image(filename)
        finally:
            if progress is not None:
                progress.close()
            try:
                os.unlink(filename)
            except:
                pass
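The change above only exports the variable; the patched pymol side still has to read it. Here is a rough sketch of what that read could look like, together with a check that a child process really sees the value passed through env. PYMOL_XMLRPC_PORT is the name used in the change above, and nothing in this thread suggests that stock PyMOL reads it. The helper names port_from_env and port_seen_by_child are made up for illustration, and a plain Python child process stands in for pymol.

```python
import os
import subprocess
import sys

ENV_VAR = "PYMOL_XMLRPC_PORT"  # name taken from the change above
DEFAULT_PORT = 9123  # default port seen in the logs in this thread


def port_from_env(environ=None) -> int:
    """Read the port from the environment, falling back to the default."""
    environ = os.environ if environ is None else environ
    return int(environ.get(ENV_VAR, DEFAULT_PORT))


def port_seen_by_child(port: int) -> int:
    """Start a child Python process and report the port it finds in its env."""
    env = os.environ.copy()
    env[ENV_VAR] = str(port)
    child_code = "import os; print(os.environ['PYMOL_XMLRPC_PORT'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


print(port_from_env({}))
print(port_seen_by_child(9999))
```

The fallback keeps the default behaviour whenever the variable is not set, so a patched install would behave like an unpatched one for other users.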
Alternatively, I also came across this which seems to be a similar RPC component that reads from an env var.
Ahhhh, the lines you sent totally explain the behaviour of the ports being set between 9123 and 9128.
Ok, I might try out the local pymol patch. Or give up and manually throw this on a bunch of batch machines. Not sure what will end up being faster :)
Thanks for the workaround suggestion -- if I end up trying it I will report back on it!
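The behaviour discussed here (servers starting at 9123 and climbing, then failing once the range is exhausted) can be illustrated with a toy model. This is not PyMOL's code: the retry count of 5 is an assumption chosen so the range ends at 9128 as mentioned above, and pick_port is a hypothetical helper.

```python
from typing import List, Optional, Set

BASE_PORT = 9123  # default port seen in the logs above
MAX_RETRIES = 5  # assumption: chosen so the range ends at 9128


def pick_port(
    occupied: Set[int], base: int = BASE_PORT, max_retries: int = MAX_RETRIES
) -> Optional[int]:
    """Try base, base + 1, ... and give up after max_retries increments."""
    for port in range(base, base + max_retries + 1):
        if port not in occupied:
            return port
    return None  # corresponds to "xml-rpc server could not be started"


occupied: Set[int] = set()
results: List[Optional[int]] = []
for _ in range(8):  # eight workers, as in n_jobs=8
    port = pick_port(occupied)
    results.append(port)
    if port is not None:
        occupied.add(port)

print(results)
# [9123, 9124, 9125, 9126, 9127, 9128, None, None]
```

Under this model the requested port never enters the picture, and the number of workers that get a server is capped by the retry limit.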
Want me to close this as "not planned"?
Keen to hear how it goes :)
Describe the bug
I would like to run graphein's create_mesh() function in parallel on multiple workers. I assume that I need to spin up multiple pymol sessions (MolViewer()), one for each worker, each specifying a dedicated PORT. However, I am not sure how to set this from the "outside" -- this might actually be a feature request.
To Reproduce
Steps to reproduce the behavior: trying to run something like
Parallel(n_jobs = 8).it(create_mesh)(pdb) for pdb in pdbs
This gets stuck if run naively.
Expected behavior
I want to specify ports for each worker so that I can run pymol sessions on each of them.