Installation and experiment replication [MacOS (M1/ARM)]

RMichae1 commented 1 year ago

Hello,

I'm having trouble replicating the exact environment and results described in the readme.md and requirements.txt. Namely, running the commands one by one ends in an error at pip install -r requirements.txt --upgrade. The following changes were required to complete the installation:

Changing torchvision from 0.11.1 to 0.11.2.
Removing the strict version requirement on vina. There seems to be a bug in one of its __init__.py files.
Installing the Rust compiler, because installing tokenizers fails without it.
Pinning protobuf to 3.20.x or lower, which running the examples shows to be necessary.

Most of these issues appear to be specific to MacOS (>=13.5.*) on M1/ARM; linux64-based systems don't have them, and there one can install torchvision==0.11.1 and vina as specified. The protobuf error, however, also occurs on Linux (see below).
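A quick preflight check for the two platform conditions mentioned above could look roughly like this. It is only a sketch: the helper name preflight and the keys it returns are made up for illustration and are not part of the lambo repository.

```python
import platform
import shutil


def preflight():
    """Report the platform details relevant to the install problems above.

    Hypothetical helper for illustration, not part of the lambo repository.
    """
    system = platform.system()    # "Darwin" on MacOS, "Linux" on Linux
    machine = platform.machine()  # "arm64" on M1/ARM Macs
    return {
        "system": system,
        "machine": machine,
        "apple_silicon": system == "Darwin" and machine == "arm64",
        # Installing tokenizers failed here unless the Rust compiler was available.
        "rustc_found": shutil.which("rustc") is not None,
    }


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

If apple_silicon is true and rustc_found is false, the tokenizers step is the one most likely to fail first.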

Once the environment is set up, we want to run the protein optimization task at commit 431b052 (add LSBO comparison notebook), as in:

python scripts/black_box_opt.py optimizer=lambo optimizer.encoder_obj=mlm task=proxy_rfp tokenizer=protein surrogate=multi_task_exact_gp acquisition=ehvi trial_id=2

This runs into nan values during the computation; see the error below (for completeness I've attached the log of the complete run).

For the sake of replicability, we ran this on a Linux system set up as close to requirements.txt as possible (I've also attached the environment as linux_env.txt).

Protobuf

TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
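
The two workarounds in the message above can be combined in a small guard that runs before anything imports google.protobuf. This is a sketch under assumptions: the function names are hypothetical, and the version cutoff is taken directly from the error text (3.20.x or lower is treated as fine).

```python
import os
from importlib import metadata


def protobuf_needs_workaround(version):
    """Return True if `version` is newer than the 3.20.x line."""
    # Pad so that a bare "4" still yields a (major, minor) pair.
    parts = (version.split(".") + ["0"])[:2]
    major, minor = (int(part) for part in parts)
    return (major, minor) > (3, 20)


def apply_protobuf_workaround():
    """Enable the pure-Python protobuf fallback if the installed version is too new.

    Hypothetical helper; call it before any module imports google.protobuf.
    Returns True if the environment variable workaround was applied.
    """
    try:
        installed = metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return False  # protobuf is not installed, nothing to work around
    if protobuf_needs_workaround(installed):
        # Workaround 2 from the error message: slower, but avoids a reinstall.
        os.environ.setdefault("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION", "python")
        return True
    return False
```

Downgrading protobuf (workaround 1) remains the faster option at runtime; the environment variable is mainly useful when the pinned environment cannot be changed.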

Experiment Run

[2023-09-28 15:07:36,464][root][ERROR] - Expected parameter logits (Tensor of shape (16, 230, 26)) of distribution Categorical(logits: torch.Size([16, 230, 26])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         ...,
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan]],

        [[nan, nan, nan,  ..., nan, nan, nan],
         [nan, nan, nan,  ..., nan, nan, nan],

test_run.log linux_env.txt

RMichae1 commented 1 year ago

Given that this commit is well ahead of the original submission and paper, I checked out commit SHA 22afec26da0b9ea1810e65f8a60ea7988c021cef. There, the algorithm stages (optimizing candidates and querying the objective function) appear to run correctly, without the previously encountered error of nan logits in the Categorical distribution.

Perhaps somewhere between this particular commit and the latest main commit the way the (discrete) MT-GP posterior gets sampled broke?

samuelstanton commented 1 year ago

Thanks for the detailed issue. When I was writing the code for this paper, the MTGP features of GPyTorch and BoTorch were under active development, which is why the requirements file is pinned to that specific commit. I briefly tried removing the requirement, but last I checked it seemed like a PR to one or both of those libraries would be needed. In LaMBO-2 I actually abandoned GPs in favor of partial deep ensembles, and I'm hoping to open-source that code sometime in the nearish future.

https://arxiv.org/abs/2305.20009

samuelstanton commented 8 months ago

Hi @RMichae1 just wanted to follow up and let you know that the open-source alpha release of LaMBO-2 is live :)

https://github.com/prescient-design/cortex

samuelstanton / lambo

Installation and experiment replication [MacOS (M1/ARM)] #12
