Open LanternsSea opened 4 weeks ago
Do you have an example input where it fails to capture chirality?
Also, do you encode chirality in your SMILES input?
Here is the example. I took the Isomeric SMILES from RCSB PDB and checked that its chirality is correct, but the chirality in the predicted result is wrong.
from pathlib import Path
import numpy as np
import torch
from chai_lab.chai1 import run_inference
# We use fasta-like format for inputs.
# - each entity encodes protein, ligand, RNA or DNA
# - each entity is labeled with unique name;
# - ligands are encoded with SMILES; modified residues encoded like AAA(SEP)AAA
# Example given below, just modify it
example_fasta = """
>protein|name=a
MTETILAAQIEVGEHHTATWLGMTVNTDTVLSTAIAGLIVIALAFYLRAKVTSTDVPGGVQLFFEAITIQM
RNQVESAIGMRIAPFVLPLAVTIFVFILISNWLAVLPVQYTDKHGHTTELLKSAAADINYVLALALFVFVC
YHTAGIWRRGIVGHPIKLLKGHVTLLAPINLVEEVAKPISLSLRLFGNIFAGGILVALIALFPPYIMWAPN
AIWKAFDLFVGAIQAFIFALLTILYFSQAMELEEEHH
>protein|name=c1
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>protein|name=c2
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>protein|name=c3
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>protein|name=c4
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>protein|name=c5
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>protein|name=c6
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>protein|name=c7
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>protein|name=c8
DPTIAAGALIGGGLIMAGGAIGAGIGDGVAGNALISGVARQPEAQGRLFTPFFITVGLVEAAYFINLAFM
ALFVFATPV
>ligand|name=bdq
CN(C)CC[C@@](c1cccc2c1cccc2)([C@H](c3ccccc3)c4cc5cc(ccc5nc4OC)Br)O
""".strip()
fasta_path = Path("./example_fasta")
fasta_path.write_text(example_fasta)
output_dir = Path("./outputs")
candidates = run_inference(
    fasta_file=fasta_path,
    output_dir=output_dir,
    # 'default' setup
    num_trunk_recycles=3,
    num_diffn_timesteps=200,
    seed=42,
    device=torch.device("cuda:0"),
    use_esm_embeddings=True,
)
cif_paths = candidates.cif_paths
agg_scores = [rd.aggregate_score for rd in candidates.ranking_data]
# Load pTM, ipTM, pLDDTs and clash scores for sample 2
scores = np.load(output_dir.joinpath("scores.model_idx_2.npz"))
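On the question of whether the input encodes chirality at all, a quick text-level check of the ligand line can rule out the simplest failure (stereo tags lost before the FASTA is written). This is only a sketch using the standard library: `chiral_atoms` is a hypothetical helper, not part of `chai_lab`, and it only confirms that `@`/`@@` tags are present in the string. It does not verify that the tags correspond to the intended R/S configuration, and it says nothing about the predicted structure.

```python
import re

# Bracket atoms that carry a tetrahedral chirality tag, e.g. [C@@], [C@H].
CHIRAL_ATOM = re.compile(r"\[[^\]]*@[^\]]*\]")


def chiral_atoms(smiles: str) -> list[str]:
    """Return the bracket atoms in `smiles` that contain an @ or @@ tag.

    Hypothetical helper for illustration; a purely textual check.
    """
    return CHIRAL_ATOM.findall(smiles)


# Ligand SMILES copied from the example input above.
bdq = "CN(C)CC[C@@](c1cccc2c1cccc2)([C@H](c3ccccc3)c4cc5cc(ccc5nc4OC)Br)O"
tags = chiral_atoms(bdq)
print(tags)  # ['[C@@]', '[C@H]']
```

For this ligand the check finds two tagged stereocenters, so the input string itself does carry chirality information.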
Hi all, could you solve this problem?
Hi, I am impressed by your achievements; this is fantastic work.
However, I am encountering a problem: the chirality of the ligand in the results seems to be incorrect. Is there a way to provide ligand information for prediction through an SDF file or another format instead of SMILES?