chemosim-lab / ProLIF

Interaction Fingerprints for protein-ligand complexes and more
https://prolif.readthedocs.io
Apache License 2.0

Save interaction atom number #209

Closed: chenbq18 closed this 5 days ago

chenbq18 commented 5 days ago

Before version 1.0.0 (I think), I could use fp.to_dataframe(return_atoms=True) to save the atom numbers of the interactions between the ligand and the protein. In the latest version (I use v2.0.0), the return_atoms parameter has been removed. I can see that ifp.metadata stores all the detailed information, but I don't know how to save it to a dataframe. Is there a simple method to save the interaction atom numbers? Thank you!

cbouy commented 5 days ago

Hi @chenbq18

You can use the following function to do this:

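import pandas as pd
import prolif as plf
from typing import Literal, Optional

def get_atom_indices_dataframe(
    fp: plf.Fingerprint,
    indices_type: Literal["indices", "parent_indices"] = "indices",
    all_indices: Optional[bool] = None,
) -> pd.DataFrame:
    # By default, follow the fingerprint's own setting: keep every occurrence
    # of an interaction if it was built with count=True, else only the first.
    if all_indices is None:
        all_indices = fp.count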

    if all_indices:

        def get_indices(metadata_tuple, moltype):
            return tuple(
                [metadata[indices_type][moltype] for metadata in metadata_tuple]
            )

    else:

        def get_indices(metadata_tuple, moltype):
            return metadata_tuple[0][indices_type][moltype]

    data = []
    index = []
    for i, frame_ifp in fp.ifp.items():
        index.append(i)
        frame_data = {}
        for (ligres, protres), residue_ifp in frame_ifp.items():
            for int_name, metadata_tuple in residue_ifp.items():
                for moltype in ("ligand", "protein"):
                    frame_data[(str(ligres), str(protres), int_name, moltype)] = (
                        get_indices(metadata_tuple, moltype)
                    )
        data.append(frame_data)

    df = pd.DataFrame(data, index=pd.Index(index, name="Frame"))
    df.columns = pd.MultiIndex.from_tuples(
        df.columns, names=["ligand", "protein", "interaction", "indices"]
    )
    return df.sort_index(
        axis=1,
        level=1,
        key=lambda index: [plf.ResidueId.from_string(x) for x in index],
    )

get_atom_indices_dataframe(fp, indices_type="indices", all_indices=False)
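
If the goal is to write the atom numbers to a flat file, a long-format table (one row per interaction occurrence) may be easier to save than the wide MultiIndex dataframe above. The sketch below is only an illustration: `mock_ifp` is made-up data that imitates the nesting the function above walks through in `fp.ifp`, and `to_long_dataframe` is a hypothetical helper, not part of ProLIF. The residue names and atom numbers are invented.

```python
import pandas as pd

# Made-up stand-in for `fp.ifp`, mirroring the nesting iterated over above:
# {frame: {(ligand_residue, protein_residue): {interaction: (metadata, ...)}}}
mock_ifp = {
    0: {
        ("LIG1.G", "ASP129.A"): {
            "HBDonor": (
                {
                    "indices": {"ligand": (13, 52), "protein": (8,)},
                    "parent_indices": {"ligand": (13, 52), "protein": (1489,)},
                },
            ),
        },
    },
    1: {
        ("LIG1.G", "PHE330.A"): {
            # Two occurrences of the same interaction in one frame
            "Hydrophobic": (
                {
                    "indices": {"ligand": (4,), "protein": (7,)},
                    "parent_indices": {"ligand": (4,), "protein": (4521,)},
                },
                {
                    "indices": {"ligand": (5,), "protein": (9,)},
                    "parent_indices": {"ligand": (5,), "protein": (4523,)},
                },
            ),
        },
    },
}


def to_long_dataframe(ifp, indices_type="indices"):
    """Hypothetical helper: one row per interaction occurrence."""
    records = []
    for frame, frame_ifp in ifp.items():
        for (ligres, protres), residue_ifp in frame_ifp.items():
            for int_name, metadata_tuple in residue_ifp.items():
                for occurrence, metadata in enumerate(metadata_tuple):
                    atoms = metadata[indices_type]
                    records.append(
                        {
                            "Frame": frame,
                            "ligand": str(ligres),
                            "protein": str(protres),
                            "interaction": int_name,
                            "occurrence": occurrence,
                            # Space-separated strings survive a CSV round trip
                            "ligand_atoms": " ".join(map(str, atoms["ligand"])),
                            "protein_atoms": " ".join(map(str, atoms["protein"])),
                        }
                    )
    return pd.DataFrame(records)


long_df = to_long_dataframe(mock_ifp, indices_type="parent_indices")
csv_text = long_df.to_csv(index=False)  # pass a file path instead to write to disk
```

With a real fingerprint you would presumably pass `fp.ifp` in place of `mock_ifp`. Storing the atom numbers as space-separated strings is a design choice for CSV output; if you save with `to_pickle` instead, the tuples in the wide dataframe are preserved as they are.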

Feel free to reopen this issue if you have other related questions