biomed-AI / DiffDec

MIT License
question about the data/examples/protein.pdb #4

Open MachineGUN001 opened 9 months ago

MachineGUN001 commented 9 months ago

Hi,

Thank you for providing the example showing how to run sampling for single/multiple R-groups.

For the specific protein, I checked the PDB file at data/examples/protein.pdb, and it contains only a pocket around the target scaffold.

How did you generate this pocket PDB file?

Many thanks,

ShY

MachineGUN001 commented 9 months ago

test.zip

Based on the case you provided, I tried another protein (PDB ID: 6nsl) with the single-scaffold method via sample_single_for_specific_context.py, but an error was reported. Is it a problem with how I prepared the protein pocket? I used PyMOL to generate a pocket consisting of the residues within 8 Å of the ligand and saved it as a PDB file. Please see the attached files I used.
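A minimal sketch of the pocket selection described above (every residue with at least one atom within 8 Å of the ligand), using only the Python standard library. This is not necessarily how the DiffDec authors built data/examples/protein.pdb. The function name `extract_pocket` and the `ligand_resname` argument are hypothetical, and the ligand is assumed to be stored as HETATM records in the same PDB file as the protein.

```python
import math


def extract_pocket(pdb_text, ligand_resname, cutoff=8.0):
    """Return the ATOM lines of every residue that has an atom within
    `cutoff` angstroms of any ligand atom (hypothetical helper)."""
    lines = pdb_text.splitlines()

    def xyz(line):
        # Fixed-width PDB columns 31-38, 39-46, 47-54 hold x, y, z.
        return (float(line[30:38]), float(line[38:46]), float(line[46:54]))

    def res_key(line):
        # Chain ID, residue number, insertion code identify a residue.
        return (line[21], line[22:26], line[26])

    # Assumption: the ligand is a HETATM group with a known residue name.
    ligand = [xyz(l) for l in lines
              if l.startswith("HETATM") and l[17:20].strip() == ligand_resname]
    if not ligand:
        raise ValueError(f"no HETATM records for ligand {ligand_resname!r}")

    protein = [l for l in lines if l.startswith("ATOM")]
    # A residue is kept whole as soon as one of its atoms is close enough.
    near = {res_key(l) for l in protein
            if any(math.dist(xyz(l), p) <= cutoff for p in ligand)}
    return [l for l in protein if res_key(l) in near]
```

Writing the returned lines to a file would give a pocket-only PDB. Whether the resulting file matches what the sampling script expects is a separate question that only the repository code can answer.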

The error is below:

Preprocessing dataset with prefix exp_6nsl_test
c:\.conda\envs\diffhopp\lib\site-packages\Bio\PDB\PDBParser.py:395: PDBConstructionWarning: Ignoring unrecognized record 'END' at line 380
  warnings.warn(
Traceback (most recent call last):
  File "e:\DiffDec-master\sample_single_for_specific_context.py", line 409, in <module>
    sample(args.checkpoint, args.samples_dir, args.data_dir, args.n_samples, args.task_name, args.device)
  File "e:\DiffDec-master\sample_single_for_specific_context.py", line 315, in sample
    model.setup(stage='val')
  File "e:\DiffDec-master\src\model_single.py", line 119, in setup
    self.val_dataset = dataset_type(
  File "e:\DiffDec-master\src\datasets.py", line 285, in __init__
    self.data = CrossDockDataset.preprocess(data_path, prefix, pocket_mode, device)
  File "e:\DiffDec-master\src\datasets.py", line 305, in preprocess
    table = pd.read_csv(table_path)
  File "c:\.conda\envs\diffhopp\lib\site-packages\pandas\util\_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "c:\.conda\envs\diffhopp\lib\site-packages\pandas\util\_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "c:\.conda\envs\diffhopp\lib\site-packages\pandas\io\parsers\readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "c:\.conda\envs\diffhopp\lib\site-packages\pandas\io\parsers\readers.py", line 605, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "c:\.conda\envs\diffhopp\lib\site-packages\pandas\io\parsers\readers.py", line 1442, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "c:\.conda\envs\diffhopp\lib\site-packages\pandas\io\parsers\readers.py", line 1753, in _make_engine
    return mapping[engine](f, **self.options)
  File "c:\.conda\envs\diffhopp\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 79, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 554, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file

Many thanks for your suggestions.
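The traceback above ends in `pd.read_csv(table_path)` raising `EmptyDataError`, which pandas raises when the file has no content to parse. One way to narrow this down is to inspect the table file before rerunning. The sketch below uses only the standard library; `inspect_table` is a hypothetical helper, and which file `table_path` points to is not shown in this thread.

```python
import csv
from pathlib import Path


def inspect_table(path):
    """Describe the state of a CSV table: missing, empty, header only,
    or the number of data rows it holds (hypothetical helper)."""
    path = Path(path)
    if not path.is_file():
        return "missing"
    with path.open(newline="") as handle:
        # Drop blank rows so a file of empty lines also counts as empty.
        rows = [row for row in csv.reader(handle) if row]
    if not rows:
        # A zero-byte file is what makes pandas raise EmptyDataError.
        return "empty"
    if len(rows) == 1:
        return "header only"
    return f"{len(rows) - 1} data rows, columns: {rows[0]}"
```

An "empty" result would suggest that an earlier preprocessing step wrote the table without any rows, so the input files to that step are the next thing to check.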

MachineGUN001 commented 9 months ago

I found that the problem is in the scaf SDF file: its format does not match the expected one. I generated a 3D SDF file, but the formatting is wrong. May I ask how you prepare the scaf SDF?
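The thread does not say which formatting difference caused the failure, so the following is only a generic sanity check for a V2000 SDF record, not a description of what DiffDec requires. `check_sdf_record` is a hypothetical helper; it looks at the counts line, the fixed-width atom block, whether the coordinates are really 3D, and the two terminators.

```python
def check_sdf_record(text):
    """Return a list of problems found in the first V2000 record of an
    SDF file (hypothetical helper; an empty list means none found)."""
    lines = [line.rstrip() for line in text.splitlines()]
    # A molfile has three header lines, then the counts line.
    if len(lines) < 4 or "V2000" not in lines[3]:
        return ["line 4 is not a V2000 counts line"]
    problems = []
    try:
        n_atoms = int(lines[3][0:3])
        # Atom block: x, y, z occupy three 10-character columns.
        coords = [tuple(float(a[i:i + 10]) for i in (0, 10, 20))
                  for a in lines[4:4 + n_atoms]]
    except ValueError:
        return ["counts line or atom block is not in fixed-width V2000 layout"]
    if len(coords) < n_atoms:
        problems.append("atom block is shorter than the counts line declares")
    elif n_atoms > 1 and all(z == 0.0 for _, _, z in coords):
        problems.append("all z coordinates are 0: coordinates look 2D")
    if "M  END" not in lines:
        problems.append("missing 'M  END' terminator")
    if "$$$$" not in lines:
        problems.append("missing '$$$$' record delimiter")
    return problems
```

Running this on both the repository's example scaffold file and a self-generated one, then comparing the results, may help show where the two files differ.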