Open ardagoreci opened 8 months ago
Could you please attach an example file that this fails on, or give its PDB ID, @ardagoreci?
This code ran for me without any errors.
from proteinflow.data import ProteinEntry
from tqdm import tqdm
import os
folder = "data/proteinflow_20230102_stable/train"
for filename in tqdm(os.listdir(folder)):
    ProteinEntry.from_pickle(os.path.join(folder, filename)).to_pdb("tmp.pdb")
Hi Liza,
While trying to create a W&B table visualization for the entire dataset, I noticed that converting the pickle files into PDB files reveals multiple bugs.
First, I got an "UnpicklingError: unpickling stack underflow" from the line "protein_entry = ProteinEntry.from_pickle(pickle_path)". It did not happen with every protein. After handling that exception, I found that PDBParser could not properly parse a few of the generated PDB files and raised an error at the line "structure = parser.get_structure(pdb_id, target_path)".
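To narrow down which files trigger either error, something like the following could help. It is only a minimal sketch: `find_failing_files` and its `convert` argument are hypothetical names introduced here, not part of proteinflow. The helper runs a caller-supplied conversion on every file in a folder and records the ones that raise, so the failing file names can be attached to the issue.

```python
import os


def find_failing_files(folder, convert):
    """Run convert(path) on every file in folder and collect the failures.

    Returns a list of (filename, exception_type_name, message) tuples.
    `convert` is any callable taking a file path; it is a hypothetical hook,
    not a proteinflow API.
    """
    failures = []
    for filename in sorted(os.listdir(folder)):
        path = os.path.join(folder, filename)
        try:
            convert(path)
        except Exception as exc:  # broad on purpose: we only want to log and move on
            failures.append((filename, type(exc).__name__, str(exc)))
    return failures


# Possible usage with the calls quoted in this thread (assumes proteinflow
# and Biopython are installed; left commented out so the helper runs alone):
#
# from proteinflow.data import ProteinEntry
# from Bio.PDB import PDBParser
#
# def convert(path):
#     ProteinEntry.from_pickle(path).to_pdb("tmp.pdb")
#     PDBParser(QUIET=True).get_structure("tmp", "tmp.pdb")
#
# for name, err, msg in find_failing_files(
#     "data/proteinflow_20230102_stable/train", convert
# ):
#     print(name, err, msg)
```

The exception type in each tuple separates the unpickling failures from the PDBParser ones, since both are caught by the same loop.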