Closed by mehedihasandesu 1 month ago
You basically need to implement a class similar to this one and use it as your dataset.
Notice that in the record_tokens function we create a unique identifier for each molecule, and in read_data we specify how to get the graph data (nodes, edges, target). Notice also how we use the smiles2graph function to convert SMILES strings into graphs.
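As a rough illustration of the two functions described above, here is a minimal sketch for a .csv file of SMILES strings and targets. It is not the repository's actual class: the name CsvSmilesDataset, the column names smiles and homolumogap, the constructor arguments, and the random_split helper are all assumptions made for this example, and no library base class is inherited. The smiles2graph argument is expected to be a callable such as ogb.utils.smiles2graph, which returns a dict with num_nodes, node_feat, edge_index and edge_feat.

```python
import csv
import random


class CsvSmilesDataset:
    """Hypothetical stand-in for a dataset class; not the repository's real base class."""

    def __init__(self, csv_file, smiles2graph,
                 smiles_col="smiles", target_col="homolumogap"):
        # csv_file: an open file object (or any iterable of lines) with a header row.
        # Column names are assumptions; change them to match your .csv file.
        rows = list(csv.DictReader(csv_file))
        self.smiles = [row[smiles_col] for row in rows]
        self.targets = [float(row[target_col]) for row in rows]
        self.smiles2graph = smiles2graph  # e.g. ogb.utils.smiles2graph

    def record_tokens(self):
        # One unique identifier per molecule; here simply the row index.
        return list(range(len(self.smiles)))

    def read_data(self, token):
        # Convert the SMILES string for this token into graph data plus its target.
        graph = self.smiles2graph(self.smiles[token])
        return {
            "num_nodes": graph["num_nodes"],
            "node_feat": graph["node_feat"],
            "edge_index": graph["edge_index"],
            "edge_feat": graph["edge_feat"],
            "target": self.targets[token],
        }


def random_split(tokens, valid_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle tokens reproducibly and cut them into train/valid/test lists."""
    tokens = list(tokens)
    random.Random(seed).shuffle(tokens)
    n_valid = int(len(tokens) * valid_frac)
    n_test = int(len(tokens) * test_frac)
    n_train = len(tokens) - n_valid - n_test
    return {
        "train": tokens[:n_train],
        "valid": tokens[n_train:n_train + n_valid],
        "test": tokens[n_train + n_valid:],
    }
```

With a real file this could be used as `with open("my_data.csv", newline="") as f: ds = CsvSmilesDataset(f, smiles2graph)`, followed by `random_split(ds.record_tokens())` to get the split indices. The file name here is only a placeholder.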
Then, if you subclass it together with mixins like GraphDataset, SVDEncodingsGraphDataset, StructuralDataset, etc., additional features will be attached automatically.
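To show the general idea of a mixin attaching extra features, here is a small, generic Python sketch. It does not reproduce how GraphDataset, SVDEncodingsGraphDataset or StructuralDataset work internally; the classes BaseMoleculeDataset, NodeDegreeMixin and MyDataset, and the node_degree feature, are hypothetical names invented for this example.

```python
class BaseMoleculeDataset:
    """Hypothetical base class: returns a minimal graph record per token."""

    def __init__(self, graphs):
        self.graphs = graphs

    def read_data(self, token):
        # Return a shallow copy so that mixins can add keys without
        # modifying the stored record.
        return dict(self.graphs[token])


class NodeDegreeMixin:
    """Hypothetical mixin: adds a per-node degree feature to each record."""

    def read_data(self, token):
        # Let the next class in the method resolution order build the record,
        # then attach the extra feature on top of it.
        item = super().read_data(token)
        degrees = [0] * item["num_nodes"]
        # Counts outgoing edges per node. If every bond is stored in both
        # directions, this equals the node degree.
        for src in item["edge_index"][0]:
            degrees[src] += 1
        item["node_degree"] = degrees
        return item


class MyDataset(NodeDegreeMixin, BaseMoleculeDataset):
    """Listing the mixin first makes its read_data wrap the base class's."""
```

Because the mixin calls super().read_data, several such mixins can be stacked in the class definition and each one adds its own feature to the record in turn.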
It's working now. Thank you so much.
You are welcome
Hi, I am currently trying to use your model with my own dataset, which is a .csv file containing SMILES strings for molecules along with their corresponding HOMO-LUMO values. My goal is to apply your model as it is, following the procedures outlined in the OGB utility scripts. However, I am having difficulty integrating my .csv file into the pipeline, particularly when it comes to data preprocessing, such as splitting the data and performing the conversions and embeddings needed to train the model. I am a beginner in machine learning, so I would greatly appreciate any guidance or suggestions to help me process my data and run the model. Thank you very much.