data processing - Githubissues

diogeneshezekiah commented 4 days ago

Hi, I couldn't understand how you extracted the data from the article and processed it into JMC_data.csv. How did you do that? Also, how could I build my own database for training? Thank you.

wzhstat commented 3 days ago

Hi! Thank you for your interest. We didn't write an algorithm to fetch data from articles automatically. We get the reaction SMILES and the corresponding reaction conditions from Reaxys. To construct a dataset, you first need the SMILES of the reactions and their reaction conditions. If the dataset does not already contain atom mappings, we map the reactions with rxnmapper (https://github.com/rxn4chemistry/rxnmapper). Then we extract the templates with the template_extractor module in RDChiral (https://github.com/connorcoley/rdchiral/blob/master/rdchiral/template_extractor.py).
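A rough sketch of that workflow in Python is below. It is not the repository's actual preprocessing script, only an illustration under assumptions: the input CSV has a column of unmapped reaction SMILES, here given the hypothetical name `rxn_smiles`, and the helper names (`split_reaction_smiles`, `map_reactions`, `extract_template`, `build_dataset`) are made up for this example. It relies on `RXNMapper.get_attention_guided_atom_maps` from rxnmapper and `extract_from_reaction` from RDChiral's template_extractor, both of which need to be installed separately.

```python
import csv


def split_reaction_smiles(rxn_smiles, rxn_id):
    """Split 'reactants>agents>products' into the dict shape that
    rdchiral's extract_from_reaction reads ('_id', 'reactants', 'products')."""
    reactants, _agents, products = rxn_smiles.split(">")
    return {"_id": rxn_id, "reactants": reactants, "products": products}


def map_reactions(rxn_smiles_list):
    """Atom-map a list of reaction SMILES with rxnmapper."""
    from rxnmapper import RXNMapper  # third-party, imported lazily

    mapper = RXNMapper()
    results = mapper.get_attention_guided_atom_maps(rxn_smiles_list)
    return [result["mapped_rxn"] for result in results]


def extract_template(mapped_rxn, rxn_id):
    """Return the template SMARTS for one mapped reaction, or None if
    RDChiral could not extract one."""
    from rdchiral.template_extractor import extract_from_reaction  # third-party

    result = extract_from_reaction(split_reaction_smiles(mapped_rxn, rxn_id))
    return (result or {}).get("reaction_smarts")


def build_dataset(in_csv, out_csv, rxn_column="rxn_smiles"):
    """Read reactions, add 'mapped_rxn' and 'template' columns, write a new CSV.

    rxn_column is a hypothetical column name; adjust it to your own export.
    All other columns (e.g. reaction conditions) are copied through unchanged.
    """
    with open(in_csv, newline="") as f:
        rows = list(csv.DictReader(f))

    mapped = map_reactions([row[rxn_column] for row in rows])
    for i, (row, mapped_rxn) in enumerate(zip(rows, mapped)):
        row["mapped_rxn"] = mapped_rxn
        row["template"] = extract_template(mapped_rxn, i) or ""

    if rows:
        with open(out_csv, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
```

With both packages installed, a call such as `build_dataset("my_reactions.csv", "my_dataset.csv")` (file names are placeholders) would produce a CSV with the mapped reactions and templates next to the original condition columns. If your data already carries atom mappings, the `map_reactions` step can be skipped.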

diogeneshezekiah commented 3 days ago

Thank you for the support, I'll let you know if it works :)

wzhstat / Reaction-Condition-Selector

data processing #2