open-mol / bioagent

Apache License 2.0

Add Specialist Performance #14

Open CiaoHe opened 1 month ago

CiaoHe commented 1 month ago

Yields

Our dataset includes Buchwald-Hartwig (BH) and Suzuki-Miyaura (SM) reactions. Splitting: following ChemLLMBench, we randomly select 100 samples each from BH and SM as the test sets, use regression as the task objective (not binary classification), and normalize the yield values to [0, 1]. BH: train 3,955; test 100. SM: train 5,760; test 100.
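The split described above can be sketched roughly as follows. This is only an illustration, not the repository's actual code: the function name `split_and_normalize`, the `(reaction, yield)` tuple layout, the fixed seed, and the assumption that raw yields are percentages in [0, 100] (so normalization is a division by 100) are all hypothetical.

```python
import random


def split_and_normalize(records, n_test=100, seed=0):
    """Randomly hold out n_test samples and scale yields to [0, 1].

    records: list of (reaction_smiles, yield_percent) tuples.
    Assumption: yield_percent lies in [0, 100].
    Returns (train, test) lists of (reaction_smiles, normalized_yield).
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(records)       # copy, leave the caller's list untouched
    rng.shuffle(shuffled)
    normalized = [(rxn, y / 100.0) for rxn, y in shuffled]
    return normalized[n_test:], normalized[:n_test]


# Placeholder data with the same total size as BH (3,955 + 100 = 4,055).
toy = [(f"rxn_{i}", float(i % 101)) for i in range(4055)]
train, test = split_and_normalize(toy)
print(len(train), len(test))  # 3955 100
```

The same call would be applied to the SM data separately, so that each reaction family gets its own 100-sample test set.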

Forward & Retro

Reaction classification

USPTO_TPL_1K

Reagent Selection

Following ChemLLMBench, we formulate reaction component selection tasks from the Suzuki High-Throughput Experimentation (HTE) dataset. The dataset evaluates the Suzuki coupling of 5 electrophiles and 7 nucleophiles across a matrix of 11 ligands (with one blank), 7 bases (with one blank), and 4 solvents. Three components:

Reaction Component Prediction

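Data from TextReact and Mol-Instruction Reagent Prediction. Because TextReact as a whole is too large, we use the MolIns-Reagent-Prediction part of the dataset to keep things balanced, and split the original dataset into reagent prediction, catalyst prediction, and solvent prediction according to the reaction condition information provided by TextReact.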

Input: canonical_rxn. The reaction condition fields are: catalyst1, solvent1, solvent2, reagent1, reagent2.
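One possible way to turn a record with these fields into the three prediction tasks is sketched below. The field names come from the list above; everything else is an assumption for illustration: the helper name `split_condition_tasks`, the output layout, skipping empty fields, and joining multiple components with `.` (the usual SMILES separator). The example record is made up.

```python
# Which condition fields feed which task (grouping is an assumption).
TASK_FIELDS = {
    "catalyst_prediction": ["catalyst1"],
    "solvent_prediction": ["solvent1", "solvent2"],
    "reagent_prediction": ["reagent1", "reagent2"],
}


def split_condition_tasks(record):
    """Build one (input, target) example per task from a single record.

    record: dict with "canonical_rxn" plus the condition fields.
    Tasks whose fields are all empty or missing are skipped.
    """
    examples = {}
    for task, fields in TASK_FIELDS.items():
        targets = [record[f] for f in fields if record.get(f)]
        if targets:
            examples[task] = {
                "input": record["canonical_rxn"],
                "target": ".".join(targets),
            }
    return examples


# Hypothetical record, for illustration only.
record = {
    "canonical_rxn": "Brc1ccccc1.OB(O)c1ccccc1>>c1ccc(-c2ccccc2)cc1",
    "catalyst1": "[Pd]",
    "solvent1": "C1CCOC1",
    "solvent2": "O",
    "reagent1": "O=C([O-])[O-].[K+].[K+]",
    "reagent2": "",
}
for task, example in split_condition_tasks(record).items():
    print(task, "->", example["target"])
```

Keeping the field-to-task mapping in one dictionary makes it easy to regroup fields if the task definitions change.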

super-dainiu commented 2 weeks ago

T5Chem issue: https://github.com/HelloJocelynLu/t5chem/issues/18#issuecomment-2168363330