Fix multiprocessing when a large number of files are being preprocessed.
Pin CacheDisk to num_worker=1, since race conditions in file access can lead to errors.
Add example configs for the most recent models, with additional explanation.
Add two additional ChEMBL datasets: chembl_v2_above_500.csv, used in the project, and chembl_superlight.csv, for fast debugging. We also added the protein splits of chembl_v2_above_500.csv.
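For context on the CacheDisk item above: a common way a disk cache hits race conditions is when one worker reads a file that another worker is still writing. The sketch below is illustrative only and is not the fix applied in this release (which limits CacheDisk to a single worker). The names atomic_cache_write and cache_read are hypothetical and not part of the project; the sketch assumes pickled cache entries and shows one general-purpose mitigation, writing to a temporary file and renaming it into place.

```python
import os
import pickle
import tempfile


def atomic_cache_write(path, obj):
    """Hypothetical helper: write obj to path without exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(obj, handle)
        # os.replace swaps the finished file into place in a single step,
        # so a concurrent reader sees either the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def cache_read(path):
    """Hypothetical helper: load a cached object written by atomic_cache_write."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Restricting the cache to one worker, as this release does, sidesteps the problem entirely at the cost of parallelism; a write-then-rename scheme like the one sketched here is one option to evaluate if multi-worker caching is revisited.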