bigchem / synthesis

Contains results and data from Augmented Transformer article

Errors encountered while running compare.pl with given dataset #1

Closed catalystforyou closed 3 years ago

catalystforyou commented 3 years ago

When I tried to test compare.pl in a Linux environment, I downloaded the code and the data, then ran the following command:

$ perl compare.pl patents_test100.csv

The script failed at line 516 with this output:

patents_test100.csv.can - processed: 0 errors: 0
Can't use an undefined value as an ARRAY reference at compare.pl line 516.

I looked at line 516 but still could not figure out why this error happened. I wonder whether I missed some parameters or whether something is wrong in the script itself. I'm not very familiar with Perl, so I would appreciate your help. Thank you so much.

HelloJocelynLu commented 3 years ago

compare.pl needs a reference file to compare against, so it has to be given two input files rather than one. You may need to run something like:

$ perl compare.pl uspto-50k/patents_test100.csv.can uspto-50k/result_patents_test100ff.csv.can 1
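
Since the failure above came from calling the script with a single file, a small wrapper that checks its inputs first may help. The following is only a sketch: `run_compare` is a hypothetical helper, not part of the repository, and it assumes each archive is named after its input file with `.xz` appended, which the thread does not confirm.

```shell
# Hypothetical wrapper (not part of the repository): validate inputs, then run compare.pl.
run_compare() {
    # compare.pl needs two input files; a third argument (best_top) is optional here.
    if [ "$#" -lt 2 ]; then
        echo "usage: run_compare TEST.can RESULT.can [best_top]" >&2
        return 2
    fi
    for f in "$1" "$2"; do
        # Assumption: the archive is named "$f.xz". -d decompresses, -k keeps the archive.
        if [ ! -f "$f" ] && [ -f "$f.xz" ]; then
            xz -dk "$f.xz" || return 1
        fi
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done
    # Default best_top to 1, as in the command above.
    perl compare.pl "$1" "$2" "${3:-1}"
}
```

Calling `run_compare uspto-50k/patents_test100.csv.can uspto-50k/result_patents_test100ff.csv.can 1` from the repository root would then run the same comparison as the command above, while a call with only one file stops with a usage message instead of reaching the Perl error.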

Remember to decompress the *.xz files first. My output:

version 2.03 23/07/2020

using as:
perl compare.pl uspto-50k/patents_test100.csv.can uspto-50k/result_patents_test100ff.csv.can 1 0 0 0 0   0  0 0
perl compare.pl uspto-50k/patents_test100.csv.can uspto-50k/result_patents_test100ff.csv.can best_top=1 isomeric=all largest=no multi=no augmentations=all beams=all canonical=all

uspto-50k/patents_test100.csv.can - processed: 489158 errors: 0
uspto-50k/result_patents_test100ff.csv.can - processed: 489158 errors: 0

uspto-50k/patents_test100.csv.can   mols: 5002 total lines:489159
uspto-50k/result_patents_test100ff.csv.can  mols: 5002 total lines:489159

empty results 1: 2002
/scratch/jl8570/synthesis   all
ERRORS= NO-RES=1 TP=2661 ALL=5001 accuracy=53.2(53.2)% for TOP=1 predictions for 5002 processed molecules
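
The accuracy in the last line appears to be TP divided by ALL, expressed as a percentage. This reading is inferred from the numbers in the log, not from the source of compare.pl, and the `accuracy` function below is a hypothetical helper for checking the arithmetic.

```shell
# Hypothetical helper: recompute the percentage from the TP and ALL counts in the log.
accuracy() {
    awk -v tp="$1" -v all="$2" 'BEGIN { printf "%.1f\n", 100 * tp / all }'
}

# Values taken from the log line above: TP=2661, ALL=5001.
accuracy 2661 5001   # prints 53.2
```

The result matches the 53.2% reported by the script for TOP=1.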

catalystforyou commented 3 years ago

Thank you for the information. I had misunderstood the documentation; the script runs correctly now.