Open marcossilva opened 2 years ago
I've not encountered that before. Can you try using num_processes=1?
It leads to
>>> train_set = USPTOCenter('train', num_processes=1)
Preparing train subset of USPTO for reaction center prediction.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 661, in __init__
    super(USPTOCenter, self).__init__(
  File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 461, in __init__
    self.load_reaction_data(path_to_reaction_file, num_processes)
  File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 523, in load_reaction_data
    mol, reaction, graph_edits = load_one_reaction(li)
  File "/home/marcos/.local/lib/python3.8/site-packages/dgllife/data/uspto.py", line 319, in load_one_reaction
    reaction, graph_edits = line.strip("\r\n ").split()
but after deleting the downloaded file from ~/.dgl/ and setting num_processes=1, it worked.
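For anyone hitting the same thing, here is a minimal sketch of the "delete the cached download" step. The helper name `clear_cached_download` is hypothetical and not part of dgllife, and the exact folder under ~/.dgl/ is not stated in this thread, so the path has to be filled in by hand.

```python
import shutil
from pathlib import Path


def clear_cached_download(cache_path):
    """Delete a cached file or directory; return True if something was removed.

    Hypothetical helper, not part of dgllife. `cache_path` is whatever
    sits under ~/.dgl/ for the dataset (the exact name is an assumption
    the caller has to supply).
    """
    cache_path = Path(cache_path).expanduser()
    if not cache_path.exists():
        return False  # nothing cached, nothing to do
    if cache_path.is_dir():
        shutil.rmtree(cache_path)  # remove the directory and its contents
    else:
        cache_path.unlink()  # remove a single cached file
    return True
```

After clearing the cache, retrying `USPTOCenter('train', num_processes=1)` is the combination that was reported to work above.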
I realized that in find_reaction_center_train.py the default value for the number of processes is 4:
parser.add_argument('-np', '--num-processes', type=int, default=4,
                    help='Number of processes to use for data pre-processing')
so running the script with the default arguments leads to this error.
Thanks. This might be hardware-specific. Perhaps we should change the default value to 1 instead. Could you open a PR to change the default value?
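As a rough sketch of what the proposed change could look like, assuming only the `add_argument` call quoted in this thread (the `build_parser` wrapper and the description string are illustrative, not the script's actual layout):

```python
import argparse


def build_parser():
    # Illustrative wrapper; the real script may construct its parser differently.
    parser = argparse.ArgumentParser(description='Reaction center training (sketch)')
    # Proposed change: default to a single process instead of 4.
    parser.add_argument('-np', '--num-processes', type=int, default=1,
                        help='Number of processes to use for data pre-processing')
    return parser


# With no flags given, the default applies.
args = build_parser().parse_args([])
print(args.num_processes)  # prints 1
```

Users who want parallel pre-processing could still pass `-np 4` explicitly.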
Hi! I'm trying to train the rexgen model in https://github.com/awslabs/dgl-lifesci/tree/master/examples/reaction_prediction/rexgen_direct but while loading the USPTO data I'm getting a pickle problem as can be seen below:
This error doesn't happen while loading the val and test sets, though. Below are my library versions: