Using backend: pytorch
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /home/glard/doping/dl4chem-mgm/src/model/graph_generator.py:19: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From /home/glard/doping/dl4chem-mgm/src/model/graph_generator.py:21: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2022-01-18 13:35:06.430977: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2022-01-18 13:35:06.451785: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3699850000 Hz
2022-01-18 13:35:06.452153: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x558cb0b482d0 executing computations on platform Host. Devices:
2022-01-18 13:35:06.452164: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2022-01-18 13:35:06.452267: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2022-01-18 13:35:06.462302: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-18 13:35:06.462441: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: NVIDIA GeForce RTX 3070 major: 8 minor: 6 memoryClockRate(GHz): 1.815
pciBusID: 0000:01:00.0
2022-01-18 13:35:06.462471: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2022-01-18 13:35:06.462490: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2022-01-18 13:35:06.462505: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2022-01-18 13:35:06.462519: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2022-01-18 13:35:06.683946: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2022-01-18 13:35:06.684160: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2022-01-18 13:35:07.202372: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2022-01-18 13:35:07.202642: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-18 13:35:07.203193: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-18 13:35:07.203657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2022-01-18 13:43:36.841675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-01-18 13:43:36.841695: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2022-01-18 13:43:36.841703: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2022-01-18 13:43:36.841802: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-18 13:43:36.841928: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-18 13:43:36.842032: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-18 13:43:36.842127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6878 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6)
2022-01-18 13:43:36.843210: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x558cd6a94070 executing computations on platform CUDA. Devices:
2022-01-18 13:43:36.843222: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): NVIDIA GeForce RTX 3070, Compute Capability 8.6
INFO - 01/18/22 13:43:37 - 0:00:00 - ============ Initialized logger ============
INFO - 01/18/22 13:43:37 - 0:00:00 - Random seed is 0
INFO - 01/18/22 13:43:37 - 0:00:00 - ar: False
batch_size: 16
binary_classification: False
bound_edges: False
check_pred_validity: False
clip_grad_norm: 10.0
cond_virtual_node: False
data_path: data/QM9/QM9_processed.p
debug_fixed: False
debug_small: False
decay_start_iter: 99999999
dim_h: 2048
dim_k: 1
do_not_corrupt: False
dump_path: dumped/
edge_mask_frac: 1.0
edge_mask_predict_frac: 1.0
edge_replace_frac: 0.0
edge_replace_predict_frac: 1.0
edge_target_frac: 0.2
edges_per_batch: -1
embed_hs: False
equalise: False
exp_id:
exp_name: QM9_experiment
first_iter: 0
force_mask_predict: True
force_replace_predict: False
fully_connected: False
gen_num_iters: 10
gen_num_samples: 0
gen_predict_deterministically: False
gen_random_init: False
global_connection: False
grad_accum_iters: 1
graph2binary_properties_path: data/proteins/pdb_golabels.p
graph_properties_path:
graph_property_names: []
graph_type: QM9
layer_norm: True
load_best: False
load_latest: False
local_cpu: False
log_train_steps: 200
loss_normalisation_type: by_component
lr_decay_amount: 0.0
lr_decay_frac: 1.0
lr_decay_interval: 9999999
mask_all_ring_properties: False
mask_independently: True
mat_N: 2
mat_d_model: 64
mat_dropout: 0.1
mat_h: 8
max_charge: 1
max_epoch: 100000
max_hs: 4
max_nodes: 9
max_steps: 10000000.0
max_target_frac: 0.8
min_charge: -1
min_lr: 0.0
model_name: GraphNN
mpnn_name: EdgesFromNodesMPNN
mpnn_steps: 4
no_edge_present_type: zeros
no_save: False
no_update: False
node_mask_frac: 1.0
node_mask_predict_frac: 1.0
node_mpnn_name: NbrEWMultMPNN
node_replace_frac: 0.0
node_replace_predict_frac: 1.0
node_target_frac: 0.2
normalise_graph_properties: False
num_batches: 4
num_binary_graph_properties: 0
num_edge_types: 5
num_epochs: 200
num_graph_properties: 0
num_mpnns: 1
num_node_types: 5
optimizer: adam,lr=0.0001
perturbation_batch_size: 32
perturbation_edges_per_batch: -1
predict_graph_properties: False
prediction_data_structs: all
pretrained_property_embeddings_path: data/proteins/preprocessed_go_embeddings.npy
property_type: None
res_conn: False
save_all: False
seed: 0
seq_output_dim: 768
share_embed: False
shuffle: True
smiles_path: None
smiles_train_split: 0.8
spatial_msg_res_conn: True
spatial_postgru_res_conn: False
suppress_params: False
suppress_train_log: False
target_data_structs: both
target_frac_inc_after: None
target_frac_inc_amount: 0
target_frac_type: random
tensorboard: True
update_edges_at_end_only: False
use_newest_edges: False
use_smiles: False
val_after: 105
val_batch_size: 2500
val_data_path: data/ChEMBL/ChEMBL_val_processed_hs.p
val_dataset_size: -1
val_edge_target_frac: 0.1
val_edges_per_batch: None
val_graph2binary_properties_path: None
val_graph_properties_path: data/ChEMBL/ChEMBL_val_graph_properties.p
val_node_target_frac: 0.1
val_seed: 0
validate_on_train: False
warm_up_iters: 1.0
weighted_loss: False
INFO - 01/18/22 13:43:37 - 0:00:00 - Running command: python train.py --data_path data/QM9/QM9_processed.p --graph_type QM9 --exp_name QM9_experiment --num_node_types 5 --num_edge_types 5 --max_nodes 9 --layer_norm --spatial_msg_res_conn --batch_size 16 --val_batch_size 2500 --val_after 105 --num_epochs 200 --shuffle --mask_independently --force_mask_predict --optimizer adam,lr=0.0001 --tensorboard
INFO - 01/18/22 13:43:37 - 0:00:00 - The experiment will be stored in dumped/QM9_experiment
INFO - 01/18/22 13:43:43 - 0:00:06 - train_loader len is 6651
INFO - 01/18/22 13:43:43 - 0:00:06 - val_loader len is 11
Starting epoch 1
0
Traceback (most recent call last):
  File "train.py", line 322, in <module>
    main(params)
  File "train.py", line 129, in main
    binary_graph_properties)
  File "/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/glard/doping/dl4chem-mgm/src/model/gnn.py", line 186, in forward
    batch_init_graph = self.mpnns[mpnn_num](...)
  File "/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/glard/doping/dl4chem-mgm/src/model/mpnns.py", line 56, in forward
    updated_nodes, updated_edges = self.mpnn_step_forward(batch_graph, step_num)
  File "/home/glard/doping/dl4chem-mgm/src/model/mpnns.py", line 77, in mpnn_step_forward_nonfc
    updated_nodes = self.node_mpnn(batch_graph)
  File "/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/glard/doping/dl4chem-mgm/src/model/node_mpnns.py", line 36, in forward
    nodes = self.update_GRU(msg, g.ndata['nodes'])
  File "/home/glard/doping/dl4chem-mgm/src/model/node_mpnns.py", line 23, in update_GRU
    _, node_next = self.gru(msg, node)
  File "/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/glard/anaconda3/envs/self/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 716, in forward
    self.dropout, self.training, self.bidirectional, self.batch_first)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
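The traceback ends in CUDNN_STATUS_EXECUTION_FAILED, while the startup lines above show CUDA 10.0 libraries (libcudart.so.10.0, libcudnn.so.7) being loaded on an RTX 3070 with compute capability 8.6. A likely cause is a toolkit/GPU mismatch: Ampere sm_86 GPUs are only targeted natively from CUDA 11.1 onward, so cuDNN kernels built against CUDA 10.0 can fail at execution time on this card. The sketch below is purely illustrative (the helper name and the version table are assumptions drawn from NVIDIA's release notes, not from this codebase); it captures the compatibility check that fails here:

```python
# Minimum CUDA toolkit version that natively targets each GPU architecture
# (assumed from NVIDIA release notes; illustrative, not exhaustive).
MIN_CUDA_FOR_SM = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (RTX 30xx, e.g. the RTX 3070 in this log)
}

def cuda_supports_gpu(cuda_version, compute_capability):
    """Return True if this CUDA toolkit natively targets the GPU (hypothetical helper)."""
    required = MIN_CUDA_FOR_SM.get(compute_capability)
    if required is None:
        return False  # unknown architecture: be conservative
    return cuda_version >= required

# The environment from this log: CUDA 10.0 libraries with an sm_86 GPU.
print(cuda_supports_gpu((10, 0), (8, 6)))  # False -> cuDNN kernels may fail
```

If this diagnosis holds, installing a PyTorch build compiled against CUDA 11.1 or later (and matching cuDNN 8.x) should resolve the execution failure on this GPU.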