Open danielm322 opened 2 years ago
What results did you reproduce? Please paste your command and results.
I used the small-scale configuration adapted to our GPU (half the batch size and half the learning rate) and took the data split directly from the confgf repo. For the drugs dataset, this is the namespace of the model:
Namespace(aux_loss=0.0, base_path='./dataset/drugs_processed', batch_size=64, beta2=0.999, checkpoint_dir='', cycle=1, data_split='confgf', dataset_name='drugs', decoder_layers=None, decoder_std=1.0, device=0, dropedge_rate=0.1, dropnode_rate=0.1, dropout=0.1, enable_tb=False, encoder_dropout=0.0, epochs=100, eval_from='./checkpoint/checkpoint_35.pt', eval_one=False, extend_edge=True, global_attn=True, global_reducer='sum', graph_pooling='sum', latent_size=256, layernorm_before=False, log_interval=100, lr=0.0001, lr_warmup=True, mlp_hidden_size=512, mlp_layers=2, node_attn=True, node_reducer='sum', noise_level=10, noise_lr=2.4e-06, noise_steps=100, num_layers=3, num_workers=1, output_pair_example=1, period=10, pred_pos_residual=True, prop_pred=False, remove_hs=True, reuse_prior=True, sample_beta=1.0, save_output=False, score=False, score_prior=False, seed=2021, shared_decoder=False, shared_output=True, sigma_begin=10.0, sigma_end=0.01, threshold=0.5, train_size=0.8, train_subset=True, use_adamw=True, use_bn=True, use_ff=True, use_layer_norm=False, vae_beta=1.0, weight_decay=0.01, workers=20)
And these are the results:
cov mean 0.8114064312174885 med 0.8665707606545585 mat mean 0.9477826952934265 med 0.9338723421096802
These clearly differ from your small-scale results on the drugs dataset.
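For readers comparing these numbers, here is a minimal sketch of how "cov" and "mat" values like the ones above are typically computed. It assumes the evaluation follows the common coverage (COV) and matching (MAT) definitions used with the confgf split: COV is the fraction of reference conformers that have at least one generated conformer within an RMSD threshold, and MAT is the mean over reference conformers of the smallest RMSD to any generated conformer. The function names, the strict `<` comparison, and the toy RMSD values are assumptions for illustration, not taken from this repository's evaluate.py.

```python
from statistics import mean, median


def cov_mat(rmsd, threshold):
    """Return (COV, MAT) for one molecule.

    rmsd[i][j] is the RMSD between reference conformer i and
    generated conformer j (assumed to be precomputed elsewhere).
    """
    # Smallest RMSD reached for each reference conformer.
    best = [min(row) for row in rmsd]
    # COV: share of reference conformers matched within the threshold.
    cov = sum(1 for b in best if b < threshold) / len(best)
    # MAT: average of those best RMSD values.
    mat = sum(best) / len(best)
    return cov, mat


def summarize(per_molecule):
    """Aggregate per-molecule (COV, MAT) pairs into mean and median."""
    covs = [c for c, _ in per_molecule]
    mats = [m for _, m in per_molecule]
    return {
        "cov_mean": mean(covs),
        "cov_med": median(covs),
        "mat_mean": mean(mats),
        "mat_med": median(mats),
    }


# Toy RMSD matrices for two hypothetical molecules (made-up values).
toy = [
    [[0.3, 0.9], [0.8, 0.6], [1.4, 1.1]],
    [[0.2, 0.4], [0.7, 0.1]],
]
scores = [cov_mat(m, threshold=0.5) for m in toy]
print(summarize(scores))
```

Under these definitions, a higher cov and a lower mat are better, so the two numbers should be read together when comparing checkpoints.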
Can you paste your training and test commands? At a glance, your test command differs from the one in our readme.md, e.g., num_layers and sample_beta. Moreover, checkpoint_35 is too early to test model performance. We train for 100 epochs and test all models with "checkpoint_94.pt".
I ran another experiment using the auxiliary loss, training with the following command:
python train.py --dropout 0.1 --use-bn --use-adamw --lr-warmup --enable-tb \
  --aux-loss 0 --num-layers 3 --lr 8e-5 --batch-size 64 --vae-beta-min 0.0001 \
  --vae-beta-max 0.01 --latent-size 256 --mlp-hidden-size 512 \
  --reuse-prior --node-attn --data-split confgf --dataset-name drugs --shared-output \
  --pred-pos-residual --base-path ./dataset/drugs_processed --rand-aug \
  --checkpoint-dir checkpoint --use-global --global-attn --extend-edge \
  --grad-norm 2 --remove-hs --ang-lam 0.1 --bond-lam 0.1
The losses plot can be seen here:
After inspecting the train, test, and validation curves, I found that the global minimum of the test loss was at checkpoint 27 and the third local minimum was at epoch 87, so I ran the following command:
python evaluate.py --dropout 0.1 --use-bn --lr-warmup --use-adamw \
  --latent-size 256 --mlp-hidden-size 512 \
  --num-layers 3 --eval-from ./checkpoint/checkpoint_87.pt --workers 20 --batch-size 64 \
  --reuse-prior --node-attn --data-split confgf --dataset-name drugs --remove-hs \
  --shared-output --pred-pos-residual --sample-beta 1.2 --base-path ./dataset/drugs_processed \
  --use-ff --global-attn --extend-edge
With the results:
cov mean 0.7548755335070557 med 0.8074606116774792 mat mean 0.9960401058197021 med 0.9822216629981995
At checkpoint 27 the results are:
cov mean 0.796436271909601 med 0.8544973544973544 mat mean 0.9489984512329102 med 0.9455164670944214
To repeat, I am trying to reproduce the small-scale results here, which is why num_layers is 3. I also reduced the learning rate because the batch size is smaller due to GPU capacity. The learning rate (8e-5) is slightly less than half of yours (2e-4) because, judging from the loss curves of my previous experiment, 1e-4 seemed too large. The same appears to hold here: the test set loss plateaus after approximately the first 30 epochs.
There are also some parameters whose correct settings I couldn't find in the paper or the supplementary materials, for example --aux-loss, --extend-edge, --global-attn, --shared-decoder, --clamp-dist, --sg-pos, --grad-norm, and --use-ss.
I'm not sure if I'm using the correct configuration for them.
Thanks in advance for the help!
Some help please? I am interested in extending this method to include other capabilities but first I want to make sure I am using it correctly
Hi Daniel, have you tried other experiments (e.g. large scale)? I'm also working on reproducing some of their results.
@danielm322 @zytzrh Please see our updated README and compare your configurations with our provided logs (specifically, the row starting with "Namespace") to reproduce our results.
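One way to do this comparison without reading the long "Namespace" rows by eye is sketched below. It assumes each row is a printed argparse Namespace containing only literal values (numbers, strings, booleans, None) and ending with the closing parenthesis. The helper names `parse_namespace` and `diff_configs` are hypothetical, and the "reference" row in the example is an illustrative placeholder, not a row copied from the provided logs.

```python
import ast

MISSING = "<missing>"


def parse_namespace(line):
    """Parse a printed 'Namespace(key=value, ...)' row into a dict."""
    start = line.index("Namespace(")
    # Parse the text as a Python call expression and read its keyword arguments.
    call = ast.parse(line[start:].strip(), mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a Namespace(...) expression")
    return {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}


def diff_configs(mine, reference):
    """Return {key: (my_value, reference_value)} for every setting that differs."""
    keys = sorted(set(mine) | set(reference))
    return {
        k: (mine.get(k, MISSING), reference.get(k, MISSING))
        for k in keys
        if mine.get(k, MISSING) != reference.get(k, MISSING)
    }


# Shortened example rows; the reference values are placeholders for illustration.
my_row = "Namespace(batch_size=64, lr=0.0001, num_layers=3, sample_beta=1.0)"
ref_row = "Namespace(batch_size=128, lr=0.0002, num_layers=3, sample_beta=1.2)"

for key, (a, b) in diff_configs(parse_namespace(my_row), parse_namespace(ref_row)).items():
    print(f"{key}: mine={a!r} reference={b!r}")
```

Pasting the full row from your own run and the full row from the log in place of the two example strings lists every differing or missing argument.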
Hello, this is Daniel, a master's student in Artificial Intelligence at Paris-Saclay University. This is great work, and thanks for sharing the code. However, the results from the paper cannot be reproduced by running the train command in the readme. Could you provide the precise command to train the model and obtain the paper results, or provide a pretrained model? Thank you, Daniel