Problems reproducing results

danielm322 commented 2 years ago

Hello, this is Daniel, a master's student in Artificial Intelligence at Paris-Saclay University. This is great work, and thanks for sharing the code. However, I cannot reproduce the results from the paper when executing the train command in the readme. Could you provide the precise command to train the model and obtain the paper's results, or provide a pretrained model? Thank you, Daniel

DirectMolecularConfGen commented 2 years ago

What results did you reproduce? Please paste your command and results.

danielm322 commented 2 years ago

I used the small-scale configuration adapted to our GPU (half the batch size and half the learning rate) and took the data split directly from the confgf repo. For the drugs dataset, this is the Namespace of the model:

Namespace(aux_loss=0.0, base_path='./dataset/drugs_processed', batch_size=64, beta2=0.999, checkpoint_dir='', cycle=1, data_split='confgf', dataset_name='drugs', decoder_layers=None, decoder_std=1.0, device=0, dropedge_rate=0.1, dropnode_rate=0.1, dropout=0.1, enable_tb=False, encoder_dropout=0.0, epochs=100, eval_from='./checkpoint/checkpoint_35.pt', eval_one=False, extend_edge=True, global_attn=True, global_reducer='sum', graph_pooling='sum', latent_size=256, layernorm_before=False, log_interval=100, lr=0.0001, lr_warmup=True, mlp_hidden_size=512, mlp_layers=2, node_attn=True, node_reducer='sum', noise_level=10, noise_lr=2.4e-06, noise_steps=100, num_layers=3, num_workers=1, output_pair_example=1, period=10, pred_pos_residual=True, prop_pred=False, remove_hs=True, reuse_prior=True, sample_beta=1.0, save_output=False, score=False, score_prior=False, seed=2021, shared_decoder=False, shared_output=True, sigma_begin=10.0, sigma_end=0.01, threshold=0.5, train_size=0.8, train_subset=True, use_adamw=True, use_bn=True, use_ff=True, use_layer_norm=False, vae_beta=1.0, weight_decay=0.01, workers=20)

And these are the results:

cov mean 0.8114064312174885 med 0.8665707606545585
mat mean 0.9477826952934265 med 0.9338723421096802

These clearly differ from your small-scale results on the drugs dataset.
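For readers unfamiliar with the two numbers reported above, here is a minimal sketch of how coverage (cov) and matching (mat) scores are commonly computed for conformation generation. It assumes the ConfGF-style definitions: for each reference conformer, take the smallest RMSD to any generated conformer; cov is the fraction of those values below a threshold, and mat is their mean. The function name `cov_mat` and the RMSD matrices are hypothetical, the threshold of 0.5 is copied from `threshold=0.5` in the Namespace above, and the repository's own evaluation code may differ in detail.

```python
from statistics import mean, median


def cov_mat(rmsd, threshold):
    """Score one molecule.

    rmsd[i][j] is the RMSD between reference conformer i and
    generated conformer j (assumed to be precomputed elsewhere).
    """
    # Distance from each reference conformer to its closest generated one.
    best = [min(row) for row in rmsd]
    # Coverage: share of reference conformers matched within the threshold.
    cov = sum(d < threshold for d in best) / len(best)
    # Matching: average distance to the closest generated conformer.
    mat = mean(best)
    return cov, mat


# Hypothetical RMSD matrices for two molecules (illustrative values only).
mol_a = [[0.3, 0.9], [0.7, 0.6], [1.2, 0.4]]
mol_b = [[0.8, 0.45], [0.9, 1.1]]

scores = [cov_mat(m, threshold=0.5) for m in (mol_a, mol_b)]
covs = [c for c, _ in scores]
mats = [m for _, m in scores]

# Aggregate over molecules, mirroring the "mean" and "med" columns.
print("cov mean", mean(covs), "med", median(covs))
print("mat mean", mean(mats), "med", median(mats))
```

Under these definitions a higher cov and a lower mat are better, and both depend on the threshold and on how many conformers are sampled per molecule, so those settings need to match before two sets of numbers can be compared.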

DirectMolecularConfGen commented 2 years ago

Can you paste your training and test commands? At a glance, your test command differs from the one in our readme.md, e.g., in num_layers and sample_beta. Moreover, checkpoint_35 is too early to assess model performance. We train for 100 epochs and test all models with "checkpoint_94.pt".

danielm322 commented 2 years ago

I tried another experiment using the auxiliary loss, so I trained with the following command:

python train.py --dropout 0.1 --use-bn --use-adamw --lr-warmup --enable-tb \
    --aux-loss 0 --num-layers 3 --lr 8e-5 --batch-size 64 --vae-beta-min 0.0001 \
    --vae-beta-max 0.01 --latent-size 256 --mlp-hidden-size 512 \
    --reuse-prior --node-attn --data-split confgf --dataset-name drugs --shared-output \
    --pred-pos-residual --base-path ./dataset/drugs_processed --rand-aug \
    --checkpoint-dir checkpoint --use-global --global-attn --extend-edge \
    --grad-norm 2 --remove-hs --ang-lam 0.1 --bond-lam 0.1

The loss curves can be seen here:

[Image: Screenshot 2022-04-19 at 09 52 43]

After inspecting the train, test, and validation curves, I found that the global minimum on the test set was at checkpoint 27 and the third local minimum was at epoch 87, so I ran the following command:

python evaluate.py --dropout 0.1 --use-bn --lr-warmup --use-adamw \
    --latent-size 256 --mlp-hidden-size 512 \
    --num-layers 3 --eval-from ./checkpoint/checkpoint_87.pt --workers 20 --batch-size 64 \
    --reuse-prior --node-attn --data-split confgf --dataset-name drugs --remove-hs \
    --shared-output --pred-pos-residual --sample-beta 1.2 --base-path ./dataset/drugs_processed \
    --use-ff --global-attn --extend-edge

With the results:

cov mean 0.7548755335070557 med 0.8074606116774792
mat mean 0.9960401058197021 med 0.9822216629981995

At checkpoint 27 the results are:

cov mean 0.796436271909601 med 0.8544973544973544
mat mean 0.9489984512329102 med 0.9455164670944214

To repeat, I am trying to reproduce the small-scale results here, which is why num_layers is 3. I also reduced the learning rate because the batch size is smaller due to GPU capacity. The learning rate (8e-5) is slightly less than half of yours (2e-4): in my previous experiment, the loss curves gave me the impression that 1e-4 might have been too large. The same seems to be the case here, since the test-set loss plateaus after approximately the first 30 epochs.
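The "half batch size, half learning rate" adjustment described in this thread corresponds to the linear scaling heuristic, sketched below. This is only an illustration: the helper name `scale_lr` is hypothetical, the reference batch size of 128 is inferred from "half batch size" together with `batch_size=64`, and nothing in this thread confirms that the authors recommend this rule.

```python
def scale_lr(base_lr, base_batch_size, new_batch_size):
    """Scale the learning rate in proportion to the batch size."""
    return base_lr * new_batch_size / base_batch_size


# Reference values: lr 2e-4 is quoted in the thread; batch size 128 is inferred.
lr = scale_lr(base_lr=2e-4, base_batch_size=128, new_batch_size=64)
print(f"scaled learning rate: {lr:.1e}")
```

With these inputs the heuristic gives 1e-4, the value used in the first experiment; the 8e-5 used in the second run is a further manual reduction below that.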

There are also some parameters whose correct settings I couldn't find in the paper or the supplementary materials, for example --aux-loss, --extend-edge, --global-attn, --shared-decoder, --clamp-dist, --sg-pos, --grad-norm, and --use-ss. I'm not sure I'm using the correct configuration for them.

Thanks in advance for the help!

danielm322 commented 2 years ago

Some help, please? I am interested in extending this method to include other capabilities, but first I want to make sure I am using it correctly.

zytzrh commented 2 years ago

Hi Daniel, have you tried other experiments (e.g., large scale)? I'm also working on reproducing some of their results.

DirectMolecularConfGen commented 2 years ago

@danielm322 @zytzrh Please see our updated README and compare your configurations with our provided logs (specifically, the row starting with "Namespace") to reproduce our results.
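One way to do this comparison is to parse both "Namespace" rows and print only the keys that differ. The sketch below is one possible approach using Python's standard `ast` module; the helper names `parse_namespace` and `diff_configs` are hypothetical, the first line is a shortened excerpt of the Namespace posted earlier in this thread, and the second line holds placeholder values, not the contents of the actual provided logs.

```python
import ast


def parse_namespace(line):
    """Parse a printed argparse 'Namespace(key=value, ...)' row into a dict."""
    call = ast.parse(line.strip(), mode="eval").body
    is_namespace = (
        isinstance(call, ast.Call)
        and getattr(call.func, "id", None) == "Namespace"
    )
    if not is_namespace:
        raise ValueError("expected a row of the form Namespace(key=value, ...)")
    # literal_eval only accepts plain literals (numbers, strings, None, booleans).
    return {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}


def diff_configs(mine, reference):
    """Return {key: (my_value, reference_value)} for every key that differs."""
    missing = "<missing>"
    keys = sorted(set(mine) | set(reference))
    return {
        k: (mine.get(k, missing), reference.get(k, missing))
        for k in keys
        if mine.get(k, missing) != reference.get(k, missing)
    }


# Excerpt of the Namespace posted in this thread.
mine = parse_namespace(
    "Namespace(batch_size=64, lr=0.0001, num_layers=3, sample_beta=1.0)"
)
# Placeholder reference values for illustration only.
reference = parse_namespace(
    "Namespace(batch_size=128, lr=0.0002, num_layers=3, sample_beta=1.2)"
)

for key, (my_value, ref_value) in diff_configs(mine, reference).items():
    print(f"{key}: mine={my_value!r} reference={ref_value!r}")
```

Parsing with `ast.literal_eval` rather than `eval` keeps the comparison safe to run on rows copied from untrusted logs, at the cost of rejecting values that are not plain literals.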
