Minys233 / Dynaformer

MIT License

Public access not permitted to access pdbbind files #3

Closed. osession closed this issue 1 year ago

osession commented 1 year ago

Hi,

I'm trying to run the run_evaluate.sh script, but I'm hitting this error: urllib.error.HTTPError: HTTP Error 409: Public access is not permitted on this storage account. It is raised from this script: Dynaformer/dynaformer/data/pyg_datasets/pyg_dataset_lookup_table.py

I'm unable to access this URL: https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/%7B%7D.zip

Is there another location where I can access the pdbbind files that are used for this model?

Minys233 commented 1 year ago

%7B%7D.zip is {}.zip after URL decoding. In pyg_dataset_lookup_table.py, the file name is filled in from the dataset name passed in the parameter. (at this line)
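
The decoding and formatting described above can be reproduced with a minimal sketch like the following. It uses only the standard library; the helper name dataset_url is hypothetical and is not a function from the repository.

```python
from urllib.parse import unquote

# Template URL as it appeared in the error report; "%7B%7D" is the
# percent-encoded form of the "{}" placeholder.
encoded = "https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/%7B%7D.zip"
template = unquote(encoded)


def dataset_url(set_name: str) -> str:
    """Fill the "{}" placeholder with a set name (hypothetical helper)."""
    return template.format(set_name)


print(template)
print(dataset_url("refined-set-2019-coreset-2016"))
```

Seeing %7B%7D in a URL therefore only means the unformatted template was inspected, not that the script requested a file literally named {}.zip.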

The available URLs are listed below, which means the dataset parameter can be pdbbind:set_name=filenames_below_without_extension,cutoffs=5-5-5,seed=0. For evaluation, however, you only need coreset 2013 and coreset 2016.

https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/general-set-2019-coreset-2013.zip
https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/general-set-2019-coreset-2016.zip
https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/general-set-2020-coreset-2013.zip
https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/general-set-2020-coreset-2016.zip
https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/refined-set-2019-coreset-2013.zip
https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/refined-set-2019-coreset-2016.zip
https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/refined-set-2020-coreset-2013.zip
https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/refined-set-2020-coreset-2016.zip
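
To make the shape of the dataset parameter concrete, here is a small sketch that splits such a string into its name and options. The function parse_dataset_spec is an illustrative assumption; the repository's own parsing in pyg_dataset_lookup_table.py may work differently.

```python
def parse_dataset_spec(spec: str) -> tuple[str, dict[str, str]]:
    """Split "name:key=value,key=value" into (name, options).

    Hypothetical helper for illustration only; values stay as strings.
    """
    name, _, params = spec.partition(":")
    options = {}
    for item in params.split(","):
        if item:
            key, _, value = item.partition("=")
            options[key] = value
    return name, options


name, options = parse_dataset_spec(
    "pdbbind:set_name=refined-set-2019-coreset-2016,cutoffs=5-5-5,seed=0"
)
print(name)
print(options)
```

The set_name value is the part that ends up in the download URL, so it has to match one of the file names listed above without the .zip extension.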

It's strange, since by default simply running run_evaluate.sh invokes bash Dynaformer/examples/evaluate/evaluate.sh, and that script should run evaluate.py with the default parameters. The parameters should also be printed to the terminal:

https://github.com/Minys233/dynaformer_model/blob/c9942c389e545a5f43f0834031ce36034cb9b343/examples/evaluate/evaluate.sh#L3-L10

This will log something like this:

echo Running path/to/evaluate.py
echo Finding checkpoint files in: /path/to/checkpoint/dir
echo Using dataset: dataset name
echo Save downloaded data to /path/to/dataset/save/dir
echo Save csv with suffix: output csv suffix

Maybe you could check the logs and the parameter values in Dynaformer/examples/evaluate/evaluate.sh. I will try again myself, and if it still does not work for you, please paste the logs and error messages.

PS: The evaluation results of the top 3 models on CASF2013 and CASF2016 are also stored in Dynaformer-D&R.zip, which you can download using the link in README.md. If you just want results for comparison, you can simply use those.

osession commented 1 year ago

I think the script is getting the correct links to the dataset files, but the files cannot be accessed. When I paste any of the URLs you sent into a browser (for example https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/general-set-2019-coreset-2013.zip), I get a page with this message:

This XML file does not appear to have any style information associated with it. The document tree is shown below.

PublicAccessNotPermitted
Public access is not permitted on this storage account.
RequestId:b61e2390-101e-005d-173a-b3a99d000000
Time:2023-07-10T14:28:19.3720302Z

osession commented 1 year ago

Here is the full output of running ./run_evaluate.sh

Running /home/ray/default/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py
Finding checkpoint files in: /home/ray/default/checkpoint
Using dataset: pdbbind:set_name=refined-set-2019-coreset-2016,cutoffs=5-5-5,seed=0
Save downloaded data to /home/ray/default/data
Save csv with suffix: _CASF2016
2023-07-10 07:37:21 | WARNING | root | The OGB package is out of date. Your version is 1.3.2, while the latest version is 1.3.6.
DGL backend not selected or invalid. Assuming PyTorch for now.
Setting the default backend to "pytorch". You can change it in the ~/.dgl/config.json file or export the DGLBACKEND environment variable. Valid options are: pytorch, mxnet, tensorflow (all lowercase)
Using backend: pytorch
Root at /home/ray/default/data
Downloading https://ml2md.blob.core.windows.net/yaosen-data/datasets/pdbbind/refined-set-2019-coreset-2016.zip
Traceback (most recent call last):
  File "/home/ray/default/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py", line 149, in <module>
    main()
  File "/home/ray/default/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py", line 139, in main
    task = tasks.setup_task(cfg.task)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/fairseq/tasks/__init__.py", line 46, in setup_task
    return task.setup_task(cfg, **kwargs)
  File "/home/ray/default/Dynaformer/dynaformer/tasks/graph_prediction.py", line 185, in setup_task
    return cls(cfg)
  File "/home/ray/default/Dynaformer/dynaformer/tasks/graph_prediction.py", line 284, in __init__
    super().__init__(cfg)
  File "/home/ray/default/Dynaformer/dynaformer/tasks/graph_prediction.py", line 160, in __init__
    self.dm = GraphormerDataset(
  File "/home/ray/default/Dynaformer/dynaformer/data/dataset.py", line 79, in __init__
    self.dataset = PYGDatasetLookupTable.GetPYGDataset(dataset_spec, data_path=data_path, seed=seed)
  File "/home/ray/default/Dynaformer/dynaformer/data/pyg_datasets/pyg_dataset_lookup_table.py", line 275, in GetPYGDataset
    train_set = pdbbind_helper(root, set_name=set_name, cutoffs=cutoffs, split="train", seed=seed)
  File "/home/ray/default/Dynaformer/dynaformer/data/pyg_datasets/pyg_dataset_lookup_table.py", line 183, in pdbbind_helper
    return PDBBind(root, set_name, *args, **kwargs)
  File "/home/ray/default/Dynaformer/dynaformer/data/pyg_datasets/pyg_dataset_lookup_table.py", line 109, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/torch_geometric/data/in_memory_dataset.py", line 60, in __init__
    super().__init__(root, transform, pre_transform, pre_filter)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/torch_geometric/data/dataset.py", line 83, in __init__
    self._download()
  File "/home/ray/anaconda3/lib/python3.9/site-packages/torch_geometric/data/dataset.py", line 140, in _download
    self.download()
  File "/home/ray/default/Dynaformer/dynaformer/data/pyg_datasets/pyg_dataset_lookup_table.py", line 142, in download
    path = download_url(self.url.format(self.set_name), self.root)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/torch_geometric/data/download.py", line 34, in download_url
    data = urllib.request.urlopen(url, context=context)
  File "/home/ray/anaconda3/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/home/ray/anaconda3/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/home/ray/anaconda3/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/home/ray/anaconda3/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/home/ray/anaconda3/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/home/ray/anaconda3/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 409: Public access is not permitted on this storage account.

Minys233 commented 1 year ago

Thank you! After testing, I found that the permissions of the original Azure blob storage have been changed to private. I don't know why; Microsoft's Graphormer repo is also affected, and none of its models can be downloaded.

I have transferred the data files to my own Azure blob storage; the links should now be:

https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/general-set-2019-coreset-2013.zip
https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/general-set-2019-coreset-2016.zip
https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/general-set-2020-coreset-2013.zip
https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/general-set-2020-coreset-2016.zip
https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/refined-set-2019-coreset-2013.zip
https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/refined-set-2019-coreset-2016.zip
https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/refined-set-2020-coreset-2013.zip
https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/refined-set-2020-coreset-2016.zip
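
The eight links above follow one naming pattern, so they can be generated rather than typed by hand. This is a sketch for anyone scripting a manual download; the names BASE, set_names, and urls are illustrative and do not come from the repository.

```python
from itertools import product

# Base URL copied from the new links above, with "{}" as the set-name slot.
BASE = "https://scientificdata.blob.core.windows.net/dynaformer/dataset/pdbbind/{}.zip"

# Every combination of subset (general/refined), PDBbind year, and CASF core set.
set_names = [
    f"{subset}-set-{year}-coreset-{core}"
    for subset, year, core in product(("general", "refined"), (2019, 2020), (2013, 2016))
]
urls = [BASE.format(name) for name in set_names]

for url in urls:
    print(url)
```

The entries in set_names are also the values accepted by set_name in the dataset parameter.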

I have modified the code and changed the related files, including:

Dynafomer/Dynaformer/install.sh: fixed an AttributeError caused by a numpy version mismatch.
Dynafomer/Dynaformer/dynaformer/data/pyg_datasets/pyg_dataset_lookup_table.py: changed the URLs to my blob storage.
run_evaluate.sh: specified the arguments explicitly and added the evaluation for CASF-2013.

Now, following the steps in README.md, I get the following logs when running bash run_evaluate.sh:

bash run_evaluate.sh
Running /home/yaosen/Dynafomer/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py
Finding checkpoint files in: /home/yaosen/Dynafomer/checkpoint
Using dataset: pdbbind:set_name=refined-set-2019-coreset-2016,cutoffs=5-5-5,seed=0
Save downloaded data to /home/yaosen/Dynafomer/data
Save csv with suffix: _CASF2016
2023-07-11 16:02:53 | WARNING | root | The OGB package is out of date. Your version is 1.3.2, while the latest version is 1.3.6.
Using backend: pytorch
Root at /home/yaosen/Dynafomer/data
2023-07-11 16:02:55 | INFO | dynaformer.models.graphormer | (Full argument list)
2023-07-11 16:02:55 | INFO | __main__ | evaluating checkpoint file /home/yaosen/Dynafomer/checkpoint/model2.pt
2023-07-11 16:03:00 | INFO | dynaformer.tasks.graph_prediction | Loaded test with #samples: 285
285 285 285 285
save results to /home/yaosen/Dynafomer/checkpoint/model2_CASF2016.csv
2023-07-11 16:03:07 | INFO | __main__ | pearson_r: 0.8588312268257141
2023-07-11 16:03:07 | INFO | __main__ | r2: 0.736440122127533
2023-07-11 16:03:07 | INFO | __main__ | mae: 0.8379938493695176
2023-07-11 16:03:07 | INFO | __main__ | mse: 1.2416719198226929
2023-07-11 16:03:07 | INFO | __main__ | mape: 0.15006107091903687
2023-07-11 16:03:07 | INFO | __main__ | smape: 0.13852134346961975
2023-07-11 16:03:07 | INFO | __main__ | evaluating checkpoint file /home/yaosen/Dynafomer/checkpoint/model1.pt
2023-07-11 16:03:08 | INFO | dynaformer.tasks.graph_prediction | Loaded test with #samples: 285
285 285 285 285
save results to /home/yaosen/Dynafomer/checkpoint/model1_CASF2016.csv
2023-07-11 16:03:15 | INFO | __main__ | pearson_r: 0.851351261138916
2023-07-11 16:03:15 | INFO | __main__ | r2: 0.7237889766693115
2023-07-11 16:03:15 | INFO | __main__ | mae: 0.8589531823208457
2023-07-11 16:03:15 | INFO | __main__ | mse: 1.301273226737976
2023-07-11 16:03:15 | INFO | __main__ | mape: 0.15483413636684418
2023-07-11 16:03:15 | INFO | __main__ | smape: 0.1410219669342041
2023-07-11 16:03:15 | INFO | __main__ | evaluating checkpoint file /home/yaosen/Dynafomer/checkpoint/model3.pt
2023-07-11 16:03:15 | INFO | dynaformer.tasks.graph_prediction | Loaded test with #samples: 285
285 285 285 285
save results to /home/yaosen/Dynafomer/checkpoint/model3_CASF2016.csv
2023-07-11 16:03:22 | INFO | __main__ | pearson_r: 0.8570905327796936
2023-07-11 16:03:22 | INFO | __main__ | r2: 0.7310564517974854
2023-07-11 16:03:22 | INFO | __main__ | mae: 0.8447534402211507
2023-07-11 16:03:22 | INFO | __main__ | mse: 1.2670351266860962
2023-07-11 16:03:22 | INFO | __main__ | mape: 0.1493150293827057
2023-07-11 16:03:22 | INFO | __main__ | smape: 0.13853947818279266
Running /home/yaosen/Dynafomer/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py
Finding checkpoint files in: /home/yaosen/Dynafomer/checkpoint
Using dataset: pdbbind:set_name=refined-set-2019-coreset-2013,cutoffs=5-5-5,seed=0
Save downloaded data to /home/yaosen/Dynafomer/data
Save csv with suffix: _CASF2013
2023-07-11 16:03:26 | WARNING | root | The OGB package is out of date. Your version is 1.3.2, while the latest version is 1.3.6.
Using backend: pytorch
Root at /home/yaosen/Dynafomer/data
2023-07-11 16:03:29 | INFO | dynaformer.models.graphormer | (Full argument list)
2023-07-11 16:03:29 | INFO | __main__ | evaluating checkpoint file /home/yaosen/Dynafomer/checkpoint/model2.pt
2023-07-11 16:03:33 | INFO | dynaformer.tasks.graph_prediction | Loaded test with #samples: 195
195 195 195 195
save results to /home/yaosen/Dynafomer/checkpoint/model2_CASF2013.csv
2023-07-11 16:03:39 | INFO | __main__ | pearson_r: 0.8627753853797913
2023-07-11 16:03:39 | INFO | __main__ | r2: 0.7440766096115112
2023-07-11 16:03:39 | INFO | __main__ | mae: 0.7362764933170416
2023-07-11 16:03:39 | INFO | __main__ | mse: 1.2900011539459229
2023-07-11 16:03:39 | INFO | __main__ | mape: 0.14594368636608124
2023-07-11 16:03:39 | INFO | __main__ | smape: 0.12844964861869812
2023-07-11 16:03:39 | INFO | __main__ | evaluating checkpoint file /home/yaosen/Dynafomer/checkpoint/model1.pt
2023-07-11 16:03:40 | INFO | dynaformer.tasks.graph_prediction | Loaded test with #samples: 195
195 195 195 195
save results to /home/yaosen/Dynafomer/checkpoint/model1_CASF2013.csv
2023-07-11 16:03:46 | INFO | __main__ | pearson_r: 0.8554542064666748
2023-07-11 16:03:46 | INFO | __main__ | r2: 0.7304592132568359
2023-07-11 16:03:46 | INFO | __main__ | mae: 0.7595089582296518
2023-07-11 16:03:46 | INFO | __main__ | mse: 1.3586406707763672
2023-07-11 16:03:46 | INFO | __main__ | mape: 0.15120257437229156
2023-07-11 16:03:46 | INFO | __main__ | smape: 0.13164567947387695
2023-07-11 16:03:46 | INFO | __main__ | evaluating checkpoint file /home/yaosen/Dynafomer/checkpoint/model3.pt
2023-07-11 16:03:46 | INFO | dynaformer.tasks.graph_prediction | Loaded test with #samples: 195
195 195 195 195
save results to /home/yaosen/Dynafomer/checkpoint/model3_CASF2013.csv
2023-07-11 16:03:52 | INFO | __main__ | pearson_r: 0.8651919960975647
2023-07-11 16:03:52 | INFO | __main__ | r2: 0.7483381032943726
2023-07-11 16:03:52 | INFO | __main__ | mae: 0.7461391962491549
2023-07-11 16:03:52 | INFO | __main__ | mse: 1.2685205936431885
2023-07-11 16:03:52 | INFO | __main__ | mape: 0.1473449319601059
2023-07-11 16:03:52 | INFO | __main__ | smape: 0.13067910075187683

So please clone the code again and follow README.md (exactly the same steps); I believe these problems should not occur this time. I'm very sorry for the trouble, and if you have further questions, please don't hesitate to post them.
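
For comparing a local run against the log above, the per-checkpoint metrics can be pulled out of the text with a small parser. This is only a sketch: the regular expressions assume the exact line layout shown in the log, collect_metrics is a hypothetical helper, and it handles one evaluation run at a time, since the same checkpoint path appears again in the CASF2013 run.

```python
import re

# Patterns assume the log layout shown in this thread.
CHECKPOINT = re.compile(r"evaluating checkpoint file (\S+)$")
METRIC = re.compile(r"\| INFO \| __main__ \| (\w+): ([-+0-9.eE]+)$")


def collect_metrics(lines):
    """Map each checkpoint path to a {metric_name: value} dict."""
    results = {}
    current = None
    for line in lines:
        line = line.strip()
        match = CHECKPOINT.search(line)
        if match:
            current = match.group(1)
            results[current] = {}
            continue
        match = METRIC.search(line)
        if match and current is not None:
            results[current][match.group(1)] = float(match.group(2))
    return results


# A short excerpt copied from the log above.
sample = """\
2023-07-11 16:02:55 | INFO | __main__ | evaluating checkpoint file /home/yaosen/Dynafomer/checkpoint/model2.pt
2023-07-11 16:03:00 | INFO | dynaformer.tasks.graph_prediction | Loaded test with #samples: 285
2023-07-11 16:03:07 | INFO | __main__ | pearson_r: 0.8588312268257141
2023-07-11 16:03:07 | INFO | __main__ | mae: 0.8379938493695176
""".splitlines()

metrics = collect_metrics(sample)
print(metrics)
```

An empty result from a real log means no "evaluating checkpoint file" line was printed, which is itself a useful signal that no checkpoints were found.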

osession commented 1 year ago

Thank you for updating those files! I cloned the code and ran bash run_evaluate.sh again and it was able to download all the data this time, but now it is not giving the expected output or saving the .csv files. This is what I'm getting:

bash run_evaluate.sh
Running /home/ray/default/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py
Finding checkpoint files in: /home/ray/default/checkpoint
Using dataset: pdbbind:set_name=refined-set-2019-coreset-2016,cutoffs=5-5-5,seed=0
Save downloaded data to /home/ray/default/data
Save csv with suffix: _CASF2016
2023-07-11 11:11:48 | WARNING | root | The OGB package is out of date. Your version is 1.3.2, while the latest version is 1.3.6.
Using backend: pytorch
Root at /home/ray/default/data
2023-07-11 11:11:52 | INFO | dynaformer.models.graphormer | Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=True, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir='/home/ray/default/Dynaformer/dynaformer', empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='l2_loss_with_flag', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='graph_prediction_with_flag', num_workers=16, skip_invalid_size_inputs_valid_test=False, max_tokens=None, batch_size=1, required_batch_size_multiple=8, required_seq_len_multiple=1, dataset_impl=None, data_buffer_size=20, train_subset='train', valid_subset='valid', combine_valid_subsets=None, ignore_unused_valid_subsets=False, validate_interval=1, validate_interval_updates=0, validate_after_updates=0, fixed_validation_seed=None, disable_validation=False, max_tokens_valid=None, batch_size_valid=1, max_valid_steps=None, curriculum=0, gen_subset='test', num_shards=1, shard_id=0, grouped_shuffling=False, update_epoch_batch_itr=False, update_ordered_indices_seed=False, distributed_world_size=1, distributed_num_procs=1, distributed_rank=0, distributed_backend='nccl', distributed_init_method=None, distributed_port=-1, device_id=0, distributed_no_spawn=False, ddp_backend='legacy_ddp', ddp_comm_hook='none', bucket_cap_mb=25, fix_batches_to_gpus=False, find_unused_parameters=False, gradient_as_bucket_view=False, fast_stat_sync=False, 
heartbeat_timeout=-1, broadcast_buffers=False, slowmo_momentum=None, slowmo_base_algorithm='localsgd', localsgd_frequency=3, nprocs_per_node=1, pipeline_model_parallel=False, pipeline_balance=None, pipeline_devices=None, pipeline_chunks=0, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_checkpoint='never', zero_sharding='none', no_reshard_after_forward=False, fp32_reduce_scatter=False, cpu_offload=False, use_sharded_state=False, not_fsdp_flatten_parameters=False, arch='graphormer_base', max_epoch=0, max_update=0, stop_time_hours=0, clip_norm=0.0, sentence_avg=False, update_freq=[1], lr=[0.25], stop_min_lr=-1.0, use_bmuf=False, skip_remainder_batch=False, save_dir='/home/ray/default/checkpoint', restore_file='checkpoint_last.pt', finetune_from_model=None, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, optimizer_overrides='{}', save_interval=1, save_interval_updates=0, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, keep_best_checkpoints=-1, no_save=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_save_optimizer_state=False, best_checkpoint_metric='loss', maximize_best_checkpoint_metric=False, patience=-1, checkpoint_suffix='', checkpoint_shard_count=1, load_checkpoint_on_all_dp_ranks=False, write_checkpoints_asynchronously=False, store_ema=False, ema_decay=0.9999, ema_start_update=0, ema_seed_model=None, ema_update_freq=1, ema_fp32=False, split='test', suffix='_CASF2016', layerdrop=0.0, embed_scale=-1.0, sandwich_ln=False, dist_head='gbf3d', num_dist_head_kernel=256, num_edge_types=16384, sample_weight_estimator=False, sample_weight_estimator_pat='pdbbind', fingerprint=True, dataset_name='pdbbind:set_name=refined-set-2019-coreset-2016,cutoffs=5-5-5,seed=0', num_classes=1, max_nodes=600, dataset_source='pyg', num_atoms=4608, num_edges=1536, num_in_degree=512, num_out_degree=512, 
num_spatial=512, num_edge_dis=128, multi_hop_max_dist=5, spatial_pos_max=1024, edge_type='multi_hop', pretrained_model_name='none', load_pretrained_model_output_layer=False, train_epoch_shuffle=True, user_data_dir='', data_path='/home/ray/default/data', flag_m=3, flag_step_size=0.001, flag_mag=0.001, force_anneal=None, lr_shrink=0.1, warmup_updates=0, pad=1, eos=2, unk=3, encoder_layers=4, encoder_attention_heads=32, encoder_embed_dim=512, encoder_ffn_embed_dim=512, no_seed_provided=False, activation_fn='gelu', encoder_normalize_before=True, apply_graphormer_init=False, share_encoder_input_output_embed=False, no_token_positional_embeddings=False, dropout=0.1, attention_dropout=0.1, act_dropout=0.0, _name='graphormer_base')
Running /home/ray/default/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py
Finding checkpoint files in: /home/ray/default/checkpoint
Using dataset: pdbbind:set_name=refined-set-2019-coreset-2013,cutoffs=5-5-5,seed=0
Save downloaded data to /home/ray/default/data
Save csv with suffix: _CASF2013
2023-07-11 11:11:56 | WARNING | root | The OGB package is out of date. Your version is 1.3.2, while the latest version is 1.3.6.
Using backend: pytorch
Root at /home/ray/default/data
2023-07-11 11:11:59 | INFO | dynaformer.models.graphormer | Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=True, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir='/home/ray/default/Dynaformer/dynaformer', empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='l2_loss_with_flag', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='graph_prediction_with_flag', num_workers=16, skip_invalid_size_inputs_valid_test=False, max_tokens=None, batch_size=1, required_batch_size_multiple=8, required_seq_len_multiple=1, dataset_impl=None, data_buffer_size=20, train_subset='train', valid_subset='valid', combine_valid_subsets=None, ignore_unused_valid_subsets=False, validate_interval=1, validate_interval_updates=0, validate_after_updates=0, fixed_validation_seed=None, disable_validation=False, max_tokens_valid=None, batch_size_valid=1, max_valid_steps=None, curriculum=0, gen_subset='test', num_shards=1, shard_id=0, grouped_shuffling=False, update_epoch_batch_itr=False, update_ordered_indices_seed=False, distributed_world_size=1, distributed_num_procs=1, distributed_rank=0, distributed_backend='nccl', distributed_init_method=None, distributed_port=-1, device_id=0, distributed_no_spawn=False, ddp_backend='legacy_ddp', ddp_comm_hook='none', bucket_cap_mb=25, fix_batches_to_gpus=False, find_unused_parameters=False, gradient_as_bucket_view=False, fast_stat_sync=False, 
heartbeat_timeout=-1, broadcast_buffers=False, slowmo_momentum=None, slowmo_base_algorithm='localsgd', localsgd_frequency=3, nprocs_per_node=1, pipeline_model_parallel=False, pipeline_balance=None, pipeline_devices=None, pipeline_chunks=0, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_checkpoint='never', zero_sharding='none', no_reshard_after_forward=False, fp32_reduce_scatter=False, cpu_offload=False, use_sharded_state=False, not_fsdp_flatten_parameters=False, arch='graphormer_base', max_epoch=0, max_update=0, stop_time_hours=0, clip_norm=0.0, sentence_avg=False, update_freq=[1], lr=[0.25], stop_min_lr=-1.0, use_bmuf=False, skip_remainder_batch=False, save_dir='/home/ray/default/checkpoint', restore_file='checkpoint_last.pt', finetune_from_model=None, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, optimizer_overrides='{}', save_interval=1, save_interval_updates=0, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, keep_best_checkpoints=-1, no_save=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_save_optimizer_state=False, best_checkpoint_metric='loss', maximize_best_checkpoint_metric=False, patience=-1, checkpoint_suffix='', checkpoint_shard_count=1, load_checkpoint_on_all_dp_ranks=False, write_checkpoints_asynchronously=False, store_ema=False, ema_decay=0.9999, ema_start_update=0, ema_seed_model=None, ema_update_freq=1, ema_fp32=False, split='test', suffix='_CASF2013', layerdrop=0.0, embed_scale=-1.0, sandwich_ln=False, dist_head='gbf3d', num_dist_head_kernel=256, num_edge_types=16384, sample_weight_estimator=False, sample_weight_estimator_pat='pdbbind', fingerprint=True, dataset_name='pdbbind:set_name=refined-set-2019-coreset-2013,cutoffs=5-5-5,seed=0', num_classes=1, max_nodes=600, dataset_source='pyg', num_atoms=4608, num_edges=1536, num_in_degree=512, num_out_degree=512, 
num_spatial=512, num_edge_dis=128, multi_hop_max_dist=5, spatial_pos_max=1024, edge_type='multi_hop', pretrained_model_name='none', load_pretrained_model_output_layer=False, train_epoch_shuffle=True, user_data_dir='', data_path='/home/ray/default/data', flag_m=3, flag_step_size=0.001, flag_mag=0.001, force_anneal=None, lr_shrink=0.1, warmup_updates=0, pad=1, eos=2, unk=3, encoder_layers=4, encoder_attention_heads=32, encoder_embed_dim=512, encoder_ffn_embed_dim=512, no_seed_provided=False, activation_fn='gelu', encoder_normalize_before=True, apply_graphormer_init=False, share_encoder_input_output_embed=False, no_token_positional_embeddings=False, dropout=0.1, attention_dropout=0.1, act_dropout=0.0, _name='graphormer_base')

I'm getting a similar output when running ./run_custom_input.sh

./run_custom_input.sh 
Running /home/ray/default/Dynaformer/examples/evaluate/../../dynaformer/evaluate/evaluate.py
Finding checkpoint files in: /home/ray/default/checkpoint
Using dataset: custom:path=/home/ray/default/example_data/example.pkl
Save downloaded data to /home/ray/default/data
Save csv with suffix: _custom
2023-07-11 14:30:44 | WARNING | root | The OGB package is out of date. Your version is 1.3.2, while the latest version is 1.3.6.
Using backend: pytorch
Root at /home/ray/default/data
2023-07-11 14:30:47 | INFO | dynaformer.models.graphormer | Namespace(no_progress_bar=False, log_interval=100, log_format=None, log_file=None, tensorboard_logdir=None, wandb_project=None, azureml_logging=False, seed=1, cpu=False, tpu=False, bf16=False, memory_efficient_bf16=False, fp16=True, memory_efficient_fp16=False, fp16_no_flatten_grads=False, fp16_init_scale=128, fp16_scale_window=None, fp16_scale_tolerance=0.0, on_cpu_convert_precision=False, min_loss_scale=0.0001, threshold_loss_scale=None, amp=False, amp_batch_retries=2, amp_init_scale=128, amp_scale_window=None, user_dir='/home/ray/default/Dynaformer/dynaformer', empty_cache_freq=0, all_gather_list_size=16384, model_parallel_size=1, quantization_config_path=None, profile=False, reset_logging=False, suppress_crashes=False, use_plasma_view=False, plasma_path='/tmp/plasma', criterion='l2_loss_with_flag', tokenizer=None, bpe=None, optimizer=None, lr_scheduler='fixed', scoring='bleu', task='graph_prediction_with_flag', num_workers=16, skip_invalid_size_inputs_valid_test=False, max_tokens=None, batch_size=1, required_batch_size_multiple=8, required_seq_len_multiple=1, dataset_impl=None, data_buffer_size=20, train_subset='train', valid_subset='valid', combine_valid_subsets=None, ignore_unused_valid_subsets=False, validate_interval=1, validate_interval_updates=0, validate_after_updates=0, fixed_validation_seed=None, disable_validation=False, max_tokens_valid=None, batch_size_valid=1, max_valid_steps=None, curriculum=0, gen_subset='test', num_shards=1, shard_id=0, grouped_shuffling=False, update_epoch_batch_itr=False, update_ordered_indices_seed=False, distributed_world_size=1, distributed_num_procs=1, distributed_rank=0, distributed_backend='nccl', distributed_init_method=None, distributed_port=-1, device_id=0, distributed_no_spawn=False, ddp_backend='legacy_ddp', ddp_comm_hook='none', bucket_cap_mb=25, fix_batches_to_gpus=False, find_unused_parameters=False, gradient_as_bucket_view=False, fast_stat_sync=False, 
heartbeat_timeout=-1, broadcast_buffers=False, slowmo_momentum=None, slowmo_base_algorithm='localsgd', localsgd_frequency=3, nprocs_per_node=1, pipeline_model_parallel=False, pipeline_balance=None, pipeline_devices=None, pipeline_chunks=0, pipeline_encoder_balance=None, pipeline_encoder_devices=None, pipeline_decoder_balance=None, pipeline_decoder_devices=None, pipeline_checkpoint='never', zero_sharding='none', no_reshard_after_forward=False, fp32_reduce_scatter=False, cpu_offload=False, use_sharded_state=False, not_fsdp_flatten_parameters=False, arch='graphormer_base', max_epoch=0, max_update=0, stop_time_hours=0, clip_norm=0.0, sentence_avg=False, update_freq=[1], lr=[0.25], stop_min_lr=-1.0, use_bmuf=False, skip_remainder_batch=False, save_dir='/home/ray/default/checkpoint', restore_file='checkpoint_last.pt', finetune_from_model=None, reset_dataloader=False, reset_lr_scheduler=False, reset_meters=False, reset_optimizer=False, optimizer_overrides='{}', save_interval=1, save_interval_updates=0, keep_interval_updates=-1, keep_interval_updates_pattern=-1, keep_last_epochs=-1, keep_best_checkpoints=-1, no_save=False, no_epoch_checkpoints=False, no_last_checkpoints=False, no_save_optimizer_state=False, best_checkpoint_metric='loss', maximize_best_checkpoint_metric=False, patience=-1, checkpoint_suffix='', checkpoint_shard_count=1, load_checkpoint_on_all_dp_ranks=False, write_checkpoints_asynchronously=False, store_ema=False, ema_decay=0.9999, ema_start_update=0, ema_seed_model=None, ema_update_freq=1, ema_fp32=False, split='test', suffix='_custom', layerdrop=0.0, embed_scale=-1.0, sandwich_ln=False, dist_head='gbf3d', num_dist_head_kernel=256, num_edge_types=16384, sample_weight_estimator=False, sample_weight_estimator_pat='pdbbind', fingerprint=True, dataset_name='custom:path=/home/ray/default/example_data/example.pkl', num_classes=1, max_nodes=600, dataset_source='pyg', num_atoms=4608, num_edges=1536, num_in_degree=512, num_out_degree=512, num_spatial=512, 
num_edge_dis=128, multi_hop_max_dist=5, spatial_pos_max=1024, edge_type='multi_hop', pretrained_model_name='none', load_pretrained_model_output_layer=False, train_epoch_shuffle=True, user_data_dir='', data_path='/home/ray/default/data', flag_m=3, flag_step_size=0.001, flag_mag=0.001, force_anneal=None, lr_shrink=0.1, warmup_updates=0, pad=1, eos=2, unk=3, encoder_layers=4, encoder_attention_heads=32, encoder_embed_dim=512, encoder_ffn_embed_dim=512, no_seed_provided=False, activation_fn='gelu', encoder_normalize_before=True, apply_graphormer_init=False, share_encoder_input_output_embed=False, no_token_positional_embeddings=False, dropout=0.1, attention_dropout=0.1, act_dropout=0.0, _name='graphormer_base')

It's hard to tell what's going wrong, since no error is being reported.

Thank you so much for all your help!!

osession commented 1 year ago

I just realized my mistake: when I downloaded the checkpoint files, they ended up in a subdirectory within the checkpoint directory, and I just had to move them out. Now run_evaluate.sh runs perfectly!
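
For anyone who hits the same silent run, the fix described above can be scripted. The sketch below assumes that the evaluation only picks up .pt files sitting directly in the checkpoint directory, which is inferred from this thread and not checked against evaluate.py; flatten_checkpoints is a hypothetical helper.

```python
import shutil
import tempfile
from pathlib import Path


def flatten_checkpoints(checkpoint_dir, pattern="*.pt"):
    """Move nested checkpoint files up into checkpoint_dir.

    Files already at the top level are left alone, and an existing file
    with the same name is never overwritten. Returns the new paths.
    """
    root = Path(checkpoint_dir)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(root.rglob(pattern)):
        if path.parent == root:
            continue
        target = root / path.name
        if target.exists():
            continue
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


# Demo on a throwaway directory that mimics the nested layout.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "checkpoint" / "extracted"
    nested.mkdir(parents=True)
    (nested / "model1.pt").write_bytes(b"")
    moved = flatten_checkpoints(Path(tmp) / "checkpoint")
    print([p.name for p in moved])
```

Checking the "Finding checkpoint files in:" line of the log against the actual location of the .pt files is a quick way to spot this situation before moving anything.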