ncbi / BioRED


Requirements version problem! #1

Closed li-muz closed 10 months ago

li-muz commented 1 year ago

I ran into some problems while reproducing your paper. I am not sure whether they are caused by the versions of the installed packages, so could you provide an updated requirements.txt that pins the version of each package?

ptlai commented 1 year ago

Hi @li-muz ,

I apologize for the delay in responding. The following are my packages' versions:

transformers == 4.18.0
accelerate == 0.9.0
pandas == 1.1.5
numpy == 1.19.5
datasets == 2.3.2
sentencepiece != 0.1.92
protobuf == 3.19.4
spacy == 3.2.4
scispacy == 0.2.4
tensorflow-gpu == 2.6.2

If you get an error message, you can post it here. Thanks!
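If it helps, a quick way to compare an existing environment against these pins is to query the installed versions directly. The following is a minimal sketch (not part of the repo; package names and versions are the ones listed above) using pkg_resources, which is available on the Python 3.6 setups discussed in this thread:

import pkg_resources

# Versions reported above (sentencepiece is a "!= 0.1.92" constraint rather
# than an exact pin, so it is omitted here).
pins = {
    "transformers": "4.18.0",
    "accelerate": "0.9.0",
    "pandas": "1.1.5",
    "numpy": "1.19.5",
    "datasets": "2.3.2",
    "protobuf": "3.19.4",
    "spacy": "3.2.4",
    "scispacy": "0.2.4",
    "tensorflow-gpu": "2.6.2",
}

for name, wanted in pins.items():
    try:
        got = pkg_resources.get_distribution(name).version
        print(name, got, "OK" if got == wanted else "MISMATCH (expected %s)" % wanted)
    except pkg_resources.DistributionNotFound:
        print(name, "NOT INSTALLED (expected %s)" % wanted)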

berkekavak commented 1 year ago

@ptlai I tried with the versions that you used. This is the error I get:

_pickle.UnpicklingError: invalid load key, 'v'.
Traceback (most recent call last):
  File "src/utils/run_biored_eval.py", line 923, in <module>
    labels                   = labels)
  File "src/utils/run_biored_eval.py", line 884, in run_test_eval
    labels                   = labels)
  File "src/utils/run_biored_eval.py", line 189, in dump_pred_2_pubtator_file
    pmids = sorted(list(pmid_2_rel_pairs_dict.keys()), reverse=True)

Would love to get some help
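An `invalid load key, 'v'.` from torch.load usually means the file being unpickled is not a binary checkpoint at all but a text file whose first byte is 'v'; a pytorch_model.bin fetched without Git LFS is exactly that (a pointer file starting with "version https://git-lfs..."). A minimal check along these lines can rule that out (a sketch; the path is the local model folder discussed later in this thread):

import os

# Assumed local path; adjust to wherever the model was placed.
path = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin"

# Real BERT-base weights are hundreds of MB; an LFS pointer is ~130 bytes.
print("size:", os.path.getsize(path), "bytes")
with open(path, "rb") as f:
    head = f.read(64)
print("first bytes:", head)
if head.startswith(b"version https://git-lfs"):
    print("This is a Git LFS pointer, not the weights; re-download the real file (e.g. git lfs pull).")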

ptlai commented 1 year ago

Hi @berkekavak ,

We appreciate your interest in our work. The error message suggests that the prediction files were never generated, which then causes the crash during evaluation. Can you check whether you have the following files after scripts/run_biored_exp.sh finishes?

You should have seen an error message while "python src/run_biored_exp.py" was running inside "scripts/run_biored_exp.sh". For example, did you put the PubMedBERT model at "biored_re/"?

Po-Ting

berkekavak commented 1 year ago

Hi,

Thanks for the fast response. Unfortunately, the specified files are not created after running the script. The model is located under the microsoft folder, here: /Users/berkekavak/biored/biored_re/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract.

Here is the complete output:

(species3) ➜ biored_re git:(master) ✗ bash run_biored_exp.sh 0

in shell script task name: biored_all_mul
run_biored_exp.sh: line 6: 84714 Illegal instruction: 4 cuda_visible_devices=$cuda_visible_devices python src/run_biored_exp.py --task_name $task_name --train_file $in_data_dir/train.tsv --dev_file $in_data_dir/dev.tsv --test_file $in_data_dir/test.tsv --use_balanced_neg false --to_add_tag_as_special_token true --no_neg_for_train_dev $no_neg_for_train_dev --model_name_or_path "${pre_trained_model}" --output_dir out_model_${task_name} --num_train_epochs 10 --learning_rate 1e-5 --per_device_train_batch_size 16 --per_device_eval_batch_size 32 --do_train --do_predict --logging_steps 10 --evaluation_strategy steps --save_steps 10 --overwrite_output_dir --max_seq_length 512
cp: out_model_biored_all_mul/test_results.tsv: No such file or directory
in shell script task name: biored_novelty
run_biored_exp.sh: line 6: 84721 Illegal instruction: 4 cuda_visible_devices=$cuda_visible_devices python src/run_biored_exp.py --task_name $task_name --train_file $in_data_dir/train.tsv --dev_file $in_data_dir/dev.tsv --test_file $in_data_dir/test.tsv --use_balanced_neg false --to_add_tag_as_special_token true --no_neg_for_train_dev $no_neg_for_train_dev --model_name_or_path "${pre_trained_model}" --output_dir out_model_${task_name} --num_train_epochs 10 --learning_rate 1e-5 --per_device_train_batch_size 16 --per_device_eval_batch_size 32 --do_train --do_predict --logging_steps 10 --evaluation_strategy steps --save_steps 10 --overwrite_output_dir --max_seq_length 512
cp: out_model_biored_novelty/test_results.tsv: No such file or directory
Traceback (most recent call last):
  File "src/utils/run_biored_eval.py", line 923, in <module>
    labels                   = labels)
  File "src/utils/run_biored_eval.py", line 884, in run_test_eval
    labels                   = labels)
  File "src/utils/run_biored_eval.py", line 189, in dump_pred_2_pubtator_file
    pmids = sorted(list(pmid_2_rel_pairs_dict.keys()), reverse=True)
AttributeError: 'NoneType' object has no attribute 'keys'
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt

If you are available, could we do a short Zoom session? It would be a great help, since I have put a lot of work into running this experiment.

Best, Berke.

ptlai commented 1 year ago

Hi @berkekavak ,

The command below appears to have failed:

cuda_visible_devices=$cuda_visible_devices python src/run_biored_exp.py ...

Did you modify "run_biored_exp.sh"? If so, could you post it here? Thank you!

berkekavak commented 1 year ago

I tried the code on both UNIX (Mac) and Linux. I did not modify it, but I suspect the main issue is here:

I also attached the full error log

File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/torch/serialization.py", line 762, in _legacy_load magic_number = pickle_module.load(f, **pickle_load_args) _pickle.UnpicklingError: invalid load key, 'v'. cp: cannot stat 'out_model_biored_novelty/test_results.tsv': No such file or directory Traceback (most recent call last): File "src/utils/run_biored_eval.py", line 923, in labels = labels) File "src/utils/run_biored_eval.py", line 884, in run_test_eval labels = labels) File "src/utils/run_biored_eval.py", line 189, in dump_pred_2_pubtator_file pmids = sorted(list(pmid_2_rel_pairs_dict.keys()), reverse=True) AttributeError: 'NoneType' object has no attribute 'keys'

Help would be appreciated.

Berke.


(species) @.***:/mnt/c/Users/berke/Documents/boun/biored/biored_re$ bash run_biored_exp.sh 1
2023-01-25 02:16:41.862711: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-01-25 02:16:41.862837: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
[INFO|training_args.py:804] 2023-01-25 02:16:44,767 >> using logging_steps to initialize eval_steps to 10
[INFO|training_args.py:1023] 2023-01-25 02:16:44,767 >> PyTorch: setting up devices
[INFO|training_args.py:886] 2023-01-25 02:16:44,770 >> The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
[INFO|training_args_tf.py:189] 2023-01-25 02:16:44,771 >> Tensorflow: setting up strategy
2023-01-25 02:16:44.772490: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-01-25 02:16:44.772599: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-01-25 02:16:44.772671: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (DESKTOP-A21DDKP): /proc/driver/nvidia/version does not exist
2023-01-25 02:16:44.773617: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
01/25/2023 02:16:44 - INFO - main - n_replicas: 1, distributed training: False, 16-bits training: False
01/25/2023 02:16:44 - INFO - main - Training/evaluation parameters TFTrainingArguments( _n_gpu=0, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, debug=[], deepspeed=None, disable_tqdm=False, do_eval=True, do_predict=True, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=10, evaluation_strategy=IntervalStrategy.STEPS, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, gcp_project=None, gradient_accumulation_steps=1, gradient_checkpointing=False, greater_is_better=None, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_strategy=HubStrategy.EVERY_SAVE, hub_token=, ignore_data_skip=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=length, load_best_model_at_end=False, local_rank=-1, log_level=-1, log_level_replica=-1, log_on_each_node=True, logging_dir=out_model_biored_all_mul/runs/Jan25_02-16-44_DESKTOP-A21DDKP, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=10, logging_strategy=IntervalStrategy.STEPS, lr_scheduler_type=SchedulerType.LINEAR, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, no_cuda=False, num_train_epochs=10.0, optim=OptimizerNames.ADAMW_HF, output_dir=out_model_biored_all_mul, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=32, per_device_train_batch_size=16, poly_power=1.0, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, remove_unused_columns=True, report_to=['tensorboard'], resume_from_checkpoint=None, run_name=out_model_biored_all_mul, save_on_each_node=False, save_steps=10, save_strategy=IntervalStrategy.STEPS, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, tf32=None, tpu_metrics_debug=False, tpu_name=None, tpu_num_cores=None, tpu_zone=None, use_legacy_prediction_loop=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, xla=False, xpu_backend=None, )
[INFO|configuration_utils.py:652] 2023-01-25 02:16:44,785 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:44,786 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils_base.py:1698] 2023-01-25 02:16:44,787 >> Didn't find file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/tokenizer.json. We won't load it.
[INFO|tokenization_utils_base.py:1698] 2023-01-25 02:16:44,788 >> Didn't find file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/added_tokens.json. We won't load it.
[INFO|tokenization_utils_base.py:1698] 2023-01-25 02:16:44,788 >> Didn't find file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/special_tokens_map.json. We won't load it.
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:44,788 >> loading file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/vocab.txt
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:44,788 >> loading file None
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:44,788 >> loading file None
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:44,788 >> loading file None
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:44,788 >> loading file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/tokenizer_config.json
[INFO|configuration_utils.py:652] 2023-01-25 02:16:44,789 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:44,790 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils.py:425] 2023-01-25 02:16:44,843 >> Adding @ChemicalEntitySrc$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:44,843 >> Adding @ChemicalEntityTgt$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:44,844 >> Adding @DiseaseOrPhenotypicFeatureSrc$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:44,844 >> Adding @DiseaseOrPhenotypicFeatureTgt$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:44,844 >> Adding @GeneOrGeneProductSrc$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:44,844 >> Adding @GeneOrGeneProductTgt$ to the vocabulary
[WARNING|logging.py:279] 2023-01-25 02:16:44,844 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:652] 2023-01-25 02:16:44,845 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:44,846 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[WARNING|logging.py:279] 2023-01-25 02:16:44,880 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
=======================>label2id {'None': 0, 'Association': 1, 'Bind': 2, 'Comparison': 3, 'Conversion': 4, 'Cotreatment': 5, 'Drug_Interaction': 6, 'Negative_Correlation': 7, 'Positive_Correlation': 8}
=======================>positive_label
=======================>use_balanced_neg False
=======================>max_neg_scale 2
[INFO|configuration_utils.py:652] 2023-01-25 02:16:44,883 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:44,884 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "finetuning_task": "text-classification", "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "id2label": { "0": "None", "1": "Association", "2": "Bind", "3": "Comparison", "4": "Conversion", "5": "Cotreatment", "6": "Drug_Interaction", "7": "Negative_Correlation", "8": "Positive_Correlation" }, "initializer_range": 0.02, "intermediate_size": 3072, "label2id": { "Association": 1, "Bind": 2, "Comparison": 3, "Conversion": 4, "Cotreatment": 5, "Drug_Interaction": 6, "Negative_Correlation": 7, "None": 0, "Positive_Correlation": 8 }, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|modeling_tf_utils.py:1776] 2023-01-25 02:16:44,921 >> loading weights file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
[INFO|modeling_tf_pytorch_utils.py:119] 2023-01-25 02:16:45,133 >> Loading PyTorch weights from /mnt/c/Users/berke/Documents/boun/biored/biored_re/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
Traceback (most recent call last):
  File "src/run_biored_exp.py", line 795, in <module>
    main()
  File "src/run_biored_exp.py", line 624, in main
    cache_dir = model_args.cache_dir,
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 446, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/modeling_tf_utils.py", line 1796, in from_pretrained
    return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/modeling_tf_pytorch_utils.py", line 121, in load_pytorch_checkpoint_in_tf2_model
    pt_state_dict = torch.load(pt_path, map_location="cpu")
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/torch/serialization.py", line 762, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
cp: cannot stat 'out_model_biored_all_mul/test_results.tsv': No such file or directory
2023-01-25 02:16:48.427892: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-01-25 02:16:48.427976: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
[INFO|training_args.py:804] 2023-01-25 02:16:51,245 >> using logging_steps to initialize eval_steps to 10
[INFO|training_args.py:1023] 2023-01-25 02:16:51,246 >> PyTorch: setting up devices
[INFO|training_args.py:886] 2023-01-25 02:16:51,248 >> The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
[INFO|training_args_tf.py:189] 2023-01-25 02:16:51,249 >> Tensorflow: setting up strategy
2023-01-25 02:16:51.250884: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-01-25 02:16:51.250988: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-01-25 02:16:51.251045: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (DESKTOP-A21DDKP): /proc/driver/nvidia/version does not exist
2023-01-25 02:16:51.252089: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
01/25/2023 02:16:51 - INFO - main - n_replicas: 1, distributed training: False, 16-bits training: False
01/25/2023 02:16:51 - INFO - main - Training/evaluation parameters TFTrainingArguments( _n_gpu=0, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, debug=[], deepspeed=None, disable_tqdm=False, do_eval=True, do_predict=True, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=10, evaluation_strategy=IntervalStrategy.STEPS, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, gcp_project=None, gradient_accumulation_steps=1, gradient_checkpointing=False, greater_is_better=None, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_strategy=HubStrategy.EVERY_SAVE, hub_token=, ignore_data_skip=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=length, load_best_model_at_end=False, local_rank=-1, log_level=-1, log_level_replica=-1, log_on_each_node=True, logging_dir=out_model_biored_novelty/runs/Jan25_02-16-51_DESKTOP-A21DDKP, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=10, logging_strategy=IntervalStrategy.STEPS, lr_scheduler_type=SchedulerType.LINEAR, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, no_cuda=False, num_train_epochs=10.0, optim=OptimizerNames.ADAMW_HF, output_dir=out_model_biored_novelty, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=32, per_device_train_batch_size=16, poly_power=1.0, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, remove_unused_columns=True, report_to=['tensorboard'], resume_from_checkpoint=None, run_name=out_model_biored_novelty, save_on_each_node=False, save_steps=10, save_strategy=IntervalStrategy.STEPS, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, tf32=None, tpu_metrics_debug=False, tpu_name=None, tpu_num_cores=None, tpu_zone=None, use_legacy_prediction_loop=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, xla=False, xpu_backend=None, )
[INFO|configuration_utils.py:652] 2023-01-25 02:16:51,262 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:51,263 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils_base.py:1698] 2023-01-25 02:16:51,264 >> Didn't find file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/tokenizer.json. We won't load it.
[INFO|tokenization_utils_base.py:1698] 2023-01-25 02:16:51,264 >> Didn't find file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/added_tokens.json. We won't load it.
[INFO|tokenization_utils_base.py:1698] 2023-01-25 02:16:51,265 >> Didn't find file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/special_tokens_map.json. We won't load it.
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:51,265 >> loading file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/vocab.txt
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:51,265 >> loading file None
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:51,265 >> loading file None
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:51,265 >> loading file None
[INFO|tokenization_utils_base.py:1776] 2023-01-25 02:16:51,265 >> loading file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/tokenizer_config.json
[INFO|configuration_utils.py:652] 2023-01-25 02:16:51,266 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:51,267 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils.py:425] 2023-01-25 02:16:51,317 >> Adding @ChemicalEntitySrc$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:51,318 >> Adding @ChemicalEntityTgt$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:51,318 >> Adding @DiseaseOrPhenotypicFeatureSrc$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:51,318 >> Adding @DiseaseOrPhenotypicFeatureTgt$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:51,318 >> Adding @GeneOrGeneProductSrc$ to the vocabulary
[INFO|tokenization_utils.py:425] 2023-01-25 02:16:51,318 >> Adding @GeneOrGeneProductTgt$ to the vocabulary
[WARNING|logging.py:279] 2023-01-25 02:16:51,319 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:652] 2023-01-25 02:16:51,320 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:51,321 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[WARNING|logging.py:279] 2023-01-25 02:16:51,354 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
=======================>label2id {'None': 0, 'No': 1, 'Novel': 2}
=======================>positive_label
=======================>use_balanced_neg False
=======================>max_neg_scale 2
[INFO|configuration_utils.py:652] 2023-01-25 02:16:51,357 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:690] 2023-01-25 02:16:51,358 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "finetuning_task": "text-classification", "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "id2label": { "0": "None", "1": "No", "2": "Novel" }, "initializer_range": 0.02, "intermediate_size": 3072, "label2id": { "No": 1, "None": 0, "Novel": 2 }, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.18.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|modeling_tf_utils.py:1776] 2023-01-25 02:16:51,391 >> loading weights file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
[INFO|modeling_tf_pytorch_utils.py:119] 2023-01-25 02:16:51,609 >> Loading PyTorch weights from /mnt/c/Users/berke/Documents/boun/biored/biored_re/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
Traceback (most recent call last):
  File "src/run_biored_exp.py", line 795, in <module>
    main()
  File "src/run_biored_exp.py", line 624, in main
    cache_dir = model_args.cache_dir,
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 446, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/modeling_tf_utils.py", line 1796, in from_pretrained
    return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/modeling_tf_pytorch_utils.py", line 121, in load_pytorch_checkpoint_in_tf2_model
    pt_state_dict = torch.load(pt_path, map_location="cpu")
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/torch/serialization.py", line 593, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/torch/serialization.py", line 762, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
cp: cannot stat 'out_model_biored_novelty/test_results.tsv': No such file or directory
Traceback (most recent call last):
  File "src/utils/run_biored_eval.py", line 923, in <module>
    labels                   = labels)
  File "src/utils/run_biored_eval.py", line 884, in run_test_eval
    labels                   = labels)
  File "src/utils/run_biored_eval.py", line 189, in dump_pred_2_pubtator_file
    pmids = sorted(list(pmid_2_rel_pairs_dict.keys()), reverse=True)
AttributeError: 'NoneType' object has no attribute 'keys'
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt

ptlai commented 1 year ago

Hi @berkekavak ,

Thanks.

There is another problem I found:

Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory

You can try the following commands if you are unable to access the GPU:

conda install -c conda-forge cudatoolkit=11.1
conda install -c conda-forge cudnn=8.2.1

However, the GPU error does not appear to be the cause of the error below:

Traceback (most recent call last):
  File "src/run_biored_exp.py", line 795, in <module>
    main()
  File "src/run_biored_exp.py", line 624, in main
    cache_dir = model_args.cache_dir,
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 446, in from_pretrained

Could you please share the Python packages you have installed? Thank you.
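As a quick follow-up check (a minimal sketch, not part of the BioRED scripts), TensorFlow can report whether it actually sees a GPU once those libraries are installed:

import tensorflow as tf

# An empty list means TensorFlow still cannot find the CUDA runtime or driver.
gpus = tf.config.list_physical_devices("GPU")
print("TensorFlow", tf.__version__, "sees", len(gpus), "GPU(s):", gpus)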

berkekavak commented 1 year ago

I tried to execute the code:

1) on Windows (WSL Ubuntu 20.04 LTS)
2) on a MacBook Pro M1
3) on an Intel Mac (2017)

I got similar errors on all of those devices, with and without a GPU, all related to the labels. I suspect the problem lies with the pretrained model: the latest PubMedBERT may have compatibility issues with this code. The package versions (requirements.txt) of my environment are attached below.

Best, Berke.


absl-py==0.15.0
accelerate==0.9.0
aiohttp==3.8.3
aiosignal==1.2.0
astunparse==1.6.3
async-timeout==4.0.2
asynctest==0.13.0
attrs==22.2.0
awscli==1.24.10
blis==0.7.9
botocore==1.26.10
cached-property==1.5.2
cachetools==4.2.4
catalogue==2.0.8
certifi==2021.5.30
charset-normalizer==2.0.12
clang==5.0
click==8.0.4
colorama==0.4.4
conllu==4.5.2
contextvars==2.4
cymem==2.0.7
dataclasses==0.8
datasets==2.3.2
dill==0.3.4
docutils==0.16
en-core-sci-md @ https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.0/en_core_sci_md-0.5.0.tar.gz
filelock==3.4.1
flatbuffers==1.12
frozenlist==1.2.0
fsspec==2022.1.0
gast==0.4.0
google-auth==1.35.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
grpcio==1.48.2
h5py==3.1.0
huggingface-hub==0.4.0
idna==3.4
idna-ssl==1.1.0
immutables==0.19
importlib-metadata==4.8.3
importlib-resources==5.4.0
Jinja2==3.0.3
jmespath==0.10.0
joblib==1.1.1
keras==2.6.0
Keras-Preprocessing==1.1.2
langcodes==3.3.0
Markdown==3.3.7
MarkupSafe==2.0.1
multidict==5.2.0
multiprocess==0.70.12.2
murmurhash==1.0.9
nmslib==2.1.1
numpy==1.19.5
oauthlib==3.2.2
opt-einsum==3.3.0
packaging==21.3
pandas==1.1.5
pathy==0.10.1
preshed==3.0.8
protobuf==3.19.4
psutil==5.9.4
pyarrow==6.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pybind11==2.6.1
pydantic==1.8.2
pyparsing==3.0.9
pysbd==0.3.4
python-dateutil==2.8.2
pytz==2022.7
PyYAML==5.4.1
regex==2022.10.31
requests==2.27.1
requests-oauthlib==1.3.1
responses==0.17.0
rsa==4.7.2
s3transfer==0.5.2
sacremoses==0.0.53
scikit-learn==0.24.2
scipy==1.5.4
scispacy==0.2.4
sentencepiece==0.1.97
six==1.15.0
smart-open==6.3.0
spacy==3.2.4
spacy-legacy==3.0.11
spacy-loggers==1.0.4
srsly==2.4.5
tensorboard==2.6.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorflow==2.6.2
tensorflow-estimator==2.6.0
tensorflow-gpu==2.6.2
termcolor==1.1.0
thinc==8.0.17
threadpoolctl==3.1.0
tokenizers==0.12.1
torch==1.8.0
tqdm==4.64.1
transformers==4.18.0
typer==0.4.2
typing-extensions==3.7.4.3
urllib3==1.26.14
wasabi==0.10.1
Werkzeug==2.0.3
wrapt==1.12.1
xxhash==3.2.0
yarl==1.7.2
zipp==3.6.0

ptlai commented 1 year ago

Hi @berkekavak ,

Your packages and the latest version of https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/tree/main worked for me. Below are the environment and the message I received.

Packages:

[screenshot: installed packages, 2023-01-25 23:34]

Snippet

[screenshot: run output snippet, 2023-01-25 23:39]

However, I am unable to reproduce your error message; the code is still compatible with the latest pre-trained model. The code was tested on CentOS Linux release 7.5.1804 (Core); I have not tested it on your OS, so I am not sure whether that could be the problem.

BTW, can you run the code at https://github.com/huggingface/transformers/tree/main/examples/tensorflow/text-classification? If you can, you should also be able to run our code. Please let me know which Python and package versions you use to run it; I should then be able to test and update the BioRED code to support them. I also have another version of the BioRED code that supports Python 3.9 and recent tensorflow and transformers releases.
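A lighter-weight sanity check than the full example script is to load the local checkpoint directly, which is the same step at which run_biored_exp.py crashes in the logs above. A sketch (the directory is the local model folder from this thread; num_labels=9 mirrors the relation label set shown in the logs):

from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Local folder containing config.json, vocab.txt and pytorch_model.bin.
model_dir = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
# from_pt=True converts the PyTorch weights to TF; this is the call that
# raises UnpicklingError when pytorch_model.bin is broken or incomplete.
model = TFAutoModelForSequenceClassification.from_pretrained(
    model_dir, from_pt=True, num_labels=9
)
print("checkpoint loaded OK:", model.config.model_type)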

berkekavak commented 1 year ago

Maybe there is an incompatibility between Python 3.6 and transformers 4.18.0. I tried the transformers example code you mentioned:

(species) @.***:/mnt/c/Users/berke/Documents/transformers/examples/tensorflow/text-classification$ python run_text_classification.py
Traceback (most recent call last):
  File "run_text_classification.py", line 41, in <module>
    from transformers.utils import CONFIG_NAME, TF2_WEIGHTS_NAME, send_example_telemetry
ImportError: cannot import name 'send_example_telemetry'

I am using Python 3.6.13 to run the code with the requirements.txt mentioned above.
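The ImportError itself shows that send_example_telemetry does not exist in transformers 4.18.0; the example scripts on the transformers main branch track a newer release. A trivial check (a sketch) makes the mismatch explicit:

import transformers

print("installed transformers:", transformers.__version__)
try:
    # Only present in transformers releases newer than 4.18.0.
    from transformers.utils import send_example_telemetry  # noqa: F401
    print("send_example_telemetry available: the main-branch examples should import")
except ImportError:
    print("send_example_telemetry missing: use the examples from the v4.18.0 tag instead of main")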

Could you please send your requirements.txt and python version so that I can try creating a conda env? Could you also please send the code that supports Python 3.9?

I am attaching a screenshot of my files in case you want to check the directories. Best, Berke.


ptlai commented 1 year ago

Hi @berkekavak ,

I tested the requirements.txt that you mentioned, but with Python 3.6.15.


However, as I mentioned earlier, it works on our server. The error may be caused by CUDA or the GPU driver rather than by requirements.txt, but I am not sure. I would appreciate it if you could also try the Hugging Face transformers sample code and let me know if it works. I am traveling now and will return next Monday. After that, I should be able to send you the Python 3.9 version of biored_re.

berkekavak commented 1 year ago

The BioRED code currently available in the repo has some control-character issues and needs a slight modification. I copied the script (run_biored_exp.sh) into the biored_re directory (instead of biored_re/scripts) and then executed it with:

bash run_biored_exp.sh 0 (for my Mac)
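If the control characters are Windows-style carriage returns (an assumption, not confirmed in the thread), a one-off normalization like the following sketch rewrites the script in place:

# Strip CRLF line endings from the shell script in place.
path = "run_biored_exp.sh"
with open(path, "rb") as f:
    data = f.read()
with open(path, "wb") as f:
    f.write(data.replace(b"\r\n", b"\n"))
print("rewrote", path, "without carriage returns")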

I also tested the example code that you sent. I have been working on this issue for almost 3 weeks, so I wish we could schedule a short Zoom session. Alternatively, I can try the new BioRED code for Python 3.9 and contact you again after that.

I wish you a safe journey. Many thanks for your answers.


(tensorflow) ➜ biored_re git:(master) ✗ bash run_biored_exp.sh 0
in shell script task name: biored_all_mul
[INFO|training_args.py:1094] 2023-01-26 17:01:28,717 >> using logging_steps to initialize eval_steps to 10
[INFO|training_args.py:1230] 2023-01-26 17:01:28,717 >> The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
[INFO|training_args_tf.py:190] 2023-01-26 17:01:28,718 >> Tensorflow: setting up strategy
Metal device set to: Apple M1 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB

2023-01-26 17:01:28.718880: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-01-26 17:01:28.718898: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
01/26/2023 17:01:28 - INFO - main - n_replicas: 1, distributed training: False, 16-bits training: False
01/26/2023 17:01:28 - INFO - main - Training/evaluation parameters TFTrainingArguments( _n_gpu=-1, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, do_eval=True, do_predict=True, do_train=True, eval_accumulation_steps=None, eval_delay=0, eval_steps=10, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gcp_project=None, gradient_accumulation_steps=1, gradient_checkpointing=False, greater_is_better=None, group_by_length=False, half_precision_backend=auto, hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=length, load_best_model_at_end=False, local_rank=-1, log_level=passive, log_level_replica=passive, log_on_each_node=True, logging_dir=out_model_biored_all_mul/runs/Jan26_17-01-28_Berkes-MacBook-Pro.local, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=10, logging_strategy=steps, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, no_cuda=False, num_train_epochs=10.0, optim=adamw_hf, optim_args=None, output_dir=out_model_biored_all_mul, overwrite_output_dir=True, past_index=-1, per_device_eval_batch_size=32, per_device_train_batch_size=16, poly_power=1.0, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], resume_from_checkpoint=None, run_name=out_model_biored_all_mul, save_on_each_node=False, save_steps=10, save_strategy=steps, save_total_limit=None, seed=42, sharded_ddp=[], skip_memory_metrics=True, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torchdynamo=None, tpu_metrics_debug=False, tpu_name=None, tpu_num_cores=None, tpu_zone=None, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, xla=False, xpu_backend=None, )
[INFO|configuration_utils.py:658] 2023-01-26 17:01:28,721 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:28,723 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:28,723 >> loading file vocab.txt
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:28,723 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:28,723 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:28,724 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:28,724 >> loading file tokenizer_config.json
[INFO|configuration_utils.py:658] 2023-01-26 17:01:28,724 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:28,724 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils.py:426] 2023-01-26 17:01:28,738 >> Adding @ChemicalEntitySrc$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:28,738 >> Adding @ChemicalEntityTgt$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:28,738 >> Adding @DiseaseOrPhenotypicFeatureSrc$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:28,738 >> Adding @DiseaseOrPhenotypicFeatureTgt$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:28,738 >> Adding @GeneOrGeneProductSrc$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:28,738 >> Adding @GeneOrGeneProductTgt$ to the vocabulary
[WARNING|logging.py:281] 2023-01-26 17:01:28,738 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:658] 2023-01-26 17:01:28,739 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:28,739 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[WARNING|logging.py:281] 2023-01-26 17:01:28,748 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
=======================>label2id {'None': 0, 'Association': 1, 'Bind': 2, 'Comparison': 3, 'Conversion': 4, 'Cotreatment': 5, 'Drug_Interaction': 6, 'Negative_Correlation': 7, 'Positive_Correlation': 8}
=======================>positive_label
=======================>use_balanced_neg False
=======================>max_neg_scale 2
[INFO|configuration_utils.py:658] 2023-01-26 17:01:28,750 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:28,751 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "finetuning_task": "text-classification", "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "id2label": { "0": "None", "1": "Association", "2": "Bind", "3": "Comparison", "4": "Conversion", "5": "Cotreatment", "6": "Drug_Interaction", "7": "Negative_Correlation", "8": "Positive_Correlation" }, "initializer_range": 0.02, "intermediate_size": 3072, "label2id": { "Association": 1, "Bind": 2, "Comparison": 3, "Conversion": 4, "Cotreatment": 5, "Drug_Interaction": 6, "Negative_Correlation": 7, "None": 0, "Positive_Correlation": 8 }, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|modeling_tf_utils.py:2694] 2023-01-26 17:01:28,776 >> loading weights file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
[INFO|modeling_tf_pytorch_utils.py:168] 2023-01-26 17:01:28,859 >> Loading PyTorch weights from /Users/berkekavak/boun/biored/biored_re/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
Traceback (most recent call last):
  File "/Users/berkekavak/boun/biored/biored_re/src/run_biored_exp.py", line 795, in <module>
    main()
  File "/Users/berkekavak/boun/biored/biored_re/src/run_biored_exp.py", line 620, in main
    model = TFAutoModelForSequenceClassification.from_pretrained(
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 464, in from_pretrained
    return model_class.from_pretrained(
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/transformers/modeling_tf_utils.py", line 2744, in from_pretrained
    return load_pytorch_checkpoint_in_tf2_model(
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 169, in load_pytorch_checkpoint_in_tf2_model
    pt_state_dict.update(torch.load(pt_path, map_location="cpu"))
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/torch/serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/torch/serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
cp: out_model_biored_all_mul/test_results.tsv: No such file or directory
in shell script task name: biored_novelty
[INFO|training_args.py:1094] 2023-01-26 17:01:31,849 >> using logging_steps to initialize eval_steps to 10
[INFO|training_args.py:1230] 2023-01-26 17:01:31,849 >> The default value for the training argument --report_to will change in v5 (from all installed integrations to none). In v5, you will need to use --report_to all to get the same behavior as now. You should start updating your code and make this info disappear :-).
[INFO|training_args_tf.py:190] 2023-01-26 17:01:31,850 >> Tensorflow: setting up strategy
Metal device set to: Apple M1 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB

2023-01-26 17:01:31.850860: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-01-26 17:01:31.850876: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
01/26/2023 17:01:31 - INFO - __main__ - n_replicas: 1, distributed training: False, 16-bits training: False
01/26/2023 17:01:31 - INFO - __main__ - Training/evaluation parameters TFTrainingArguments(
_n_gpu=-1, adafactor=False, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, auto_find_batch_size=False, bf16=False, bf16_full_eval=False, data_seed=None, dataloader_drop_last=False, dataloader_num_workers=0, dataloader_pin_memory=True, ddp_bucket_cap_mb=None, ddp_find_unused_parameters=None, ddp_timeout=1800, debug=[], deepspeed=None, disable_tqdm=False, do_eval=True, do_predict=True, do_train=True,
eval_accumulation_steps=None, eval_delay=0, eval_steps=10, evaluation_strategy=steps, fp16=False, fp16_backend=auto, fp16_full_eval=False, fp16_opt_level=O1, fsdp=[], fsdp_min_num_params=0, fsdp_transformer_layer_cls_to_wrap=None, full_determinism=False, gcp_project=None, gradient_accumulation_steps=1, gradient_checkpointing=False, greater_is_better=None, group_by_length=False, half_precision_backend=auto,
hub_model_id=None, hub_private_repo=False, hub_strategy=every_save, hub_token=, ignore_data_skip=False, include_inputs_for_metrics=False, jit_mode_eval=False, label_names=None, label_smoothing_factor=0.0, learning_rate=1e-05, length_column_name=length, load_best_model_at_end=False, local_rank=-1, log_level=passive, log_level_replica=passive, log_on_each_node=True,
logging_dir=out_model_biored_novelty/runs/Jan26_17-01-31_Berkes-MacBook-Pro.local, logging_first_step=False, logging_nan_inf_filter=True, logging_steps=10, logging_strategy=steps, lr_scheduler_type=linear, max_grad_norm=1.0, max_steps=-1, metric_for_best_model=None, mp_parameters=, no_cuda=False, num_train_epochs=10.0, optim=adamw_hf, optim_args=None, output_dir=out_model_biored_novelty, overwrite_output_dir=True,
past_index=-1, per_device_eval_batch_size=32, per_device_train_batch_size=16, poly_power=1.0, prediction_loss_only=False, push_to_hub=False, push_to_hub_model_id=None, push_to_hub_organization=None, push_to_hub_token=, ray_scope=last, remove_unused_columns=True, report_to=['tensorboard'], resume_from_checkpoint=None, run_name=out_model_biored_novelty, save_on_each_node=False, save_steps=10, save_strategy=steps, save_total_limit=None, seed=42,
sharded_ddp=[], skip_memory_metrics=True, tf32=None, torch_compile=False, torch_compile_backend=None, torch_compile_mode=None, torchdynamo=None, tpu_metrics_debug=False, tpu_name=None, tpu_num_cores=None, tpu_zone=None, use_ipex=False, use_legacy_prediction_loop=False, use_mps_device=False, warmup_ratio=0.0, warmup_steps=0, weight_decay=0.0, xla=False, xpu_backend=None,
)
[INFO|configuration_utils.py:658] 2023-01-26 17:01:31,853 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:31,855 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:31,855 >> loading file vocab.txt
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:31,855 >> loading file tokenizer.json
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:31,855 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:31,855 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:1800] 2023-01-26 17:01:31,855 >> loading file tokenizer_config.json
[INFO|configuration_utils.py:658] 2023-01-26 17:01:31,855 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:31,856 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|tokenization_utils.py:426] 2023-01-26 17:01:31,870 >> Adding @ChemicalEntitySrc$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:31,870 >> Adding @ChemicalEntityTgt$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:31,870 >> Adding @DiseaseOrPhenotypicFeatureSrc$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:31,870 >> Adding @DiseaseOrPhenotypicFeatureTgt$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:31,870 >> Adding @GeneOrGeneProductSrc$ to the vocabulary
[INFO|tokenization_utils.py:426] 2023-01-26 17:01:31,870 >> Adding @GeneOrGeneProductTgt$ to the vocabulary
[WARNING|logging.py:281] 2023-01-26 17:01:31,870 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:658] 2023-01-26 17:01:31,871 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:31,871 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 3072, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[WARNING|logging.py:281] 2023-01-26 17:01:31,878 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
=======================>label2id {'None': 0, 'No': 1, 'Novel': 2}
=======================>positive_label
=======================>use_balanced_neg False
=======================>max_neg_scale 2
[INFO|configuration_utils.py:658] 2023-01-26 17:01:31,880 >> loading configuration file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/config.json
[INFO|configuration_utils.py:712] 2023-01-26 17:01:31,880 >> Model config BertConfig { "_name_or_path": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract", "architectures": [ "BertForMaskedLM" ], "attention_probs_dropout_prob": 0.1, "classifier_dropout": null, "finetuning_task": "text-classification", "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "id2label": { "0": "None", "1": "No", "2": "Novel" }, "initializer_range": 0.02, "intermediate_size": 3072, "label2id": { "No": 1, "None": 0, "Novel": 2 }, "layer_norm_eps": 1e-12, "max_position_embeddings": 512, "model_type": "bert", "num_attention_heads": 12, "num_hidden_layers": 12, "pad_token_id": 0, "position_embedding_type": "absolute", "transformers_version": "4.26.0", "type_vocab_size": 2, "use_cache": true, "vocab_size": 30522 }

[INFO|modeling_tf_utils.py:2694] 2023-01-26 17:01:31,895 >> loading weights file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
[INFO|modeling_tf_pytorch_utils.py:168] 2023-01-26 17:01:31,963 >> Loading PyTorch weights from /Users/berkekavak/boun/biored/biored_re/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
Traceback (most recent call last):
  File "/Users/berkekavak/boun/biored/biored_re/src/run_biored_exp.py", line 795, in <module>
    main()
  File "/Users/berkekavak/boun/biored/biored_re/src/run_biored_exp.py", line 620, in main
    model = TFAutoModelForSequenceClassification.from_pretrained(
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 464, in from_pretrained
    return model_class.from_pretrained(
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/transformers/modeling_tf_utils.py", line 2744, in from_pretrained
    return load_pytorch_checkpoint_in_tf2_model(
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 169, in load_pytorch_checkpoint_in_tf2_model
    pt_state_dict.update(torch.load(pt_path, map_location="cpu"))
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/torch/serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/Users/berkekavak/miniforge3/envs/tensorflow/lib/python3.9/site-packages/torch/serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
cp: out_model_biored_novelty/test_results.tsv: No such file or directory
Traceback (most recent call last):
  File "/Users/berkekavak/boun/biored/biored_re/src/utils/run_biored_eval.py", line 917, in <module>
    run_test_eval(
  File "/Users/berkekavak/boun/biored/biored_re/src/utils/run_biored_eval.py", line 879, in run_test_eval
    dump_pred_2_pubtator_file(in_gold_pubtator_file = in_gold_pubtator_file,
  File "/Users/berkekavak/boun/biored/biored_re/src/utils/run_biored_eval.py", line 189, in dump_pred_2_pubtator_file
    pmids = sorted(list(pmid_2_rel_pairs_dict.keys()), reverse=True)
AttributeError: 'NoneType' object has no attribute 'keys'
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
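For readers following along, the call that fails here is the on-the-fly PyTorch-to-TensorFlow conversion inside from_pretrained. A minimal sketch of that step (the num_labels=9 and from_pt=True arguments are assumptions based on the label map printed above, not necessarily the exact call in run_biored_exp.py):

from transformers import TFAutoModelForSequenceClassification

# torch.load must be able to unpickle pytorch_model.bin for this to succeed;
# the UnpicklingError above is raised inside this call.
model = TFAutoModelForSequenceClassification.from_pretrained(
    "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract",  # local model dir from the log
    num_labels=9,   # the nine BioRED relation labels printed above
    from_pt=True,   # convert the PyTorch checkpoint to TensorFlow weights
)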

berkekavak commented 1 year ago

Hi,

I am able to run the Hugging Face example code; I hope you returned safely.

Could you please send me the python 3.9 version of biored?

Best, Berke.

On Thu, Jan 26, 2023 at 5:05 PM BERKE KAVAK @.***> wrote:

The biored code currently available in the repo has some control-character issues and needs a slight modification. I copied the script (run_biored_exp.sh) into the biored_re directory (instead of biored_re/scripts) and then executed it with:

bash run_biored_exp.sh 0 (for my Mac)
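If the control characters in question are Windows CRLF line endings (an assumption; that is the usual culprit when bash rejects a checked-out script), a few lines of Python are enough to normalize the file:

from pathlib import Path

# Rewrite the script with Unix line endings so bash stops tripping over \r.
script = Path("run_biored_exp.sh")  # adjust the path to wherever the script lives
script.write_bytes(script.read_bytes().replace(b"\r\n", b"\n"))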

I also tested the example code that you sent. I have been working on this issue for almost 3 weeks. I wish we could schedule a short Zoom session, or I can try the new biored code for Python 3.9 and contact you again after that.

I wish you a safe journey. Many thanks for your answers.

On Wed, Jan 25, 2023 at 8:35 PM ptlai @.***> wrote:

Hi @berkekavak ,

I tested the requirements.txt that you mentioned, but I used python 3.6.15.

I tried to execute the code:
1) on Windows (WSL Ubuntu 20.04 LTS)
2) on a MacBook Pro M1
3) on an Intel Mac (2017)
I got similar errors on all of those devices, with and without a GPU, and they are all about the labels. I guess the problem is related to the pretrained packages; the latest PubMedBERT and compatibility issues with this model might be the cause. Please find the Python package versions (requirements.txt) of my environment attached below.

Best, Berke.

On Wed, Jan 25, 2023 at 3:31 AM ptlai wrote:

Hi @berkekavak ,

Thanks. There is another problem I found:

Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory

You can try the following commands if you are unable to access the GPU:

conda install -c conda-forge cudatoolkit=11.1
conda install -c conda-forge cudnn=8.2.1

However, the GPU error does not appear to be the cause of the error below.

Traceback (most recent call last):
  File "src/run_biored_exp.py", line 795, in <module>
    main()
  File "src/run_biored_exp.py", line 624, in main
    cache_dir = model_args.cache_dir,
  File "/home/berkekavak/miniconda3/envs/species/lib/python3.6/site-packages/transformers/models/auto/auto_factory.py", line 446, in from_pretrained

Could you please share the Python packages you have installed? Thank you.

Attached requirements.txt:
absl-py==0.15.0 accelerate==0.9.0 aiohttp==3.8.3 aiosignal==1.2.0 astunparse==1.6.3 async-timeout==4.0.2 asynctest==0.13.0 attrs==22.2.0 awscli==1.24.10 blis==0.7.9 botocore==1.26.10 cached-property==1.5.2 cachetools==4.2.4 catalogue==2.0.8 certifi==2021.5.30 charset-normalizer==2.0.12 clang==5.0 click==8.0.4 colorama==0.4.4 conllu==4.5.2 contextvars==2.4 cymem==2.0.7 dataclasses==0.8 datasets==2.3.2 dill==0.3.4 docutils==0.16
en-core-sci-md @ https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.0/en_core_sci_md-0.5.0.tar.gz
filelock==3.4.1 flatbuffers==1.12 frozenlist==1.2.0 fsspec==2022.1.0 gast==0.4.0 google-auth==1.35.0 google-auth-oauthlib==0.4.6 google-pasta==0.2.0 grpcio==1.48.2 h5py==3.1.0 huggingface-hub==0.4.0 idna==3.4 idna-ssl==1.1.0 immutables==0.19 importlib-metadata==4.8.3 importlib-resources==5.4.0 Jinja2==3.0.3 jmespath==0.10.0 joblib==1.1.1 keras==2.6.0 Keras-Preprocessing==1.1.2 langcodes==3.3.0 Markdown==3.3.7 MarkupSafe==2.0.1 multidict==5.2.0 multiprocess==0.70.12.2 murmurhash==1.0.9 nmslib==2.1.1 numpy==1.19.5 oauthlib==3.2.2 opt-einsum==3.3.0 packaging==21.3 pandas==1.1.5 pathy==0.10.1 preshed==3.0.8 protobuf==3.19.4 psutil==5.9.4 pyarrow==6.0.1 pyasn1==0.4.8 pyasn1-modules==0.2.8 pybind11==2.6.1 pydantic==1.8.2 pyparsing==3.0.9 pysbd==0.3.4 python-dateutil==2.8.2 pytz==2022.7 PyYAML==5.4.1 regex==2022.10.31 requests==2.27.1 requests-oauthlib==1.3.1 responses==0.17.0 rsa==4.7.2 s3transfer==0.5.2 sacremoses==0.0.53 scikit-learn==0.24.2 scipy==1.5.4 scispacy==0.2.4 sentencepiece==0.1.97 six==1.15.0 smart-open==6.3.0 spacy==3.2.4 spacy-legacy==3.0.11 spacy-loggers==1.0.4 srsly==2.4.5 tensorboard==2.6.0 tensorboard-data-server==0.6.1 tensorboard-plugin-wit==1.8.1 tensorflow==2.6.2 tensorflow-estimator==2.6.0 tensorflow-gpu==2.6.2 termcolor==1.1.0 thinc==8.0.17 threadpoolctl==3.1.0 tokenizers==0.12.1 torch==1.8.0 tqdm==4.64.1 transformers==4.18.0 typer==0.4.2 typing-extensions==3.7.4.3 urllib3==1.26.14 wasabi==0.10.1 Werkzeug==2.0.3 wrapt==1.12.1 xxhash==3.2.0 yarl==1.7.2 zipp==3.6.0
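When comparing environments like the list above, a short script keeps the diff focused on the packages this repo actually pins (a sketch; importlib.metadata needs Python 3.8+):

import importlib.metadata as md

# Print the installed versions of the packages the BioRED requirements pin.
for pkg in ["transformers", "tensorflow", "torch", "numpy", "pandas",
            "datasets", "spacy", "scispacy", "typing-extensions"]:
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")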

However, as I mentioned earlier, it works on our server. The error may be caused by CUDA or the GPU driver rather than requirements.txt, but I am not sure. I would appreciate it if you could also try the Hugging Face Transformers sample code and let me know if it works (see the sketch below). I am traveling now, and I will return next Monday. After that, I should be able to send you the Python 3.9 version of biored_re.

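The sanity check mentioned above can be as small as this (a sketch; the pipeline downloads a default sentiment model on first use, so it also exercises the download path):

from transformers import pipeline

# If even this minimal example fails, the environment is broken,
# not the BioRED code.
classifier = pipeline("sentiment-analysis")
print(classifier("This transformers install appears to work."))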

ptlai commented 1 year ago

Hi @berkekavak ,

Thank you and sorry for the late reply. Please find the attached file, and let me know if you still have the same problem. Thanks! biored_re_py39.zip

Best, Po-Ting

berkekavak commented 1 year ago

Hi Po-Ting,

I am very happy that you sent the new version; thank you for your great help. Yesterday I also tried this on an Intel-based Mac and got the exact same error message as attached (the TFAutoModelForSequenceClassification error is identical to the one from the previous version of biored). The PyTorch weights somehow cannot be loaded.

I downloaded the model by creating a microsoft directory and running the following inside it: git clone https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract

I am not sure where the problem is, but could you please send me your whole directory (with the PubMedBERT model)? Maybe then it will work.

Sincerely, Berke.


[INFO|modeling_tf_utils.py:1776] 2023-02-03 12:35:49,344 >> loading weights file microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
[INFO|modeling_tf_pytorch_utils.py:119] 2023-02-03 12:35:49,429 >> Loading PyTorch weights from /mnt/c/Users/berke/Documents/boun/biored/biored_re_py39/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin
Traceback (most recent call last):
  File "/mnt/c/Users/berke/Documents/boun/biored/biored_re_py39/src/run_biored_exp.py", line 795, in <module>
    main()
  File "/mnt/c/Users/berke/Documents/boun/biored/biored_re_py39/src/run_biored_exp.py", line 620, in main
    model = TFAutoModelForSequenceClassification.from_pretrained(
  File "/home/berkekavak/miniconda3/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 446, in from_pretrained
    return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
  File "/home/berkekavak/miniconda3/lib/python3.9/site-packages/transformers/modeling_tf_utils.py", line 1796, in from_pretrained
    return load_pytorch_checkpoint_in_tf2_model(model, resolved_archive_file, allow_missing_keys=True)
  File "/home/berkekavak/miniconda3/lib/python3.9/site-packages/transformers/modeling_tf_pytorch_utils.py", line 121, in load_pytorch_checkpoint_in_tf2_model
    pt_state_dict = torch.load(pt_path, map_location="cpu")
  File "/home/berkekavak/miniconda3/lib/python3.9/site-packages/torch/serialization.py", line 795, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/berkekavak/miniconda3/lib/python3.9/site-packages/torch/serialization.py", line 1002, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.
cp: cannot stat 'out_model_biored_novelty/test_results.tsv': No such file or directory
Traceback (most recent call last):
  File "/mnt/c/Users/berke/Documents/boun/biored/biored_re_py39/src/utils/run_biored_eval.py", line 917, in <module>
    run_test_eval(
  File "/mnt/c/Users/berke/Documents/boun/biored/biored_re_py39/src/utils/run_biored_eval.py", line 879, in run_test_eval
    dump_pred_2_pubtator_file(in_gold_pubtator_file = in_gold_pubtator_file,
  File "/mnt/c/Users/berke/Documents/boun/biored/biored_re_py39/src/utils/run_biored_eval.py", line 189, in dump_pred_2_pubtator_file
    pmids = sorted(list(pmid_2_rel_pairs_dict.keys()), reverse=True)
AttributeError: 'NoneType' object has no attribute 'keys'
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
datasets/biored/BioRED/Test.PubTator biored_pred_mul.txt
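An invalid load key, 'v' from torch.load is the classic signature of a Git LFS pointer file: cloning a Hugging Face model repo without git-lfs leaves pytorch_model.bin as a small text file that begins with "version https://git-lfs...", which is exactly where the stray 'v' comes from. A minimal check (a sketch, using the local path from the log above):

# A real checkpoint begins with b"PK" (zip archive) or b"\x80" (legacy pickle);
# an LFS pointer is plain text.
path = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract/pytorch_model.bin"
with open(path, "rb") as f:
    head = f.read(64)
if head.startswith(b"version https://git-lfs"):
    print("Only an LFS pointer was cloned; install git-lfs and run `git lfs pull`.")
else:
    print("File begins with binary data as expected:", head[:8])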

ptlai commented 1 year ago

Hi @berkekavak ,

Thank you! Could you please send me your email address? I will send you the link. BTW, I recommend you try GCloud or a regular Linux server if you are able, as the code has not been tested on Mac.

Best, Po-Ting

berkekavak commented 1 year ago

@.***

Thanks, Berke.



ptlai commented 1 year ago

Hello @berkekavak , Your email address is not visible to me; could you please send it to laip2@nih.gov again? Thank you!

ptlai commented 1 year ago

I recently updated our script so that users can utilize our pre-trained model to predict new data in PubTator format. Instructions can be found in the README file at https://ftp.ncbi.nlm.nih.gov/pub/lu/BioRED/biored_re_source_code.tar.
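For reference, fetching and unpacking that archive takes only a few lines (a sketch; the URL is the one given above):

import tarfile
import urllib.request

url = "https://ftp.ncbi.nlm.nih.gov/pub/lu/BioRED/biored_re_source_code.tar"
urllib.request.urlretrieve(url, "biored_re_source_code.tar")
# The README with the prediction instructions is inside the unpacked directory.
with tarfile.open("biored_re_source_code.tar") as tar:
    tar.extractall("biored_re_source_code")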

Regarding a previous email: the question raised by @berkekavak has shifted to NER in BioRED. Assuming there are no further questions, it is understood that Ling has resolved the issue regarding the use of AIONER in BioRED.

Khyati-Microcrispr commented 4 months ago

Hi, I'm trying to run bash scripts/run_test_pred.sh 0 and I am getting the error below. I used:

transformers == 4.18.0
accelerate == 0.9.0
pandas == 1.1.5
numpy == 1.19.5
datasets == 2.3.2
sentencepiece != 0.1.92
protobuf == 3.19.4
spacy == 3.2.4
scispacy == 0.2.4
tensorflow-gpu == 2.6.2

ERROR:
Converting the dataset into BioRED-RE input format
Traceback (most recent call last):
  File "/home/microcrispr9/Downloads/biored_re_source_code/src/dataset_format_converter/convert_pubtator_2_bert.py", line 14, in <module>
    import spacy
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/spacy/__init__.py", line 11, in <module>
    from thinc.api import prefer_gpu, require_gpu, require_cpu  # noqa: F401
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/thinc/api.py", line 2, in <module>
    from .initializers import normal_init, uniform_init, glorot_uniform_init, zero_init
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/thinc/initializers.py", line 4, in <module>
    from .backends import Ops
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/thinc/backends/__init__.py", line 10, in <module>
    from ._cupy_allocators import cupy_tensorflow_allocator, cupy_pytorch_allocator
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/thinc/backends/_cupy_allocators.py", line 12, in <module>
    import torch
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/__init__.py", line 465, in <module>
    for name in dir(_C):
NameError: name '_C' is not defined
Generating RE and novelty predictions
Traceback (most recent call last):
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 857, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/trainer_utils.py", line 43, in <module>
    import torch
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/__init__.py", line 1429, in <module>
    from torch import optim as optim
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/optim/__init__.py", line 8, in <module>
    from .adadelta import Adadelta
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/optim/adadelta.py", line 4, in <module>
    from .optimizer import (Optimizer, _use_grad_for_differentiable, _default_to_fused_or_foreach,
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/optim/optimizer.py", line 23, in <module>
    from typing_extensions import ParamSpec, Self, TypeAlias
ImportError: cannot import name 'ParamSpec' from 'typing_extensions' (/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/typing_extensions.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/microcrispr9/Downloads/biored_re_source_code/src/run_biored_exp.py", line 35, in <module>
    from transformers import (
  File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 847, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 859, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.trainer_utils because of the following error (look up to see its traceback):
cannot import name 'ParamSpec' from 'typing_extensions' (/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/typing_extensions.py)
cp: cannot stat 'out_model_biored_all_mul/test_results.tsv': No such file or directory
Traceback (most recent call last):
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 857, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/trainer_utils.py", line 43, in <module>
    import torch
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/__init__.py", line 1429, in <module>
    from torch import optim as optim
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/optim/__init__.py", line 8, in <module>
    from .adadelta import Adadelta
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/optim/adadelta.py", line 4, in <module>
    from .optimizer import (Optimizer, _use_grad_for_differentiable, _default_to_fused_or_foreach,
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/torch/optim/optimizer.py", line 23, in <module>
    from typing_extensions import ParamSpec, Self, TypeAlias
ImportError: cannot import name 'ParamSpec' from 'typing_extensions' (/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/typing_extensions.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/microcrispr9/Downloads/biored_re_source_code/src/run_biored_exp.py", line 35, in <module>
    from transformers import (
  File "<frozen importlib._bootstrap>", line 1055, in _handle_fromlist
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 847, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 859, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.trainer_utils because of the following error (look up to see its traceback):
cannot import name 'ParamSpec' from 'typing_extensions' (/home/microcrispr9/anaconda3/envs/biored_re/lib/python3.9/site-packages/typing_extensions.py)
cp: cannot stat 'out_model_biored_novelty/test_results.tsv': No such file or directory
Generating PubTator file
Traceback (most recent call last):
  File "/home/microcrispr9/Downloads/biored_re_source_code/src/utils/run_biored_eval.py", line 910, in <module>
    dump_pred_2_pubtator_file(in_test_pubtator_file = in_test_pubtator_file,
  File "/home/microcrispr9/Downloads/biored_re_source_code/src/utils/run_biored_eval.py", line 197, in dump_pred_2_pubtator_file
    pmids = sorted(list(pmid_2_rel_pairs_dict.keys()), reverse=True)
AttributeError: 'NoneType' object has no attribute 'keys'

Kindly help me run these files.
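Both failures in the log above are environment-level rather than BioRED-specific: the installed torch build imports ParamSpec from typing_extensions, which older typing_extensions releases do not provide, and the earlier NameError on _C likewise points at a broken torch install. A minimal compatibility check, as a sketch to run inside the same conda environment:

import importlib.metadata as md

# torch.optim needs ParamSpec; typing_extensions only ships it in newer releases.
print("typing_extensions", md.version("typing-extensions"))
try:
    from typing_extensions import ParamSpec
    print("ParamSpec found:", ParamSpec)
except ImportError:
    print("Too old for this torch build; try: pip install --upgrade typing_extensions")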