Open scherbakovdmitri opened 10 months ago
The predictions.json file should be generated by mSpERT. If it is missing, there is likely an error occurring in the model training or inference. I see you encountered the error “Failed to load image Python extension” earlier in the run. I have not encountered this error before. My 2-minute troubleshooting makes me think this is a PyTorch or CUDA issue.
Best regards,
Kevin Lybarger, Ph.D., Assistant Professor, Department of Information Sciences and Technology, College of Engineering & Computing, George Mason University https://www.kevinlybarger.me/
From: scherbakovdmitri Sent: Wednesday, November 15, 2023 5:28 PM To: Lybarger/sdoh_extraction Cc: Subscribed Subject: [Lybarger/sdoh_extraction] predictions.json file question (Issue #1)
Hello, I am testing predict mode, but I am not sure what the missing predictions.json file is or where I can get it from. I placed two .txt files in the data folder and ran this command (Python 3.8.17):
python infer_mspert.py --source_dir /sdoh_extraction/data/ --destination /sdoh_extraction/output/predict --mspert_path /mspert --mode predict --model_path /mspert_sdoh_model/
Text import: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 268.30it/s]
WARNING:root:events2spert
OrderedDict([('model_path', '/mspert_sdoh_model/'), ('tokenizer_path', '/mspert_sdoh_model/'), ('eval_batch_size', 2), ('rel_filter_threshold', 0.5), ('max_span_size', 10), ('store_predictions', True), ('store_examples', True), ('sampling_processes', 4), ('max_pairs', 1000), ('no_overlapping', True), ('device', 0), ('dataset_path', '/sdoh_extraction/output/predict/data_eval.json'), ('log_path', '/sdoh_extraction/output/predict/log/')])
/sdoh_extraction/output/predict/config.conf
/usr/local/lib/python3.8/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
Config:
{'model_path': '/mspert_sdoh_model/', 'tokenizer_path': '/mspert_sdoh_model/', 'eval_batch_size': '2', 'rel_filter_threshold': '0.5', 'max_span_size': '10', 'store_predictions': 'True', 'store_examples': 'True', 'sampling_processes': '4', 'max_pairs': '1000', 'no_overlapping': 'True', 'device': '0', 'dataset_path': '/sdoh_extraction/output/predict/data_eval.json', 'log_path': '/sdoh_extraction/output/predict/log/'}
Repeat 1 times
Iteration 0
/usr/local/lib/python3.8/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
2023-11-15 22:23:17,387 [MainThread ] [INFO ]
2023-11-15 22:23:17,387 [MainThread ] [INFO ] --------------------------------------------------------------------------------
2023-11-15 22:23:17,387 [MainThread ] [INFO ] Loading saved config
2023-11-15 22:23:17,387 [MainThread ] [INFO ] --------------------------------------------------------------------------------
2023-11-15 22:23:17,388 [MainThread ] [INFO ] Configuration files found:
2023-11-15 22:23:17,389 [MainThread ] [INFO ] config_path: /mspert_sdoh_model/spert_args.json
2023-11-15 22:23:17,389 [MainThread ] [INFO ] types_path: /mspert_sdoh_model/spert_types.json
2023-11-15 22:23:17,389 [MainThread ] [INFO ] Arguments defined at input:
2023-11-15 22:23:17,389 [MainThread ] [INFO ] dataset_path = /sdoh_extraction/output/predict/data_eval.json
2023-11-15 22:23:17,389 [MainThread ] [INFO ] config = /sdoh_extraction/output/predict/config.conf
2023-11-15 22:23:17,389 [MainThread ] [INFO ] model_path = /mspert_sdoh_model/
2023-11-15 22:23:17,389 [MainThread ] [INFO ] tokenizer_path = /mspert_sdoh_model/
2023-11-15 22:23:17,389 [MainThread ] [INFO ] max_span_size = 10
2023-11-15 22:23:17,389 [MainThread ] [INFO ] sampling_processes = 4
2023-11-15 22:23:17,389 [MainThread ] [INFO ] cpu = False
2023-11-15 22:23:17,389 [MainThread ] [INFO ] eval_batch_size = 2
2023-11-15 22:23:17,389 [MainThread ] [INFO ] max_pairs = 1000
2023-11-15 22:23:17,389 [MainThread ] [INFO ] rel_filter_threshold = 0.5
2023-11-15 22:23:17,389 [MainThread ] [INFO ] no_overlapping = True
2023-11-15 22:23:17,389 [MainThread ] [INFO ] device = 0
2023-11-15 22:23:17,389 [MainThread ] [INFO ] seed = None
2023-11-15 22:23:17,389 [MainThread ] [INFO ] cache_path = None
2023-11-15 22:23:17,389 [MainThread ] [INFO ] debug = False
2023-11-15 22:23:17,389 [MainThread ] [INFO ] label = None
2023-11-15 22:23:17,389 [MainThread ] [INFO ] log_path = /sdoh_extraction/output/predict/log/
2023-11-15 22:23:17,389 [MainThread ] [INFO ] store_predictions = True
2023-11-15 22:23:17,389 [MainThread ] [INFO ] store_examples = True
2023-11-15 22:23:17,389 [MainThread ] [INFO ] example_count = None
2023-11-15 22:23:17,389 [MainThread ] [INFO ] Arguments defined from saved config:
2023-11-15 22:23:17,389 [MainThread ] [INFO ] (not all are relevant to evaluation or prediction)
2023-11-15 22:23:17,389 [MainThread ] [INFO ] train_path = /saves/save/data_train.json
2023-11-15 22:23:17,389 [MainThread ] [INFO ] valid_path = /saves/save/data_valid.json
2023-11-15 22:23:17,389 [MainThread ] [INFO ] save_path = /saves/save
2023-11-15 22:23:17,389 [MainThread ] [INFO ] init_eval = False
2023-11-15 22:23:17,389 [MainThread ] [INFO ] save_optimizer = False
2023-11-15 22:23:17,389 [MainThread ] [INFO ] train_log_iter = 100
2023-11-15 22:23:17,389 [MainThread ] [INFO ] final_eval = True
2023-11-15 22:23:17,389 [MainThread ] [INFO ] train_batch_size = 15
2023-11-15 22:23:17,389 [MainThread ] [INFO ] epochs = 14
2023-11-15 22:23:17,389 [MainThread ] [INFO ] neg_entity_count = 100
2023-11-15 22:23:17,389 [MainThread ] [INFO ] neg_relation_count = 100
2023-11-15 22:23:17,390 [MainThread ] [INFO ] lr = 5e-05
2023-11-15 22:23:17,390 [MainThread ] [INFO ] lr_warmup = 0.1
2023-11-15 22:23:17,390 [MainThread ] [INFO ] weight_decay = 0.01
2023-11-15 22:23:17,390 [MainThread ] [INFO ] max_grad_norm = 1.0
2023-11-15 22:23:17,390 [MainThread ] [INFO ] model_type = spert
2023-11-15 22:23:17,390 [MainThread ] [INFO ] lowercase = False
2023-11-15 22:23:17,390 [MainThread ] [INFO ] types_path = /saves/save/spert_types.json
2023-11-15 22:23:17,390 [MainThread ] [INFO ] size_embedding = 25
2023-11-15 22:23:17,390 [MainThread ] [INFO ] subtype_classification = concat_logits
2023-11-15 22:23:17,390 [MainThread ] [INFO ] include_sent_task = False
2023-11-15 22:23:17,390 [MainThread ] [INFO ] include_word_piece_task = False
2023-11-15 22:23:17,390 [MainThread ] [INFO ] concat_sent_pred = False
2023-11-15 22:23:17,390 [MainThread ] [INFO ] concat_word_piece_logits = False
2023-11-15 22:23:17,390 [MainThread ] [INFO ] include_adjacent = False
2023-11-15 22:23:17,390 [MainThread ] [INFO ] prop_drop = 0.2
2023-11-15 22:23:17,390 [MainThread ] [INFO ] freeze_transformer = False
2023-11-15 22:23:17,390 [MainThread ] [INFO ] Arguments overwritten:
2023-11-15 22:23:17,390 [MainThread ] [INFO ] types_path = /mspert_sdoh_model/spert_types.json
2023-11-15 22:23:17,390 [MainThread ] [INFO ] --------------------------------------------------------------------------------
2023-11-15 22:23:17,442 [MainThread ] [INFO ] Dataset: /sdoh_extraction/output/predict/data_eval.json
2023-11-15 22:23:17,442 [MainThread ] [INFO ] Model: spert
Parse dataset 'test': 0%| | 0/2 [00:00<?, ?it/s]
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
Process SpawnProcess-1:
self.run()
File "/usr/local/lib/python3.8/site-packages/tensorboardX/event_file_writer.py", line 202, in run
data = self._queue.get(True, queue_wait_duration)
File "/usr/local/lib/python3.8/multiprocessing/queues.py", line 111, in get
res = self._recv_bytes()
File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/usr/local/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
Traceback (most recent call last):
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/mspert/spert.py", line 29, in __eval
trainer.eval(dataset_path=run_args.dataset_path, input_reader_cls=input_reader.JsonInputReader)
File "/mspert/spert/spert_trainer.py", line 202, in eval
test_dataset = input_reader.read(dataset_path, dataset_label)
File "/mspert/spert/input_reader.py", line 226, in read
self._parse_dataset(dataset_path, dataset)
File "/mspert/spert/input_reader.py", line 234, in _parse_dataset
self._parse_document(document, dataset)
File "/mspert/spert/input_reader.py", line 262, in _parse_document
document = dataset.create_document( \
TypeError: create_document() missing 1 required positional argument: 'doc_id'
out 0
Traceback (most recent call last):
File "infer_mspert.py", line 259, in <module>
sys.exit(main(args))
File "infer_mspert.py", line 167, in main
merge_spert_files(model_config["dataset_path"], predict_file, merged_file)
File "/sdoh_extraction/spert_utils/spert_io.py", line 1206, in merge_spert_files
predict = json.load(open(predict_file, 'r'))
FileNotFoundError: [Errno 2] No such file or directory: '/sdoh_extraction/output/predict/log/predictions.json'
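The final FileNotFoundError in this log is a downstream symptom: merge_spert_files tries to open predictions.json, which the failed step earlier in the run never wrote. A minimal sketch of a guard that would surface this more directly is below. The helper name load_predictions is hypothetical and is not part of sdoh_extraction.

```python
import json
from pathlib import Path


def load_predictions(predict_file):
    """Load a predictions file, failing with a hint if it was never written.

    Hypothetical helper for illustration; not part of sdoh_extraction.
    """
    path = Path(predict_file)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} was not produced; check the log before this point "
            "for an earlier error in the inference step."
        )
    # A context manager closes the file handle even if parsing fails.
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

With a check like this, the message points back to the earlier failure instead of suggesting that the file has to be supplied by the user.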
Thank you, Dr. Lybarger. CUDA is surely an issue, since I am running this in a container on a Mac without CUDA support, but I thought I could still run prediction without CUDA. I will check again to see why predictions.json is not generated. I am using an already trained model, so maybe CUDA is the issue here.
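Whether inference can fall back to CPU depends on how the scripts pass the device setting through. The log shows cpu = False and device = 0, but nothing in this thread says whether infer_mspert.py exposes a way to change them. As a generic check that is not specific to mSpERT, the sketch below reports which device PyTorch would be able to use. The function name pick_device is hypothetical.

```python
def pick_device(requested_index=0, cuda_available=None):
    """Return a PyTorch-style device string, falling back to "cpu".

    Hypothetical helper for illustration. If cuda_available is None,
    the function asks PyTorch, when PyTorch is installed.
    """
    if cuda_available is None:
        try:
            import torch  # optional; only needed for auto-detection
            cuda_available = torch.cuda.is_available()
        except ImportError:
            cuda_available = False
    return f"cuda:{requested_index}" if cuda_available else "cpu"
```

Calling pick_device() inside the container would show whether a GPU is visible at all before looking for other causes.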
One other idea: the "Failed to load image Python extension" warning appears to be coming from TorchVision, which we don't use to my knowledge. Not sure, but it may be benign.
I also noticed the error "TypeError: create_document() missing 1 required positional argument: 'doc_id'", which I suspect may be the true culprit and the reason the predictions.json file is not being produced.
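A TypeError of this form generally means the caller and the callee disagree on the function signature. One possible cause, which is an assumption here and not confirmed in the thread, is two pieces of code being at different versions. The sketch below uses Python's inspect module to list which required parameters a call would leave unfilled. It runs against a stand-in function with a made-up signature, not the real mSpERT create_document.

```python
import inspect


def missing_required(func, *args, **kwargs):
    """Return names of required parameters that the given call leaves unfilled."""
    sig = inspect.signature(func)
    bound = sig.bind_partial(*args, **kwargs)
    missing = []
    for name, param in sig.parameters.items():
        # *args / **kwargs catch-alls are never "missing".
        if param.kind in (inspect.Parameter.VAR_POSITIONAL,
                          inspect.Parameter.VAR_KEYWORD):
            continue
        if param.default is inspect.Parameter.empty and name not in bound.arguments:
            missing.append(name)
    return missing


# Stand-in with a made-up signature; the real mSpERT method may differ.
def create_document(doc_id, tokens, entities=None):
    return {"doc_id": doc_id, "tokens": tokens, "entities": entities or []}


print(missing_required(create_document, tokens=["a", "b"]))  # prints ['doc_id']
```

For a method, passing the bound method (for example dataset.create_document on an instance) keeps self out of the result, since inspect.signature omits it for bound methods.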
Thanks, Nic. I will get access to a CUDA machine soon and test it. I am not sure why predictions.json is not generated. Maybe I need to reinstall mSpERT and then try a minimal example (the SpERT repo probably has one) to see if it generates predictions.json. I will let you know how the test on the CUDA machine goes next week. Thanks.