Closed (bbardakk closed this issue 3 years ago)
Hi @EmilyAlsentzer,
I tried to extract features as you suggested but ran into a problem. When I run the original BERT example below, everything works fine.
echo 'Who was Jim Henson ? ||| Jim Henson was a puppeteer' > /tmp/input.txt

python extract_features.py \
  --input_file=/tmp/input.txt \
  --output_file=/tmp/output.jsonl \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --layers=-1,-2,-3,-4 \
  --max_seq_length=128 \
  --batch_size=8
I changed the bert_config_file and init_checkpoint arguments and ran the command below.
python extract_features.py --input_file=/tmp/input.txt --output_file=/tmp/output.jsonl --vocab_file=bert_pretrain_output_all_notes_150000/vocab.txt --bert_config_file=bert_pretrain_output_all_notes_150000/bert_config.json --init_checkpoint=bert_pretrain_output_all_notes_150000/model.ckpt --layers=-1,-2,-3,-4 --max_seq_length=128
I got the error message below. I think the problem is with the init_checkpoint argument. I tried different names such as "model.ckpt" and "model.ckpt-150000", but none of them worked.
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for bert_pretrain_output_all_notes_150000/model.ckpt
Could you please help me run ClinicalBert to extract features from clinical notes? Also, is it possible to use ClinicalBert to extract embeddings for each word in clinical notes? Thanks in advance.
Hello,
Did you figure out your error by any chance? There are three checkpoint files and I am not sure which one to use. Could you please guide me if you can? Thank you!
This is a duplicate of issue #12. Please refer to that issue.
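For anyone landing here with the same NotFoundError: a TensorFlow checkpoint is normally saved as several files that share one prefix (typically an .index file, a .meta file, and one or more .data-* shards), and init_checkpoint expects that shared prefix rather than any single file name. The snippet below is only a minimal sketch for listing candidate prefixes in a model directory. The function name find_checkpoint_prefixes is hypothetical, and the sketch assumes the downloaded directory follows the usual .index naming; it does not confirm which files the ClinicalBert download actually contains.

```python
import os
import sys


def find_checkpoint_prefixes(model_dir):
    """List candidate checkpoint prefixes found in model_dir.

    Assumption: each checkpoint has a '<prefix>.index' file, so the
    prefix is that file name with the '.index' suffix removed.
    """
    suffix = ".index"
    prefixes = []
    for name in sorted(os.listdir(model_dir)):
        if name.endswith(suffix):
            prefixes.append(os.path.join(model_dir, name[: -len(suffix)]))
    return prefixes


if __name__ == "__main__":
    # Directory to inspect; defaults to the one used in the command above.
    model_dir = sys.argv[1] if len(sys.argv) > 1 else "bert_pretrain_output_all_notes_150000"
    for prefix in find_checkpoint_prefixes(model_dir):
        # Each printed path is a value to try for --init_checkpoint.
        print(prefix)
```

If nothing is printed, the directory has no .index file, which would suggest that the path passed to --vocab_file and --init_checkpoint does not point at the extracted model folder (for example, a relative path resolved from a different working directory).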