Open Joerg99 opened 5 years ago
OK, so is prediction mode working? I ask because if I start it from the script with
--do_train=False \ --do_predict=True \ --do_eval=False \
this line causes an error, whereas
--do_train=False \ --do_predict=True \ --do_eval=True \
works.
Hi, but what if I want to test performance on new data with no labels? Is that possible? Going through your code, I made some changes to exclude label_ids from being used in prediction, but I haven't succeeded yet. Any help would be appreciated!
PS: Also, why are you passing label_ids as a feature?
@anupamsingh610 I wanted to do the same thing as you. After changing the exception raised in the main function, and creating a dummy list of label_ids right before the label_ids list is passed to InputFeatures in the convert_single_example function, the script runs with do_eval as False and do_predict as True, and writes the predicted label for each token to an output file. It took a little tweaking to get it working right, but that's the gist of it. Providing a dummy list will not change the output of the trained model as long as you are not training.
To answer your PS, the label_ids are passed as a feature because that is how the class is implemented in the original Google script. Passing the label_ids in the InputFeatures class lets the model access them during training.
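A minimal sketch of the dummy-label workaround described above, for anyone trying the same thing. This is not the repository's code: InputFeatures here is a simplified stand-in, and build_label_ids, dummy_id, and the example ids are hypothetical names and values used only for illustration.

```python
from typing import Dict, List, Optional


class InputFeatures:
    """Simplified stand-in for the script's InputFeatures container."""

    def __init__(self, input_ids: List[int], input_mask: List[int],
                 segment_ids: List[int], label_ids: List[int]) -> None:
        self.input_ids = input_ids
        self.input_mask = input_mask
        self.segment_ids = segment_ids
        self.label_ids = label_ids


def build_label_ids(max_seq_length: int,
                    labels: Optional[List[str]] = None,
                    label_map: Optional[Dict[str, int]] = None,
                    dummy_id: int = 0) -> List[int]:
    """Return padded label ids, or an all-dummy list when no labels exist."""
    if labels is None or label_map is None:
        # Prediction on unlabeled data: the values are placeholders that
        # only keep the feature shapes consistent (assumed behaviour).
        return [dummy_id] * max_seq_length
    ids = [label_map[label] for label in labels][:max_seq_length]
    # Pad with the dummy id up to the fixed sequence length.
    return ids + [dummy_id] * (max_seq_length - len(ids))


# Unlabeled example: three real tokens padded to length 5.
features = InputFeatures(
    input_ids=[101, 7592, 102, 0, 0],
    input_mask=[1, 1, 1, 0, 0],
    segment_ids=[0, 0, 0, 0, 0],
    label_ids=build_label_ids(max_seq_length=5),
)
print(features.label_ids)
```

The placeholder ids are only safe when do_train is False; if they reached a training step they would be treated as real targets.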
@saverymax Thanks for the reply! I figured the same and provided [CLS] as the dummy label_ids. I also removed the assertion that at least one of the do_train and do_eval flags must be set to True, and now it runs fine with just do_predict set to True.
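A relaxed version of that flag check might look roughly like the sketch below. validate_run_modes is a hypothetical helper, not a function from the script; it only illustrates accepting do_predict on its own while still rejecting a run with no mode selected.

```python
def validate_run_modes(do_train: bool, do_eval: bool, do_predict: bool) -> None:
    """Raise ValueError unless at least one run mode is enabled."""
    if not (do_train or do_eval or do_predict):
        raise ValueError(
            "At least one of do_train, do_eval or do_predict must be True.")


# Prediction-only run: passes without raising.
validate_run_modes(do_train=False, do_eval=False, do_predict=True)
```

Keeping the check, rather than deleting it outright, still catches a run where every mode was left as False.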
And regarding the PS, I still think it's not intuitive :)
The predict operation was allowed. Reference here: https://github.com/kyzhouhzau/BERT-NER/blob/d190281df54263b84cedde570a5b7a90019538b7/BERT_NER.py#L659