Closed danieldk closed 4 years ago
When removing the POS tag or the head of a conll token, the error message I get in training is
Cannot read batch: cannot parse as integer field: DET
where DET is the dependency relation of the token. Am I missing something in my test case?
Did you replace the POS tag by an underscore (_)? This error suggests that the fields have shifted and that the dependency relation is now in the head index column.
> This error suggests that the fields have shifted and that the dependency relation is now in the head index column.
Yes, that is exactly what happens.
> Did you replace the POS tag by an underscore (_)?
I tried that as well - replace either the POS or the head by an underscore - but then the program simply finishes without having processed the sentence. So there is not even an error message indicating wrong input.
On Fri, Nov 15, 2019, at 08:21, Patricia Fischer wrote:
> I tried that as well - replace either the POS or the head by an underscore - but then the program simply finishes without having processed the sentence. So there is not even an error message indicating wrong input.
Could you send that file (or perhaps just the file with that sentence)?
Not replacing by an underscore is definitely wrong, since it shifts the columns.
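To illustrate the column shift, here is a minimal Python sketch. It assumes a tab-separated, ten-column CoNLL-X layout with the head in the seventh column. The `parse_head` helper, the token line, and its lemma and tag values are made up for illustration and are not taken from the tool's reader.

```python
# Assumed CoNLL-X column order; index 6 is the head column.
FIELDS = ["id", "form", "lemma", "cpostag", "postag",
          "feats", "head", "deprel", "phead", "pdeprel"]
HEAD = FIELDS.index("head")
POSTAG = FIELDS.index("postag")


def parse_head(line):
    """Return the head index of a token line, or None if it is '_'."""
    value = line.split("\t")[HEAD]
    if value == "_":
        return None
    try:
        return int(value)
    except ValueError:
        raise ValueError(f"cannot parse as integer field: {value}") from None


# Hypothetical token line; lemma and tag values are illustrative only.
original = "\t".join(
    ["1", "Diese", "dieser", "PRON", "PDAT", "_", "2", "DET", "_", "_"]
)
fields = original.split("\t")

# Wrong: deleting the field shifts every later column one to the left,
# so the dependency relation lands in the head column.
deleted = "\t".join(fields[:POSTAG] + fields[POSTAG + 1:])

# Right: an underscore keeps the column count intact.
blanked = "\t".join(fields[:POSTAG] + ["_"] + fields[POSTAG + 1:])

for name, line in [("original", original), ("deleted", deleted), ("blanked", blanked)]:
    try:
        print(name, parse_head(line))
    except ValueError as err:
        print(name, "error:", err)
```

Only the line with the deleted field fails, because the head column then holds DET instead of a number.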
Thanks for the example! Note that you removed the coarse-grained POS tag. Coarse-grained tags are not used at all. In this case you can also remove the fine-grained tag, since 'Diese' is not the head of any token. Try removing the tag of 'lösen' and you would get:
Cannot collect sentence: Head of token 'Probleme' does not have a part-of-speech:
Diese [ Probleme ] lösen Studenten schnell .
This still produces the plain error message
Error tagging sentences: Token without a tag: lösen
when tagging, or
Cannot read batch: Token without a tag: lösen
when training, even after checking that I am on the right branch.
Do you use tag embeddings in your configuration? If so, then this is a different error: when tag embeddings are used, every token should have a tag.
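The two errors in this thread can be told apart with a small Python sketch. This is not the tool's actual logic: the `Token` class, the `check_sentence` function, the order of the checks, and the tags and head indices of the example sentence are all assumptions made for illustration. Only the error message wording is taken from the messages quoted above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    form: str
    pos: Optional[str]  # None stands for a missing tag ('_')
    head: int           # 1-based index of the head, 0 for the root


def check_sentence(tokens: List[Token], use_tag_embeddings: bool) -> None:
    """Raise ValueError for the first tag problem found (hypothetical checks)."""
    # Assumed rule: with tag embeddings, every token needs a tag.
    if use_tag_embeddings:
        for token in tokens:
            if token.pos is None:
                raise ValueError(f"Token without a tag: {token.form}")
    # Assumed rule: the head of every non-root token needs a tag.
    for token in tokens:
        if token.head == 0:
            continue
        if tokens[token.head - 1].pos is None:
            raise ValueError(
                f"Head of token '{token.form}' does not have a part-of-speech"
            )


# Example sentence with the tag of 'lösen' removed; tags and heads are made up.
sentence = [
    Token("Diese", "PDAT", 2),
    Token("Probleme", "NN", 3),
    Token("lösen", None, 0),
    Token("Studenten", "NN", 3),
    Token("schnell", "ADJD", 3),
    Token(".", "$.", 3),
]

for flag in (True, False):
    try:
        check_sentence(sentence, use_tag_embeddings=flag)
    except ValueError as err:
        print(f"use_tag_embeddings={flag}: {err}")
```

Under these assumptions, a configuration with tag embeddings reports the missing tag on 'lösen' itself, while one without reports the first dependent whose head is untagged.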
The error messages now include the sentence for which processing failed, showing the token in brackets.
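A message of that shape could be built roughly as follows. This is a Python sketch only; `highlight_token` and `format_error` are hypothetical helpers, not functions from the tool.

```python
from typing import List


def highlight_token(forms: List[str], index: int) -> str:
    """Join the forms with spaces, wrapping the form at `index` in brackets."""
    return " ".join(
        f"[ {form} ]" if i == index else form
        for i, form in enumerate(forms)
    )


def format_error(message: str, forms: List[str], index: int) -> str:
    """Put the message on the first line and the marked sentence on the second."""
    return f"{message}:\n{highlight_token(forms, index)}"


forms = ["Diese", "Probleme", "lösen", "Studenten", "schnell", "."]
print(format_error(
    "Head of token 'Probleme' does not have a part-of-speech", forms, 1
))
```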