rhdunn closed 10 months ago
ERROR: Sentence answers-20111107175720AAlb2TB_ans-0015 token 17 -- RB lemma 'basically' is not the lowercase form 'basically' text
This word contains a special character (a soft hyphen) which was presumably inserted by software: https://github.com/UniversalDependencies/UD_English-EWT/issues/83#issuecomment-1003266767
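A minimal sketch of how such an invisible character can be found, since the two strings in the error message look identical when printed. The helper name `invisible_chars` and the position of the soft hyphen inside the word are assumptions for illustration; the thread does not say where in the token it occurs.

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"


def invisible_chars(form):
    """Return (index, Unicode name) for every format-category (Cf) character in form."""
    return [
        (i, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(form)
        if unicodedata.category(ch) == "Cf"
    ]


# Hypothetical token: the real position of the soft hyphen is not given in the thread.
form = "basi" + SOFT_HYPHEN + "cally"

print(form == "basically")                           # False, despite looking the same
print(invisible_chars(form))                         # [(4, 'SOFT HYPHEN')]
print(form.replace(SOFT_HYPHEN, "") == "basically")  # True once the character is removed
```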
ERROR: Sentence reviews-037794-0003 token 1 -- RB lemma 'definitely' is not the lowercase form 'Def' text
ERROR: Sentence reviews-018548-0002 token 5 -- RB lemma 'definitely' is not the lowercase form 'deffly' text
ERROR: Sentence reviews-018548-0004 token 12 -- RB lemma 'probably' is not the lowercase form 'prolly' text
These are Abbr=Yes. I think I'll add Style=Slng (arguably the speaker is trying to sound "hip") and CorrectForm.
ERROR: Sentence reviews-042012-0006 token 3 -- RB lemma 'forever' is not the lowercase form '4-ever' text
Does this deserve some sort of Style? I can't tell if it's meant to be cutesy (Style=Expr) or is just an abbreviation.
ERROR: Sentence newsgroup-groups.google.com_RagnarokOnlineII_5730bc7888fcee99_ENG_20051122_035600-0003 token 20 -- RB lemma 'pretty' is not the lowercase form 'preety' text
Made this Style=Expr, though it could just be a typo.
I'm happy for 4-ever to remain an Abbr=Yes. That case is missing a CorrectForm annotation, so my validator does not know it is an abbreviation for "forever". Other abbreviations that are not initialisms (BRB, PS, AD, etc.) have a CorrectForm annotation, e.g. "Sept.".
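The check described here can be sketched roughly as follows. This is not the actual validator from the thread: the function names are hypothetical, and it assumes CorrectForm is stored in the CoNLL-U MISC column as a `Key=Value` pair. The error text imitates the messages quoted above.

```python
def parse_pairs(column):
    """Parse a CoNLL-U 'A=1|B=2' column into a dict; '_' means empty."""
    if not column or column == "_":
        return {}
    return dict(item.split("=", 1) for item in column.split("|") if "=" in item)


def check_rb_lemma(form, lemma, misc="_"):
    """Return None if the RB lemma is acceptable, otherwise an error message.

    Assumption: the lemma is compared against the lowercased CorrectForm when
    that annotation is present, and against the lowercased token text otherwise.
    """
    expected = parse_pairs(misc).get("CorrectForm", form).lower()
    if lemma == expected:
        return None
    return f"RB lemma '{lemma}' is not the lowercase form '{form}' text"


# Without CorrectForm the validator cannot connect '4-ever' to 'forever'.
print(check_rb_lemma("4-ever", "forever"))
# With CorrectForm present, the same token passes.
print(check_rb_lemma("4-ever", "forever", "CorrectForm=forever"))
```

Under these assumptions the first call reports the same kind of error shown in the thread and the second returns `None`, which is why adding CorrectForm silences the check.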
The following are missing Style and CorrectForm annotations, flagged by RB lemma validation checks: