I notice that this treebank does not have annotations for the original whitespace (i.e. SpaceAfter=No fields).
It looks like the LDC distributions of ATB contain the original text that the treebank is based on, and there are a few cases (mostly related to punctuation and numbers) where the text doesn't put any whitespace between treebank tokens.
In case anyone is interested, I wrote a script that adds whitespace information to the CoNLL-U files based on the original text as distributed by LDC.
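
For readers who want a feel for what this involves, here is a minimal sketch of the general idea. It is not the script mentioned above. It assumes the token forms appear verbatim and in order in the raw text, which will not hold for every token in a clitic-segmented treebank, and it ignores multiword token ranges. The function names `no_space_flags` and `set_space_after` are made up for this illustration.

```python
def no_space_flags(forms, raw_text):
    """Return one bool per token: True if the raw text has no whitespace
    directly after that token (i.e. the token would get SpaceAfter=No).

    Assumption: every form occurs verbatim, in order, in raw_text.
    """
    flags = []
    pos = 0
    for form in forms:
        # Skip any whitespace that precedes the token in the raw text.
        while pos < len(raw_text) and raw_text[pos].isspace():
            pos += 1
        if not raw_text.startswith(form, pos):
            raise ValueError(f"token {form!r} not found at offset {pos}")
        pos += len(form)
        # A token at the very end of the text is treated as followed by a space.
        flags.append(pos < len(raw_text) and not raw_text[pos].isspace())
    return flags


def set_space_after(misc, no_space):
    """Return the MISC column value with SpaceAfter=No added or removed."""
    items = [] if misc == "_" else misc.split("|")
    items = [item for item in items if not item.startswith("SpaceAfter=")]
    if no_space:
        items.append("SpaceAfter=No")
    return "|".join(items) if items else "_"


if __name__ == "__main__":
    forms = ["Hello", ",", "world", "!"]
    for form, flag in zip(forms, no_space_flags(forms, "Hello, world!")):
        print(form, set_space_after("_", flag), sep="\t")
```

A real implementation would also need a fallback for tokens whose treebank form differs from the raw text, for example by aligning on the surface form of the multiword token instead of its parts.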