Open JeltevanBoheemen opened 1 year ago
The metadata can be added with the function cleantext from the module cleanCHILDEStokens:
def cleantext(utt: str, repkeep: bool, tokenoutput: bool = False) -> Tuple[CleanedText, Metadata]:
Here, utt is a string with CHAT annotations (to be taken from the existing metadata associated with the parse tree). The output is a tuple consisting of the cleaned text (with all CHAT annotations applied) and a list of Metadata.
For the cases at hand, it will turn the CHAT codes [/] and [//] into appropriate metadata.
There is no need to run this specifically for this set, because metadata for all AnnCor utterances must be generated with this function anyway.
In the current parses, the following CHAT-annotated constructs are ignored:
[/]
[//]
[///]
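For reference, the CHAT manual describes these three codes as repetition ([/]), retracing ([//]) and reformulation ([///]), each applying to the preceding word or to the preceding group in angle brackets. The sketch below only illustrates how such codes and their scope could be located in an utterance string. It is not the cleanCHILDEStokens implementation; the names Marked, PATTERN and find_retracings are hypothetical, and the example utterance is made up.

```python
import re
from typing import List, NamedTuple

# Meanings of the three codes as described in the CHAT manual.
CHAT_CODES = {"[/]": "repetition", "[//]": "retracing", "[///]": "reformulation"}


class Marked(NamedTuple):
    """Hypothetical record for one marked stretch of an utterance."""
    code: str   # the CHAT code that was found
    kind: str   # its meaning
    scope: str  # the word or <...> group the code applies to


# The scope is either an angle-bracketed group or the single preceding word.
PATTERN = re.compile(r"(?:<([^<>]+)>|(\S+))\s*(\[/{1,3}\])")


def find_retracings(utt: str) -> List[Marked]:
    """Return every [/], [//] or [///] code in utt together with its scope."""
    found = []
    for m in PATTERN.finditer(utt):
        scope = m.group(1) if m.group(1) is not None else m.group(2)
        code = m.group(3)
        found.append(Marked(code, CHAT_CODES[code], scope))
    return found


if __name__ == "__main__":
    for item in find_retracings("ik [/] ik wil <de bal> [//] de auto ."):
        print(item)
```

In the actual pipeline this information should come from the Metadata returned by cleantext; the sketch is only meant to make clear which part of the utterance each code covers.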
The following steps should be implemented and executed:
top/top
node, and prepend it to the original top/top
node @begin
and @end
@id
0...n