Open Ansa211 opened 7 years ago
I think that ids should not be tested during import. They should be tested during conversion from PML to SQL.
The SQL and PML ids must be the same because printserver and Suggest both need them to match. There should be some other tool, such as an XSLT template, that fixes the ids in the PML files first. Only then should the conversion be run.
Non-unique ids have a number of unexpected bad effects:
http://hdl.handle.net/11346/PMLTQ-RCZU
- the Suggest function outputs pretty random stuff
- some nodes are highlighted when they should not be: http://hdl.handle.net/11346/PMLTQ-LWKE (there is a `$noun` node falsely highlighted in the sentence with the `$verb`)

- [ ] test uniqueness of ids during import of new corpora
- [ ] if ids are not unique, update them to "filename-id" while importing the treebank to SQL
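
To make the two checklist items concrete, here is a minimal sketch of the check and the "filename-id" renaming. It is not the actual import code: the function names `find_duplicate_ids` and `make_ids_unique` are hypothetical, and it assumes the node ids have already been extracted into a plain mapping from file name to list of ids.

```python
from collections import Counter


def find_duplicate_ids(ids_by_file):
    """Return the set of ids that occur more than once across all files."""
    counts = Counter(
        node_id for ids in ids_by_file.values() for node_id in ids
    )
    return {node_id for node_id, n in counts.items() if n > 1}


def make_ids_unique(ids_by_file):
    """Rewrite every id as "filename-id" if any id is duplicated.

    All ids are rewritten, not only the clashing ones, so the naming
    scheme stays uniform within the treebank (an assumption of this sketch).
    """
    if not find_duplicate_ids(ids_by_file):
        return ids_by_file
    return {
        filename: [f"{filename}-{node_id}" for node_id in ids]
        for filename, ids in ids_by_file.items()
    }


# Hypothetical corpus: the id "n1" is used in two different files.
corpus = {"a.pml": ["n1", "n2"], "b.pml": ["n1", "n3"]}
duplicates = find_duplicate_ids(corpus)
fixed = make_ids_unique(corpus)
```

Prefixing with the file name only resolves clashes between files. An id repeated inside a single file would still be duplicated after the rewrite. As the comment above points out, the same renaming would also have to be applied to the PML files, otherwise the SQL and PML ids would no longer match.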