khituras closed this 5 years ago
Jain (yes and no), as we Germans say: it is at least not a bug. There are no components of the respective type in the pipeline, hence the empty array. I never implemented logic to omit the file completely, and I also think it is actually nice to see explicitly that there are no components instead of wondering whether the file just got lost. On 17 July 2019, 18:14 +0200, Michel Oleynik notifications@github.com wrote:
@michelole commented on this pull request. In uima/extra-to-xmi-db-pipeline/cmDescriptions.json:
@@ -0,0 +1 @@ +[] ditto
Short question: does it change the synonyms as well, i.e. should we re-run experiments?
It shouldn't change the synonyms. Changes made: the CUI was added, which is excluded when reading the synonyms, and the file was de-duplicated case-sensitively, since the previous file contained duplicate synonyms. However, the de-duplication was already done in Java anyway through the use of sets.
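To illustrate why the synonyms should stay the same, here is a minimal sketch of reading one synset line. It is not the actual provider code: the tab separator, the CUI sitting in the first column, the class name `SynsetLineSketch`, the method `readSynonyms`, and the placeholder CUI `C0000001` are all assumptions made for the example. It only shows the two points from the comment above: the CUI is skipped, and a set removes exact (case-sensitive) duplicates.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class SynsetLineSketch {

    // Assumption: one synset per line, tab-separated, CUI in the first column.
    // The real file format may differ; this only illustrates the idea.
    static Set<String> readSynonyms(String line) {
        String[] fields = line.split("\t");
        // A set drops exact duplicates; comparison is case-sensitive,
        // so "term a" and "Term A" are kept as two distinct synonyms.
        Set<String> synonyms = new LinkedHashSet<>();
        // Start at index 1 to exclude the CUI from the synonyms.
        for (int i = 1; i < fields.length; i++) {
            String synonym = fields[i].trim();
            if (!synonym.isEmpty()) {
                synonyms.add(synonym);
            }
        }
        return synonyms;
    }

    public static void main(String[] args) {
        // Made-up example line with a placeholder CUI and a repeated synonym.
        String line = "C0000001\tterm a\tTerm A\tterm a";
        System.out.println(readSynonyms(line));
    }
}
```

With a reader along these lines, duplicates in the old file and the extra CUI column in the new one would lead to the same synonym set, which is why re-running experiments should not be necessary.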
While there are a lot of changes in the branch, the main thing to note, because it will break current installations, is the format change of the UMLS synset provider. The script has been adapted to create the new format. I am mainly doing this as a PR so that no one misses the fact that a recreation of the UMLS synset is required.