averbis / IRuta

A UIMA Ruta kernel for Jupyter Notebook.
Apache License 2.0

Evaluation mode is slow for many files #11

Open DavidHuebner opened 3 years ago

DavidHuebner commented 3 years ago

Is your feature request related to a problem? Please describe.
The ncbi_annotated.zip archive contains around 900 files, each with gold-standard diagnoses and predicted diagnoses. The following IRuta notebook cell uses this information to run an evaluation:

%displayMode EVALUATION
%evalTypes TrueDiagnosis
%loadTypeSystem data/typesystem/typesystem.xml
%inputDir data/gold_test/
PredictedDiagnosis{->TrueDiagnosis};

The cell iterates over the files quickly, but generating the evaluation results afterwards can take up to 30 seconds, which is annoying when quick iterations are the goal.

Describe the solution you'd like
I would like the evaluation module to be faster.

Describe alternatives you've considered
In some use cases, I am not interested in the per-document results. If generating them is the cause of the slow performance, a flag to disable them would be useful.
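To illustrate the alternative above, here is a minimal Python sketch of what "aggregate only" evaluation could look like. It is not IRuta's implementation and does not use any IRuta or UIMA Ruta API: the names `EvalCounts`, `evaluate_document`, `evaluate_corpus`, and the `per_document` flag are hypothetical, and annotations are assumed to be comparable as `(begin, end)` offset pairs with exact matching.

```python
from dataclasses import dataclass


@dataclass
class EvalCounts:
    """True positive, false positive, and false negative counts."""
    tp: int = 0
    fp: int = 0
    fn: int = 0

    def add(self, other: "EvalCounts") -> None:
        # Accumulate another document's counts into this one.
        self.tp += other.tp
        self.fp += other.fp
        self.fn += other.fn

    @property
    def precision(self) -> float:
        denom = self.tp + self.fp
        return self.tp / denom if denom else 0.0

    @property
    def recall(self) -> float:
        denom = self.tp + self.fn
        return self.tp / denom if denom else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


def evaluate_document(gold: set, predicted: set) -> EvalCounts:
    """Compare two sets of (begin, end) spans using exact matching."""
    return EvalCounts(
        tp=len(gold & predicted),
        fp=len(predicted - gold),
        fn=len(gold - predicted),
    )


def evaluate_corpus(docs: dict, per_document: bool = True):
    """Return corpus-level counts, plus per-document counts if requested.

    `per_document` is the hypothetical flag: when False, only three
    integers are kept in memory, regardless of the number of files.
    """
    total = EvalCounts()
    details = {} if per_document else None
    for name, (gold, predicted) in docs.items():
        counts = evaluate_document(gold, predicted)
        total.add(counts)
        if details is not None:
            details[name] = counts
    return total, details


# Hypothetical toy corpus: file name -> (gold spans, predicted spans).
docs = {
    "a.xmi": ({(0, 5), (10, 15)}, {(0, 5), (20, 25)}),
    "b.xmi": ({(3, 8)}, {(3, 8)}),
}
total, details = evaluate_corpus(docs, per_document=False)
print(total.tp, total.fp, total.fn, round(total.f1, 3))
```

Whether skipping the per-document results would actually remove the 30-second delay depends on where IRuta spends that time, which this sketch does not measure.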

Additional context
IRuta version 0.2.0