Given that our Klexikon corpus comes with pre-split sentences, and its pre-processing is more extensive than what our general library should (or can) do, it makes sense to differentiate inputs that are already split at the sentence level.
One caveat is that we will have to design this in a way that also lets us distinguish multi-document summarization inputs.
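A minimal sketch of what such a distinction could look like. The helper name and the return labels are hypothetical, not part of the library; the assumption is that raw text arrives as a string, a pre-split document (e.g. a Klexikon article) as a list of sentence strings, and a multi-document input as a list of such lists.

```python
from typing import List, Union

def classify_input(doc: Union[str, List[str], List[List[str]]]) -> str:
    """Hypothetical helper: detect whether input still needs sentence splitting."""
    if isinstance(doc, str):
        return "raw_text"              # plain text, needs sentence splitting
    if doc and all(isinstance(item, str) for item in doc):
        return "pre_split_sentences"   # one document, already split
    if doc and all(isinstance(item, list) for item in doc):
        return "multi_document"        # several pre-split documents
    raise TypeError("Unsupported input structure")
```

Dispatching on structure like this keeps the single-document and multi-document cases apart without requiring an extra flag from the caller.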
Resolved in 9b18acb.
Unfortunately, processing is now around 3x slower, since individual sentences still have to be run through spaCy for lemmatization.