Motivation
Currently the text is always lowercased, which (i) does not correspond to the "original" JBT implementation and (ii) may cause misunderstandings with Chris, who often prefers to keep the original case:
https://github.com/uhh-lt/josimtext/blob/master/src/main/scala/de/uhh/lt/jst/dt/Text2TrigramTermContext.scala#L35
Implementation
Add a boolean command-line parameter that controls whether lowercasing is applied. Set this parameter to true by default (lowercase everything, as now).
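A rough sketch of what this could look like. All names here (`Config`, `parseArgs`, `normalize`, and the `--lowercase` flag) are hypothetical and not taken from the existing JoSimText code; the real job would need to wire the flag into its own argument handling.

```scala
// Hypothetical sketch: none of these names come from the JoSimText code base.
object LowercaseOption {
  // Default is true, so behaviour stays as it is now unless the flag is set.
  final case class Config(lowercase: Boolean = true)

  // Parses "--lowercase true|false"; all other arguments are ignored here.
  // Note: toBoolean throws IllegalArgumentException for other values.
  def parseArgs(args: Seq[String]): Config =
    args.sliding(2).foldLeft(Config()) {
      case (conf, Seq("--lowercase", value)) => conf.copy(lowercase = value.toBoolean)
      case (conf, _)                         => conf
    }

  // Lowercases the text only when the option is enabled.
  def normalize(text: String, conf: Config): String =
    if (conf.lowercase) text.toLowerCase else text
}
```

With no arguments the text is lowercased as before; passing `--lowercase false` keeps the original case.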