semi colons in underlying data

Yeah, I realized this while manually editing the 0-cases in "check_segments". There are actually more problems: I also found instances of " ~ " being used as a separator between words.

The general problem is that it is close to impossible to find all the characters that people use to separate two words in their entries. Sometimes it's "/", sometimes it's "~", sometimes it's just a space, sometimes they put stuff in brackets, etc.
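To make the problem concrete, here is a minimal sketch of how such entries could be split automatically. It is not the code used to prepare the data: the function name `split_variants` and the separator set are assumptions based only on the cases listed above, and a plain space is deliberately not treated as a separator because it is ambiguous.

```python
import re

# Hypothetical separator set, taken from the cases mentioned above.
# A plain space is left out on purpose: it may or may not separate words.
SEPARATORS = re.compile(r"\s*[;/~]\s*")

# Material in round or square brackets, including the space before it.
BRACKETS = re.compile(r"\s*[\(\[][^\)\]]*[\)\]]")


def split_variants(entry):
    """Return the candidate word forms found in one raw entry."""
    # Drop bracketed material first, then split on the known separators.
    cleaned = BRACKETS.sub("", entry)
    parts = SEPARATORS.split(cleaned)
    # Discard empty pieces left over by leading or trailing separators.
    return [part.strip() for part in parts if part.strip()]
```

Any separator that is not in the set passes through unsplit, which is exactly why a manual check is still needed.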

Actually, I don't know why I missed splitting the segments containing a semi-colon in the first place. That was quite a long time ago, when I first prepared the data, even before the app was running. I just checked the entries: only 39 of them contain a semi-colon, which is probably why this was missed.
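A check like this could be sketched as follows. The row layout (a list of dicts) and the helper name `find_entries_with` are assumptions for illustration; only the "original_entry" column name comes from the data itself.

```python
def find_entries_with(rows, char, column="original_entry"):
    """Return all rows whose `column` value contains `char`.

    `rows` is assumed to be a list of dicts, one per entry.
    """
    return [row for row in rows if char in (row.get(column) or "")]


# Made-up rows, only to show the call; these are not real entries.
example_rows = [
    {"original_entry": "ma; na"},
    {"original_entry": "pa"},
    {"original_entry": None},
]
hits = find_entries_with(example_rows, ";")
print(len(hits), "entries contain a semi-colon")
```

Running the same check with "~" or "/" would give a rough idea of how many entries each of the other separators affects.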

I just corrected all of these entries manually, choosing one of the two possible variant words in each case. You can find the cases I edited by looking for "1 @ lingulist" in "check_segments".

With this manual correction, we may lose a few interesting word forms for now, but since it is only 39 cases, it should be no problem to re-insert them manually from the "original_entry" column if that turns out to be necessary later.

Right now, it is probably more important to get the variation out of the data. We can re-introduce it once we can handle it.

Please re-open this issue if you don't agree with my decision.

digling / burmish

semi colons in underlying data #16