Compact corpora: fix handling of file names with spaces

Compact corpus index entries were read using any whitespace as a field delimiter. This breaks for entries whose file name contains spaces, such as

50 gr geroosterd wit en zwart sesamzaad.p.1.s.1.xml     A       1e

Here 'gr' is interpreted as the offset and 'geroosterd' as the size. Since both are also valid base64 strings, they are accepted without error and produce garbage offsets and sizes. Fix this by using only tabs as delimiters.
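
The difference can be illustrated with a minimal Python sketch. This is not the Alpino code: the function names are hypothetical, the field order (name, offset, size) is taken from the description above, and the example entry is assumed to separate its three fields with single tabs. The offset and size are left as undecoded strings, since the decoding step is not shown here.

```python
def parse_entry_whitespace(line):
    """Old behaviour as described: split on any run of whitespace."""
    fields = line.split()
    # The first three tokens are taken as name, offset and size,
    # which is wrong as soon as the name itself contains spaces.
    return fields[0], fields[1], fields[2]


def parse_entry_tabs(line):
    """Fixed behaviour: only tabs separate the three fields."""
    name, offset, size = line.rstrip("\n").split("\t")
    return name, offset, size


# Hypothetical index line; the tab positions are an assumption.
line = "50 gr geroosterd wit en zwart sesamzaad.p.1.s.1.xml\tA\t1e\n"

print(parse_entry_whitespace(line))
print(parse_entry_tabs(line))
```

With whitespace splitting, the name is truncated to '50' and the next two words of the file name end up in the offset and size fields. With tab splitting, the full file name survives and 'A' and '1e' are read as offset and size.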

rug-compling/Alpino#2