currently not shown, because their structure is different in three important ways:
their chunks contain one token per line; a token may be annotated, i.e. followed by a space and annotations such as POS tags
their chunks include a sentence map that contains the positions and lengths of sentences, but no value; the last sentence may extend beyond the end of the chunk, indicating that the sentence continues into the next chunk (where the remaining part of the sentence is marked as the first sentence)
all positions and lengths (in sentence, article, and date maps) are given in tokens, not bytes
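
The three differences above can be illustrated with a minimal Python sketch. It is not the actual reader for this format: the function names, the in-memory form of the sentence map (a list of (position, length) pairs), the single space as annotation separator, and the sample data are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical in-memory token: (surface form, list of annotations).
Token = Tuple[str, List[str]]


def parse_tokens(chunk_text: str) -> List[Token]:
    """Read one token per line; anything after the first space is
    treated as annotations (assumed to be space-separated)."""
    tokens: List[Token] = []
    for line in chunk_text.split("\n"):
        if not line:
            continue  # skip empty lines, e.g. after a trailing newline
        form, *annotations = line.split(" ")
        tokens.append((form, annotations))
    return tokens


def split_sentences(
    tokens: List[Token], sentence_map: List[Tuple[int, int]]
) -> Tuple[List[List[Token]], bool]:
    """Slice the tokens by (position, length) pairs counted in tokens,
    not bytes. The flag is True when a sentence extends beyond the end
    of the chunk, i.e. it continues into the next chunk."""
    sentences: List[List[Token]] = []
    continues = False
    for position, length in sentence_map:
        end = position + length
        if end > len(tokens):
            continues = True
        sentences.append(tokens[position:end])
    return sentences, continues


# Made-up sample chunk: six annotated tokens, two sentences.
chunk = "The DET\ncat NOUN\nsleeps VERB\n. PUNCT\nIt PRON\nwas VERB"
tokens = parse_tokens(chunk)
# The second sentence claims five tokens, but only two are in this chunk.
sentences, continues = split_sentences(tokens, [(0, 4), (4, 5)])
```

In this sample the second sentence is truncated to the two tokens present in the chunk and `continues` is set, so a caller would know to join it with the first sentence of the next chunk.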