Open t-lini opened 6 years ago
Possible solution: introduce functionality to divide a text into a user-defined number of equal-sized chunks on a token basis, then add the relevant chunk number to each row (token) of the mtext data frame. Other solutions would be to compute chunks on the basis of sentences (which would require automatic sentence segmentation) or of individual figure speeches (which might be tricky because speeches differ greatly in length).
> Possible solution: introduce functionality to divide a text into a user-defined number of equal-sized chunks on a token basis, then add the relevant chunk number to each row (token) of the mtext data frame.
This is simple. Example for segmentation into 100 chunks of equal size:
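# Token-level data frame of the example play (one row per token)
t <- rksp.0$mtext
# Bin the row indices into 100 equal-width intervals; labels = FALSE returns the bin number
t[["chunk"]] <- cut(seq_along(t$Token.surface), 100, labels = FALSE)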
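The same idea can be wrapped in a small helper so the number of chunks becomes a parameter. This is only a sketch: `add_chunks` and the toy data frame are hypothetical names made up for illustration, and only the `Token.surface` column is taken from the snippet above. Note that `cut()` needs at least 2 intervals when `breaks` is a single number.

```r
# Hypothetical helper (not part of the package): add a "chunk" column that
# assigns each token (row) to one of n roughly equal-sized segments.
add_chunks <- function(mtext, n = 100) {
  # cut() requires at least 2 intervals; more chunks than tokens makes no sense
  stopifnot(is.data.frame(mtext), n >= 2, n <= nrow(mtext))
  mtext[["chunk"]] <- cut(seq_len(nrow(mtext)), breaks = n, labels = FALSE)
  mtext
}

# Toy stand-in for rksp.0$mtext (assumption: one row per token)
toy <- data.frame(
  Token.surface = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j"),
  stringsAsFactors = FALSE
)

chunked <- add_chunks(toy, n = 3)
table(chunked$chunk)  # number of tokens per chunk
```

When the number of tokens is not a multiple of `n`, the chunks cannot all be exactly the same size, so the counts printed by `table()` are worth checking before comparing chunks with each other.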
Implementation for 2.x: overwrite the "act" column with the chunk numbers.
A different way of representing the temporal dimension of a drama: map the text of the drama onto some concept of scaled time that is independent of its segmentation.
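One possible reading of "scaled time", offered as an assumption rather than a settled design, is the relative position of each token in the text, running from 0 (first token) to 1 (last token). The helper name `add_position` and the toy data frame below are hypothetical; only `Token.surface` comes from the earlier snippet.

```r
# Hypothetical helper: add a "position" column in [0, 1] that gives each
# token's relative position in the text, independent of acts or scenes.
add_position <- function(mtext) {
  n <- nrow(mtext)
  stopifnot(is.data.frame(mtext), n >= 2)
  mtext[["position"]] <- (seq_len(n) - 1) / (n - 1)
  mtext
}

# Toy stand-in for a play's mtext data frame (assumption: one row per token)
toy_play <- data.frame(
  Token.surface = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

scaled <- add_position(toy_play)

# Chunks can still be derived afterwards, e.g. tenths of the play
scaled[["tenth"]] <- cut(scaled$position, breaks = seq(0, 1, by = 0.1),
                         include.lowest = TRUE, labels = FALSE)
```

Because the position is continuous, plays of different lengths end up on the same scale, and any chunking (tenths, hundredths) becomes a choice made at analysis time rather than something stored in the data.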