ablaette closed this issue 2 years ago
I also encountered this bug with `inmemory = FALSE`. The not very informative error message is:
Error in rbindlist(dts) :
Internal error: column 3 of result is determined to be integer64 but maxType=='character' != REALSXP
Passing explicit `colClasses` to the `data.table::fread()` worker helps!
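A minimal sketch of the idea, assuming a simplified two-column layout (token id, token) and made-up chunk contents; the actual worker and its column layout are not shown in this thread. Declaring the column types up front means every chunk comes back with the same types before the chunks are combined:

```r
library(data.table)

# Hypothetical chunks of tab-separated output, one character vector per chunk.
# The first chunk contains only number-like tokens.
chunks <- list(
  c("1\t-9458", "2\t12"),
  c("1\tHello", "2\tworld")
)

# Read every chunk with the same declared column types instead of letting
# fread() guess them per chunk.
dts <- lapply(chunks, function(lines) {
  fread(
    text = lines, sep = "\t", header = FALSE,
    colClasses = c("integer", "character")
  )
})

# All chunks now agree on column types, so they can be combined.
dt <- rbindlist(dts)
class(dt$V2)
```

With the types fixed, the token column stays a character column in every chunk, regardless of what the tokens look like.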
This has been fixed. The fix also brings an unexpected performance improvement when processing the data!
This documents an imminent bug fix. While annotating a corpus, I ran into an error.
The error results from a piece of text that is the character vector `"-9458"` and from the way the CoNLL output it yields is parsed. The chosen approach to parsing the CoNLL output is the `read.table()` function, which guesses the vector type of each column. In this case the second column is read as an integer vector, which causes the error later on.

This is easy to fix with the `colClasses` argument.
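The type guessing and the fix can be illustrated with a small sketch. The two-column fragment below (token id, token) is a hypothetical simplification; the real CoNLL output has more columns, and `colClasses` would need to list a type for each of them:

```r
# Hypothetical CoNLL-like fragment: token id and token, tab-separated.
conll <- "1\t-9458"

# Without colClasses, read.table() guesses the column type from the content.
guessed <- read.table(text = conll, sep = "\t", header = FALSE)
class(guessed$V2)  # "integer"

# With colClasses, the token column is kept as character.
fixed <- read.table(
  text = conll, sep = "\t", header = FALSE,
  colClasses = c("integer", "character")
)
class(fixed$V2)    # "character"
```

Because the types are declared rather than guessed, the result no longer depends on what the tokens in a particular text happen to look like.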