Closed BenjaminKrautter closed 5 years ago
Update: If I use `loadMeta()` to drop all dramas without a publication date, it works as well (that leaves ~400 dramas).
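A rough sketch of that filtering step, for reference. The `meta` table here is a hand-made stand-in for the `loadMeta()` result, and both the column name `Date.Printed` and the `corpus:drama` ID format are assumptions made for illustration, not something confirmed in this thread.

```r
# Hypothetical stand-in for the table returned by loadMeta();
# the column names are assumptions made for this sketch.
meta <- data.frame(
  corpus = c("test", "test", "test"),
  drama = c("a", "b", "c"),
  Date.Printed = c(1772L, NA, 1804L),
  stringsAsFactors = FALSE
)

# Keep only the rows that have a publication date.
dated <- meta[!is.na(meta$Date.Printed), ]

# Rebuild IDs in an assumed "corpus:drama" form for the later loading call.
ids.dated <- paste(dated$corpus, dated$drama, sep = ":")
```

The resulting `ids.dated` vector would then be passed to `loadSegmentedText()` instead of the full ID list.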
The error also occurs when `loadSegmentedText()` is used to load a single drama whose `UtterancesWithTokens.csv` is empty apart from the column names, as for example here. `loadSegmentedText()` and `loadText()` could be extended so that in such a case a warning is issued and the affected IDs are skipped. What do you think, @nilsreiter?
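A minimal sketch of the proposed behaviour, assuming the check can happen right after a CSV file is read. `read_csv_or_skip()` is a hypothetical helper, not an existing DramaAnalysis function, and it uses base R's `read.csv()` rather than whatever reader the package uses internally.

```r
# Hypothetical helper (not part of DramaAnalysis): read one CSV file and
# return NULL with a warning if it has a header line but no data rows.
read_csv_or_skip <- function(path, id) {
  tab <- utils::read.csv(path, stringsAsFactors = FALSE)
  if (nrow(tab) == 0) {
    warning(sprintf("Skipping %s: %s contains no rows.", id, basename(path)),
            call. = FALSE)
    return(NULL)
  }
  tab
}

# Demonstration with two temporary files: one header-only, one with a row.
empty_file <- tempfile(fileext = ".csv")
writeLines("corpus,drama,begin,end", empty_file)
full_file <- tempfile(fileext = ".csv")
writeLines(c("corpus,drama,begin,end", "test,a,0,10"), full_file)

# The warning is suppressed here only to keep the demo output quiet.
skipped <- suppressWarnings(read_csv_or_skip(empty_file, "test:empty"))
loaded <- read_csv_or_skip(full_file, "test:a")
```

Returning `NULL` leaves it to the caller to drop the ID, so the remaining dramas can still be combined into one table.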
Yes, that would certainly make sense.
For now, this only skips those CSV files that are empty for 3.x. This means that for dramas with a broken `UtterancesWithTokens.csv` file, the other files (Segmentation, Meta, etc.) are still loaded into the QDDrama object. Do we want to skip the whole drama in these cases?
Yes, I think skipping the entire drama would be better.
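One way to express "skip the entire drama" is to check all tables of a drama before any of them is added to the result. The sketch below assumes each drama is available as a named list of data frames. That layout and the names `drama_is_complete()` and `drop_incomplete()` are hypothetical and do not reflect how the QDDrama object is actually assembled.

```r
# Hypothetical check: a drama counts as complete only if every required
# table exists and has at least one row.
drama_is_complete <- function(tables, required = c("text")) {
  all(vapply(tables[required],
             function(x) !is.null(x) && nrow(x) > 0,
             logical(1)))
}

# Drop incomplete dramas as a whole and warn about the skipped IDs.
drop_incomplete <- function(dramas, required = c("text")) {
  keep <- vapply(dramas, drama_is_complete, logical(1), required = required)
  if (any(!keep)) {
    warning("Skipping dramas with empty tables: ",
            paste(names(dramas)[!keep], collapse = ", "),
            call. = FALSE)
  }
  dramas[keep]
}

# Demo data: the second drama has a text table without rows.
dramas <- list(
  "test:good" = list(text = data.frame(begin = 0L, end = 10L),
                     meta = data.frame(title = "A")),
  "test:broken" = list(text = data.frame(begin = integer(0), end = integer(0)),
                       meta = data.frame(title = "B"))
)

# The warning is suppressed here only to keep the demo output quiet.
kept <- suppressWarnings(drop_incomplete(dramas))
```

With this approach the meta and segmentation tables of a broken drama never reach the combined object, so the tables cannot get out of sync.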
With `ids.all <- loadAllInstalledIds()` and `loadSegmentedText(ids.all)` I get the following error message:
Error in data.table::foverlaps(t, sat, type = "any", by.x = c("corpus", : The last two columns in by.x should correspond to the 'start' and 'end' intervals in data.table 'x' and must be integer/numeric type.
In addition: Warning message: In `[.data.table`(sat, is.na(Number.Scene), `:=`(Number.Scene = 0), : Coerced double RHS to character to match the type of the target column (column 8 named 'Number.Scene'). If the target column's type character is correct, it's best for efficiency to avoid the coercion and create the RHS as type character. To achieve that consider R's type postfix: typeof(0L) vs typeof(0), and typeof(NA) vs typeof(NA_integer_) vs typeof(NA_real_). You can wrap the RHS with as.character() to avoid this warning, but that will still perform the coercion. If the target column's type is not correct, it's best to revisit where the DT was created and fix the column type there; e.g., by using colClasses= in fread(). Otherwise, you can change the column type now by plonking a new column (of the desired type) over the top of it; e.g. DT[, `Number.Scene`:=as.double(`Number.Scene`)]. If the RHS of := has nrow(DT) elements then the assignment is called a column plonk and is the way to change a column's type. Column types can be observed with sapply(DT,typeof).
With `text.all.l <- lapply(ids.all, loadSegmentedText)` I get the same error:
Error in data.table::foverlaps(t, sat, type = "any", by.x = c("corpus", : The last two columns in by.x should correspond to the 'start' and 'end' intervals in data.table 'x' and must be integer/numeric type.
Both work if I load a correspondingly smaller number of dramas, e.g. via `loadSet("tragoediel")` and so on.
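Since smaller sets load fine, one way to narrow this down is to load the IDs one at a time and record which ones raise an error. The sketch below is generic: `find_failing_ids()` and `stub_loader()` are hypothetical names, and the stub only imitates a loader that fails for one ID. With DramaAnalysis installed, `loadSegmentedText` would presumably take the place of the stub.

```r
# Hypothetical helper: call `loader` on each ID separately and return
# the IDs for which it raises an error.
find_failing_ids <- function(ids, loader) {
  failed <- vapply(ids, function(id) {
    tryCatch({
      loader(id)
      FALSE
    }, error = function(e) TRUE)
  }, logical(1))
  ids[failed]
}

# Stub that imitates a loader failing for a single broken drama.
stub_loader <- function(id) {
  if (id == "test:broken") {
    stop("must be integer/numeric type")
  }
  data.frame(drama = id)
}

ids <- c("test:a", "test:broken", "test:b")
bad <- find_failing_ids(ids, stub_loader)
```

The IDs in `bad` can then be inspected individually or removed from `ids.all` before loading everything in one call.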