andreblanke closed this issue 4 years ago
Wow, thanks for reporting this. I even had problems coming up with a test to reproduce it since this only seems to happen when the first message contains a time plus several lines (so thanks for doing the hard work of narrowing it down to the reprex you posted). It should work now:
rwhatsapp::rwa_read(x = c("08.02.20, 17:35 - First Last: The time is 17:36.",
"2nd line.",
"3rd line.",
"08.02.20, 17:35 - First Last: The time is 17:36.",
"2nd line."))
#> # A tibble: 2 x 6
#> time author text source emoji emoji_name
#> <dttm> <fct> <chr> <chr> <lis> <list>
#> 1 2020-02-08 17:35:26 First L~ "The time is 17:36.\n2~ text in~ <NUL~ <NULL>
#> 2 2020-02-08 17:35:26 First L~ "The time is 17:36.\n2~ text in~ <NUL~ <NULL>
Created on 2020-02-08 by the reprex package (v0.3.0)
Thanks a lot for the quick fix. I thought all the other issues in my data set would also stem from this misbehavior; however, it seems there are more situations in which the existing regex is a bit sensitive. I'll file a separate issue for those.
The author of a message seems to be incorrectly reported as NA if the message text contains both a colon (:) and two or more line breaks. The following should be a minimal reproducible example:
example.zip
chat0.txt
chat1.txt
test.Rmd