Closed jimhester closed 6 years ago
Yeah, it basically assumes the the input is coming from crayon
, which is balanced, unless you use start
and finish
. But yeah, we do need a better parser.
See also #6
Doesn't look easily adaptable but https://github.com/theZiz/aha has a C implementation.
@jimhester It is easier to fix the current parser. Or write another one from scratch. IMO.
FWIW I am in the process of doing just that with fansi. I'll let you know if I make progress there. Hoping to get something useful in the next few weeks.
@gaborcsardi: In the tibble case, input comes from crayon.
@krlmlr how so?
Nested styles. This is with f27619b, current master is broken in a different way, PR follows.
$ R --debugger valgrind -q -e 'g <- crayon::green; b <- crayon::blue; ansistrings::ansi_to_html(paste0(g("1"), g(b("1")), g(b(g("1"))), g(b(g(b("1"))))))'
==16146== Memcheck, a memory error detector
==16146== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==16146== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==16146== Command: /usr/lib/R/bin/exec/R -q -e g~+~\<-~+~crayon::green;~+~b~+~\<-~+~crayon::blue;~+~ansistrings::ansi_to_html(paste0(g("1"),~+~g(b("1")),~+~g(b(g("1"))),~+~g(b(g(b("1"))))))
==16146==
==16146== Conditional jump or move depends on uninitialised value(s)
==16146== at 0x58051A7: __wcsnlen_avx2 (strlen-avx2.S:261)
==16146== by 0x5739CAA: wcsrtombs (wcsrtombs.c:104)
==16146== by 0x56C1740: wcstombs (wcstombs.c:34)
==16146== by 0x504CF62: wcstombs (stdlib.h:154)
==16146== by 0x504CF62: tre_parse_bracket_items (tre-parse.c:325)
==16146== by 0x504CF62: tre_parse_bracket (tre-parse.c:442)
==16146== by 0x504CF62: tre_parse (tre-parse.c:1369)
==16146== by 0x5044A15: tre_compile (tre-compile.c:1920)
==16146== by 0x50421D9: tre_regcompb (regcomp.c:150)
==16146== by 0x4F69A3B: do_gsub (grep.c:1831)
==16146== by 0x4F3D427: bcEval (eval.c:6473)
==16146== by 0x4F4CEBF: Rf_eval (eval.c:624)
==16146== by 0x4F4F2CE: R_execClosure (eval.c:1614)
==16146== by 0x4F4D085: Rf_eval (eval.c:747)
==16146== by 0x4F510D5: do_set (eval.c:2585)
==16146==
==16146== Conditional jump or move depends on uninitialised value(s)
==16146== at 0x56AB7F3: internal_utf8_loop (loop.c:298)
==16146== by 0x56AB7F3: __gconv_transform_internal_utf8 (skeleton.c:609)
==16146== by 0x5739CDD: wcsrtombs (wcsrtombs.c:110)
==16146== by 0x56C1740: wcstombs (wcstombs.c:34)
==16146== by 0x504CF62: wcstombs (stdlib.h:154)
==16146== by 0x504CF62: tre_parse_bracket_items (tre-parse.c:325)
==16146== by 0x504CF62: tre_parse_bracket (tre-parse.c:442)
==16146== by 0x504CF62: tre_parse (tre-parse.c:1369)
==16146== by 0x5044A15: tre_compile (tre-compile.c:1920)
==16146== by 0x50421D9: tre_regcompb (regcomp.c:150)
==16146== by 0x4F69A3B: do_gsub (grep.c:1831)
==16146== by 0x4F3D427: bcEval (eval.c:6473)
==16146== by 0x4F4CEBF: Rf_eval (eval.c:624)
==16146== by 0x4F4F2CE: R_execClosure (eval.c:1614)
==16146== by 0x4F4D085: Rf_eval (eval.c:747)
==16146== by 0x4F510D5: do_set (eval.c:2585)
==16146==
==16146== Conditional jump or move depends on uninitialised value(s)
==16146== at 0x56AB7FC: internal_utf8_loop (loop.c:303)
==16146== by 0x56AB7FC: __gconv_transform_internal_utf8 (skeleton.c:609)
==16146== by 0x5739CDD: wcsrtombs (wcsrtombs.c:110)
==16146== by 0x56C1740: wcstombs (wcstombs.c:34)
==16146== by 0x504CF62: wcstombs (stdlib.h:154)
==16146== by 0x504CF62: tre_parse_bracket_items (tre-parse.c:325)
==16146== by 0x504CF62: tre_parse_bracket (tre-parse.c:442)
==16146== by 0x504CF62: tre_parse (tre-parse.c:1369)
==16146== by 0x5044A15: tre_compile (tre-compile.c:1920)
==16146== by 0x50421D9: tre_regcompb (regcomp.c:150)
==16146== by 0x4F69A3B: do_gsub (grep.c:1831)
==16146== by 0x4F3D427: bcEval (eval.c:6473)
==16146== by 0x4F4CEBF: Rf_eval (eval.c:624)
==16146== by 0x4F4F2CE: R_execClosure (eval.c:1614)
==16146== by 0x4F4D085: Rf_eval (eval.c:747)
==16146== by 0x4F510D5: do_set (eval.c:2585)
==16146==
==16146== Conditional jump or move depends on uninitialised value(s)
==16146== at 0x56AB843: internal_utf8_loop (loop.c:298)
==16146== by 0x56AB843: __gconv_transform_internal_utf8 (skeleton.c:609)
==16146== by 0x5739CDD: wcsrtombs (wcsrtombs.c:110)
==16146== by 0x56C1740: wcstombs (wcstombs.c:34)
==16146== by 0x504CF62: wcstombs (stdlib.h:154)
==16146== by 0x504CF62: tre_parse_bracket_items (tre-parse.c:325)
==16146== by 0x504CF62: tre_parse_bracket (tre-parse.c:442)
==16146== by 0x504CF62: tre_parse (tre-parse.c:1369)
==16146== by 0x5044A15: tre_compile (tre-compile.c:1920)
==16146== by 0x50421D9: tre_regcompb (regcomp.c:150)
==16146== by 0x4F69A3B: do_gsub (grep.c:1831)
==16146== by 0x4F3D427: bcEval (eval.c:6473)
==16146== by 0x4F4CEBF: Rf_eval (eval.c:624)
==16146== by 0x4F4F2CE: R_execClosure (eval.c:1614)
==16146== by 0x4F4D085: Rf_eval (eval.c:747)
==16146== by 0x4F510D5: do_set (eval.c:2585)
==16146==
> g <- crayon::green; b <- crayon::blue; ansistrings::ansi_to_html(paste0(g("1"), g(b("1")), g(b(g("1"))), g(b(g(b("1"))))))
==16146== Use of uninitialised value of size 8
==16146== at 0x4052640: ansistrings_make_ansi_map1 (map.c:177)
==16146== by 0x4F0E839: R_doDotCall (dotcode.c:596)
==16146== by 0x4F0EDC6: do_dotcall (dotcode.c:1252)
==16146== by 0x4F4D4DC: Rf_eval (eval.c:728)
==16146== by 0x4F4FF68: do_begin (eval.c:2192)
==16146== by 0x4F4D2BB: Rf_eval (eval.c:700)
==16146== by 0x4F4F2CE: R_execClosure (eval.c:1614)
==16146== by 0x4F4D085: Rf_eval (eval.c:747)
==16146== by 0x4FE8618: R_DispatchOrEvalSP (subset.c:623)
==16146== by 0x4FE8618: do_subset3 (subset.c:1208)
==16146== by 0x4F4D2BB: Rf_eval (eval.c:700)
==16146== by 0x4F510D5: do_set (eval.c:2585)
==16146== by 0x4F4D2BB: Rf_eval (eval.c:700)
==16146==
==16146== Invalid read of size 4
==16146== at 0x4052640: ansistrings_make_ansi_map1 (map.c:177)
==16146== by 0x4F0E839: R_doDotCall (dotcode.c:596)
==16146== by 0x4F0EDC6: do_dotcall (dotcode.c:1252)
==16146== by 0x4F4D4DC: Rf_eval (eval.c:728)
==16146== by 0x4F4FF68: do_begin (eval.c:2192)
==16146== by 0x4F4D2BB: Rf_eval (eval.c:700)
==16146== by 0x4F4F2CE: R_execClosure (eval.c:1614)
==16146== by 0x4F4D085: Rf_eval (eval.c:747)
==16146== by 0x4FE8618: R_DispatchOrEvalSP (subset.c:623)
==16146== by 0x4FE8618: do_subset3 (subset.c:1208)
==16146== by 0x4F4D2BB: Rf_eval (eval.c:700)
==16146== by 0x4F510D5: do_set (eval.c:2585)
==16146== by 0x4F4D2BB: Rf_eval (eval.c:700)
==16146== Address 0x120ad3264 is not stack'd, malloc'd or (recently) free'd
==16146==
*** caught segfault ***
address 0x120ad3264, cause 'memory not mapped'
Traceback:
1: .Call("ansistrings_make_ansi_map1", str, re$.match$start[[1]], re$.match$end[[1]], re$.match$match[[1]], re$start$start[[1]], re$start$match[[1]], re$end$match[[1]], PACKAGE = "ansistrings")
2: make_ansi_map1(str)
3: FUN(X[[i]], ...)
4: vapply(text, ansi_to_html1, character(1), USE.NAMES = FALSE)
5: ansistrings::ansi_to_html(paste0(g("1"), g(b("1")), g(b(g("1"))), g(b(g(b("1"))))))
An irrecoverable exception occurred. R is aborting now ...
==16146==
==16146== Process terminating with default action of signal 11 (SIGSEGV)
==16146== at 0x5478FD9: raise (raise.c:46)
==16146== by 0x547914F: ??? (in /lib/x86_64-linux-gnu/libpthread-2.26.so)
==16146== by 0x405263F: ansistrings_make_ansi_map1 (map.c:177)
==16146==
==16146== HEAP SUMMARY:
==16146== in use at exit: 59,827,236 bytes in 25,092 blocks
==16146== total heap usage: 51,602 allocs, 26,510 frees, 117,642,371 bytes allocated
==16146==
==16146== LEAK SUMMARY:
==16146== definitely lost: 0 bytes in 0 blocks
==16146== indirectly lost: 0 bytes in 0 blocks
==16146== possibly lost: 0 bytes in 0 blocks
==16146== still reachable: 59,827,236 bytes in 25,092 blocks
==16146== of which reachable via heuristic:
==16146== newarray : 4,264 bytes in 1 blocks
==16146== suppressed: 0 bytes in 0 blocks
==16146== Rerun with --leak-check=full to see details of leaked memory
==16146==
==16146== For counts of detected and suppressed errors, rerun with: -v
==16146== Use --track-origins=yes to see where uninitialised values come from
==16146== ERROR SUMMARY: 1443 errors from 6 contexts (suppressed: 0 from 0)
Segmentation fault
Interestingly, I don't see this behavior in #18.
Wontfix, use fansi please.
It looks to me like the active array is not being initialized properly in this case.