Closed matzekuh closed 8 months ago
There isn't a lot for us to go on here. The C++ stacktrace seems to be absent. If you can get us a stacktrace from a debug build that would be great, if you can isolate a small piece of php that will cause this crash that would be even better. Thanks!
I installed hhvm-nightly-dbg. The problem only appears when loading the DokuWiki syntax page (doku.php?id=wiki:syntax). I will try to figure out what exactly happens before hhvm crashes.
Stacktrace:
Host: ####
ProcessID: 30201
ThreadID: 7f525f7ff700
ThreadPID: 30216
Name: unknown program
Type: Segmentation fault
Runtime: hhvm
Version: heads/master-0-g163b2627434f48963ba334d90f668ffbc96e067e
DebuggerCount: 0
Server: ####
ThreadType: Web Request
Server_SERVER_NAME: ####
URL: /wiki/doku.php?id=wiki:syntax
# 0 bt_handler at /tmp/tmp.58XxoBHyUx/hphp/runtime/base/crash-reporter.cpp:71
doku.php:
<?php
/**
* DokuWiki mainscript
*
* @license GPL 2 (http://www.gnu.org/licenses/gpl.html)
* @author Andreas Gohr <andi@splitbrain.org>
*
* @global Input $INPUT
*/
// update message version
$updateVersion = 46.2;
// xdebug_start_profiling();
if(!defined('DOKU_INC')) define('DOKU_INC', dirname(__FILE__).'/');
if(isset($_SERVER['HTTP_X_DOKUWIKI_DO'])) {
$ACT = trim(strtolower($_SERVER['HTTP_X_DOKUWIKI_DO']));
} elseif(!empty($_REQUEST['idx'])) {
$ACT = 'index';
} elseif(isset($_REQUEST['do'])) {
$ACT = $_REQUEST['do'];
} else {
$ACT = 'show';
}
// load and initialize the core system
require_once(DOKU_INC.'inc/init.php');
//import variables
$INPUT->set('id', str_replace("\xC2\xAD", '', $INPUT->str('id'))); //soft-hyphen
$QUERY = trim($INPUT->str('id'));
$ID = getID();
$REV = $INPUT->int('rev');
$DATE_AT = $INPUT->str('at');
$IDX = $INPUT->str('idx');
$DATE = $INPUT->int('date');
$RANGE = $INPUT->str('range');
$HIGH = $INPUT->param('s');
if(empty($HIGH)) $HIGH = getGoogleQuery();
if($INPUT->post->has('wikitext')) {
$TEXT = cleanText($INPUT->post->str('wikitext'));
}
$PRE = cleanText(substr($INPUT->post->str('prefix'), 0, -1));
$SUF = cleanText($INPUT->post->str('suffix'));
$SUM = $INPUT->post->str('summary');
//parse DATE_AT
if($DATE_AT) {
$date_parse = strtotime($DATE_AT);
if($date_parse) {
$DATE_AT = $date_parse;
} else { // check for UNIX Timestamp
$date_parse = @date('Ymd',$DATE_AT);
if(!$date_parse || $date_parse === '19700101') {
msg(sprintf($lang['unable_to_parse_date'], $DATE_AT));
$DATE_AT = null;
}
}
}
//check for existing $REV related to $DATE_AT
if($DATE_AT) {
$pagelog = new PageChangeLog($ID);
$rev_t = $pagelog->getLastRevisionAt($DATE_AT);
if($rev_t === '') { //current revision
$REV = null;
$DATE_AT = null;
} else if ($rev_t === false) { //page did not exist
$rev_n = $pagelog->getRelativeRevision($DATE_AT,+1);
msg(sprintf($lang['page_nonexist_rev'],
strftime($conf['dformat'],$DATE_AT),
wl($ID, array('rev' => $rev_n)),
strftime($conf['dformat'],$rev_n)));
$REV = $DATE_AT; //will result in a page not exists message
} else {
$REV = $rev_t;
}
}
//make infos about the selected page available
$INFO = pageinfo();
//export minimal info to JS, plugins can add more
$JSINFO['id'] = $ID;
$JSINFO['namespace'] = (string) $INFO['namespace'];
// handle debugging
if($conf['allowdebug'] && $ACT == 'debug') {
html_debug();
exit;
}
//send 404 for missing pages if configured or ID has special meaning to bots
if(!$INFO['exists'] &&
($conf['send404'] || preg_match('/^(robots\.txt|sitemap\.xml(\.gz)?|favicon\.ico|crossdomain\.xml)$/', $ID)) &&
($ACT == 'show' || (!is_array($ACT) && substr($ACT, 0, 7) == 'export_'))
) {
header('HTTP/1.0 404 Not Found');
}
//prepare breadcrumbs (initialize a static var)
if($conf['breadcrumbs']) breadcrumbs();
// check upstream
checkUpdateMessages();
$tmp = array(); // No event data
trigger_event('DOKUWIKI_STARTED', $tmp);
//close session
session_write_close();
//do the work (picks up what to do from global env)
act_dispatch();
$tmp = array(); // No event data
trigger_event('DOKUWIKI_DONE', $tmp);
// xdebug_dump_function_profile(1);
?>
Dokuwiki input that causes hhvm to crash:
<HTML>
This is some <span style="color:red;font-size:150%;">inline HTML</span>
</HTML>
Information about dokuwiki syntax: https://www.dokuwiki.org/wiki:syntax#embedding_html_and_php
The output should be a code block showing the html code between the tags with highlighting, as HTML/PHP-Embedding is not allowed in the configuration.
The same code in <html></html> tags also causes hhvm to crash.
Stacktrace:
Host: ####
ProcessID: 33427
ThreadID: 7fec87fff700
ThreadPID: 33432
Name: unknown program
Type: Segmentation fault
Runtime: hhvm
Version: heads/master-0-g163b2627434f48963ba334d90f668ffbc96e067e
DebuggerCount: 0
Server: ####
ThreadType: Web Request
Server_SERVER_NAME: ####
URL: /test/dokuwiki/doku.php?id=start
# 0 bt_handler at /tmp/tmp.58XxoBHyUx/hphp/runtime/base/crash-reporter.cpp:71
PHP Stacktrace:
#0 GeSHi->parse_non_string_part( <<|UR1|"http://december<DOT>com/html/4/element/span<DOT>html"><|/2/>span|></a> style=) called at [/var/www/service/test/dokuwiki/inc/geshi.php:2568]
#1 GeSHi->parse_code() called at [/var/www/service/test/dokuwiki/inc/parserutils.php:726]
#2 p_xhtml_cached_geshi(This is some <span style="color:red;font-size:150%;">inline HTML</span>, html4strict, pre) called at [/var/www/service/test/dokuwiki/inc/parser/xhtml.php:543]
#3 Doku_Renderer_xhtml->html(
This is some <span style="color:red;font-size:150%;">inline HTML</span>
, pre) called at [/var/www/service/test/dokuwiki/inc/parser/xhtml.php:555]
#4 Doku_Renderer_xhtml->htmlblock(
This is some <span style="color:red;font-size:150%;">inline HTML</span>
) called at [/var/www/service/test/dokuwiki/inc/parserutils.php:607]
#5 p_render(xhtml, Array, ) called at [/var/www/service/test/dokuwiki/inc/parserutils.php:113]
#6 p_cached_output(/var/www/service/test/dokuwiki/data/pages/start.txt, xhtml, start) called at [/var/www/service/test/dokuwiki/inc/parserutils.php:76]
#7 p_wiki_xhtml(start, 0, 1, ) called at [/var/www/service/test/dokuwiki/inc/html.php:246]
#8 html_show() called at [/var/www/service/test/dokuwiki/inc/template.php:105]
#9 tpl_content_core() called at [/var/www/service/test/dokuwiki/inc/events.php:108]
#10 Doku_Event->trigger(tpl_content_core, 1) called at [/var/www/service/test/dokuwiki/inc/events.php:231]
#11 trigger_event(TPL_ACT_RENDER, show, tpl_content_core) called at [/var/www/service/test/dokuwiki/inc/template.php:82]
#12 tpl_content() called at [/var/www/service/test/dokuwiki/lib/tpl/dokuwiki/main.php:59]
#13 include(/var/www/service/test/dokuwiki/lib/tpl/dokuwiki/main.php) called at [/var/www/service/test/dokuwiki/inc/actions.php:206]
#14 act_dispatch() called at [/var/www/service/test/dokuwiki/doku.php:119]
I'm almost sure that there are more syntax snippets that cause hhvm to crash, but I haven't managed to find all of them yet.
Further testing showed that the error is caused not by the <html></html> or <HTML></HTML> tags themselves but by their content. For example, the following code does not cause a crash:
<html>
<p style="border:2px dashed red;">And this is some block HTML</p>
</html>
<HTML>
<p style="border:2px dashed red;">And this is some block HTML</p>
</HTML>
EDIT: I tried the <html></html> block with different HTML tags. It looks like <span> is the only tag that causes an error.
I suspect this may be related to #4108 but again, I don't really have a way of testing this. Generally when we're looking for an isolated example of a crash we're hoping for something short and in a single file that won't involve running an entire framework.
Is that the full C++ backtrace? There should be more lines than bt_handler.
@fredemmott Yes, that is the full backtrace. I tried several times and the output did not change.
Is there any status update on this issue? I just ran into the exact same problem with hhvm 3.9.1 and dokuwiki 2015-08-10.
I spent some time investigating further. HHVM crashes in the following part of GeSHi->parse_non_string_part():
foreach (array_keys($this->language_data['KEYWORDS']) as $k) {
    // {...}
    //NEW in 1.0.8, the cached regexp list
    // since we don't want PHP / PCRE to crash due to too large patterns we split them into smaller chunks
    for ($set = 0, $set_length = count($this->language_data['CACHED_KEYWORD_LISTS'][$k]); $set < $set_length; ++$set) {
        $keywordset =& $this->language_data['CACHED_KEYWORD_LISTS'][$k][$set];
        // Might make a more unique string for putting the number in soon
        // Basically, we don't put the styles in yet because then the styles themselves will
        // get highlighted if the language has a CSS keyword in it (like CSS, for example ;))
        $stuff_to_parse = preg_replace_callback(
            "/$disallowed_before_local({$keywordset})(?!\<DOT\>(?:htm|php|aspx?))$disallowed_after_local/$modifiers",
            array($this, 'handle_keyword_replace'),
            $stuff_to_parse
        );
    }
}
If I comment out the $stuff_to_parse = preg_replace_callback(...) call, everything runs smoothly.
Edit: OK, it seems like HHVM runs into a recursion or something like that:
If I use the html4strict profile of GeSHi (the one causing the problem in DokuWiki's case) and parse a string $string="string", the HTML output generated by this function is:
<pre class="html4strict" style="font-family:monospace;">string</pre>
For any other HTML tag:
<pre class="html4strict" style="font-family:monospace;"><span style="color: #009900;"><<span style="color: #66cc66;">/</span><a href="http://december.com/html/4/element/table.html"><span style="color: #000000; font-weight: bold;">table</span></a>></span></pre>
If I try to parse "<span>", "<pre>" or "</span>", HHVM crashes with a segfault (though "</pre>" does work, and I don't understand why).
If I now comment out the preg_replace_callback() call, the result for parsing "</span>" is:
<pre class="html4strict" style="font-family:monospace;"><span style="color: #009900;"><<span style="color: #66cc66;">/</span>span></span></pre>
This seems somehow logical as the keywords "span" & "pre" are defined by the $language_data: https://github.com/GeSHi/geshi-1.0/blob/master/src/geshi/html4strict.php
Hope this is helpful...
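For anyone who wants to try this outside DokuWiki, the shape of that call can be sketched in a single file. This is only a rough stand-in, not GeSHi's code: the class KeywordPass, the method handleKeywordReplace, the two keyword chunks and the simplified lookarounds used for $disallowed_before_local / $disallowed_after_local are all made up for illustration, and whether it actually triggers the segfault on HHVM is untested.

```php
<?php
// Hypothetical stand-in for GeSHi's keyword pass; every name and pattern here is illustrative.
class KeywordPass
{
    // Keyword chunks loosely modelled on CACHED_KEYWORD_LISTS (assumed subset).
    private $keywordSets = array('span|pre', 'table');

    public function run($stuffToParse)
    {
        $modifiers = 'i'; // CASE_SENSITIVE is false for the keyword groups quoted in this thread
        // Simplified placeholders for $disallowed_before_local / $disallowed_after_local.
        $before = '(?<![a-zA-Z0-9_])';
        $after  = '(?![a-zA-Z0-9_])';

        foreach ($this->keywordSets as $keywordset) {
            // Same call shape as in parse_non_string_part(): object-method callback, one chunk per pass.
            $stuffToParse = preg_replace_callback(
                "/$before({$keywordset})$after/$modifiers",
                array($this, 'handleKeywordReplace'),
                $stuffToParse
            );
        }
        return $stuffToParse;
    }

    // Wraps the matched keyword in a marker instead of emitting styles.
    public function handleKeywordReplace($matches)
    {
        return '<|K|' . $matches[1] . '|>';
    }
}

$pass = new KeywordPass();
foreach (array('<span>', '</span>', '<pre>', '</pre>', 'string') as $input) {
    echo $input, ' => ', $pass->run($input), "\n";
}
```

If this runs cleanly under the affected HHVM build, the next step would be to swap in the real keyword list and lookarounds from html4strict.php one piece at a time until the crash reappears.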
Just a random thought: is /e included in the $modifiers passed to the call to preg_replace_callback()?
'CASE_SENSITIVE' => array(
GESHI_COMMENTS => false,
2 => false,
3 => false,
),
$case_sensitive = $this->language_data['CASE_SENSITIVE'][$k];
$modifiers = $case_sensitive ? '' : 'i';
echo $modifiers."\n";
This prints (once per loop iteration): i i
The function seems to work fine during the first run of the foreach loop. During the second run the function fails.
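The modifier check can also be run in isolation. A minimal sketch, assuming only keyword groups 2 and 3 from the CASE_SENSITIVE map quoted above; the helper keyword_modifiers() and the $report array are hypothetical names, not part of GeSHi:

```php
<?php
// Illustrative subset of the CASE_SENSITIVE map quoted above (keyword groups 2 and 3 only).
$caseSensitive = array(2 => false, 3 => false);

// Hypothetical helper: derives the modifier string the same way the quoted snippet does.
function keyword_modifiers($isCaseSensitive)
{
    return $isCaseSensitive ? '' : 'i';
}

$report = array();
foreach ($caseSensitive as $group => $flag) {
    $modifiers = keyword_modifiers($flag);
    $report[$group] = array(
        'modifiers' => $modifiers,
        // True only if the deprecated /e (eval) modifier is present.
        'has_e'     => strpos($modifiers, 'e') !== false,
    );
    echo "group $group: modifiers='$modifiers' has /e: ",
        ($report[$group]['has_e'] ? 'yes' : 'no'), "\n";
}
```

Since the modifier string is built as either '' or 'i', /e cannot end up in it through this path.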
Closing as there has been no progress in 8 years. If you or anyone reading this can manage to reproduce the segfault with a known input (and regex), please file a new issue.