Open zufuliu opened 2 weeks ago
It's probably worth improving out-of-range behaviour to be more reasonable. That makes it easier to treat the end of the document the same as other positions. For the string-returning GetRange and GetRangeLowered, restricting the end to the end of the document should be OK.
std::string LexAccessor::GetRangeLowered(Sci_PositionU startPos_, Sci_PositionU endPos_) {
	const Sci_PositionU endRange = std::min(endPos_, static_cast<Sci_PositionU>(lenDoc));
	assert(startPos_ < endRange);
	const Sci_PositionU len = endRange - startPos_;
	std::string s(len, '\0');
	GetRangeLowered(startPos_, endRange, s.data(), len + 1);
	return s;
}
For the char-buffer writing versions, filling the array with NUL then retrieving as much as possible may be OK.
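A minimal standalone sketch of that idea, assuming a plain std::string_view in place of the document: the helper name CopyRangeClamped and its parameters are hypothetical and are not part of LexAccessor. It zero-fills the caller's buffer first, then clamps the end to both the buffer capacity and the document length before copying.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for a char-buffer GetRange: copies [start, end) of doc
// into s, where len is the capacity of s including the terminating NUL.
inline void CopyRangeClamped(std::string_view doc, std::size_t start, std::size_t end,
	char *s, std::size_t len) {
	assert(s != nullptr && len != 0);
	// Fill the whole array with NUL so the result is always terminated.
	std::memset(s, '\0', len);
	// Clamp the end to the buffer capacity and to the end of the document.
	end = std::min({end, start + len - 1, doc.size()});
	if (start >= end) {
		return; // start is at or past the clamped end: nothing to copy
	}
	std::memcpy(s, doc.data() + start, end - start);
}
```

With this shape, a start position at or beyond the end of the document simply yields an empty string instead of reading out of range.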
or any XML processing instruction with xml prefix, so removed IsASpace().
From https://www.w3.org/TR/xml11/#sec-pi, xml-prefixed processing instruction targets are reserved:
The target names "XML", "xml", and so on are reserved for standardization in this or future versions of this specification.
Currently there is <?xml-stylesheet ?>, see https://www.w3.org/TR/xml-stylesheet/
<?xml-stylesheet href="common.css"?>
<?xml-stylesheet href="default.css" title="Default style"?>
<?xml-stylesheet alternate="yes" href="alt.css" title="Alternative style"?>
<?xml-stylesheet href="single-col.css" media="all and (max-width: 30em)"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Example with xml-stylesheet processing instructions</title>
</head>
<body>
...
</body>
</html>
I think it (and other xml-prefixed instructions) should be handled the same as <?xml version="1.0" encoding="utf-8"?>, so IsASpace() can be removed.
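As a rough illustration of treating every xml-prefixed target alike, here is a small sketch that only tests the prefix, case-insensitively, with no whitespace check. The name HasXmlPrefix is hypothetical and the function works on a plain std::string_view rather than on the lexer's Accessor.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical helper: true when the text following "<?" begins with "xml"
// in any letter case, so "xml", "XML" and "xml-stylesheet" all match.
inline bool HasXmlPrefix(std::string_view target) noexcept {
	constexpr std::string_view prefix = "xml";
	if (target.size() < prefix.size()) {
		return false;
	}
	for (std::size_t i = 0; i < prefix.size(); i++) {
		// Cast to unsigned char before tolower to avoid undefined behaviour
		// for negative char values.
		if (std::tolower(static_cast<unsigned char>(target[i])) != prefix[i]) {
			return false;
		}
	}
	return true;
}
```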
For the char-buffer writing versions, filling the array with NUL then retrieving as much as possible may be OK.
Changes like the following? It will do some cheap redundant work already done for the string-returning versions.
@@ -32,7 +32,9 @@ bool LexAccessor::MatchIgnoreCase(Sci_Position pos, const char *s) {
void LexAccessor::GetRange(Sci_PositionU startPos_, Sci_PositionU endPos_, char *s, Sci_PositionU len) {
assert(s);
assert(startPos_ <= endPos_ && len != 0);
+ memset(s, '\0', len);
endPos_ = std::min(endPos_, startPos_ + len - 1);
+ endPos_ = std::min(endPos_, static_cast<Sci_PositionU>(lenDoc));
len = endPos_ - startPos_;
if (startPos_ >= static_cast<Sci_PositionU>(startPos) && endPos_ <= static_cast<Sci_PositionU>(endPos)) {
const char * const p = buf + (startPos_ - startPos);
@@ -40,7 +42,6 @@ void LexAccessor::GetRange(Sci_PositionU startPos_, Sci_PositionU endPos_, char
} else {
pAccess->GetCharRange(s, startPos_, len);
}
- s[len] = '\0';
}
void LexAccessor::GetRangeLowered(Sci_PositionU startPos_, Sci_PositionU endPos_, char *s, Sci_PositionU len) {
Changed all four functions to const and truncated endPos_ to lenDoc.
Not going to change segIsScriptingIndicator(), as there is no test for space before xml (the if (!IsASpace(s[t])) block is not reachable in any existing test), though the Contains(s, "xml") block can be optimized to avoid the second find().
The PrintScriptingIndicatorOffset-0828.patch changes for PrintScriptingIndicatorOffset() are safe and simpler than the original code.
StyleContext::GetCurrent() can also be marked as const.
The Contains(s, "xml") block can be optimized to avoid the second find():
if (Contains(s, "php"))
	return eScriptPHP;
{
	const size_t xml = s.find("xml");
	if (xml != std::string::npos) {
		for (size_t t = 0; t < xml; t++) {
			if (!IsASpace(s[t])) {
				return prevValue;
			}
		}
		return eScriptXML;
	}
}
@@ -103,7 +103,7 @@ script_type segIsScriptingIndicator(const Accessor &styler, Sci_PositionU start,
return eScriptJS;
if (Contains(s, "php"))
return eScriptPHP;
- if (Contains(s, "xml")) {
+ {
const size_t xml = s.find("xml");
if (xml != std::string::npos) {
for (size_t t = 0; t < xml; t++) {
@@ -111,8 +111,8 @@ script_type segIsScriptingIndicator(const Accessor &styler, Sci_PositionU start,
return prevValue;
}
}
+ return eScriptXML;
}
- return eScriptXML;
}
return prevValue;
it seems better to move Contains(s, "php") and Contains(s, "xml") cases to a new function, e.g.:
Something like the following; not sure whether it is worth the duplication (keeping segIsScriptingIndicator() unchanged).
@@ -117,6 +117,16 @@ script_type segIsScriptingIndicator(const Accessor &styler, Sci_PositionU start,
return prevValue;
}
+script_type segIsScriptInstruction(Accessor &styler, Sci_PositionU start, bool isXml) {
+ if (styler.MatchIgnoreCase(start, "php")) {
+ return eScriptPHP;
+ }
+ if (isXml || styler.MatchIgnoreCase(start, "xml")) {
+ return eScriptXML;
+ }
+ return eScriptPHP;
+}
+
int PrintScriptingIndicatorOffset(Accessor &styler, Sci_PositionU start) {
return styler.MatchIgnoreCase(start, "php") ? 3 : 0;
}
@@ -1492,7 +1502,7 @@ void SCI_METHOD LexerHTML::Lex(Sci_PositionU startPos, Sci_Position length, int
// handle the start of PHP pre-processor = Non-HTML
else if ((ch == '<') && (chNext == '?') && IsPHPEntryState(state) && IsPHPStart(allowPHP, styler, i)) {
beforeLanguage = scriptLanguage;
- scriptLanguage = segIsScriptingIndicator(styler, i + 2, i + 6, isXml ? eScriptXML : eScriptPHP);
+ scriptLanguage = segIsScriptInstruction(styler, i + 2, isXml);
if ((scriptLanguage != eScriptPHP) && (isStringState(state) || (state==SCE_H_COMMENT))) continue;
styler.ColourTo(i - 1, StateToPrint);
beforePreProc = state;
if ((scriptLanguage != eScriptPHP) && (isStringState(state) || (state==SCE_H_COMMENT))) continue;
needs an extra fix for <?xml or a non-preprocessor inside a string or comment.
PrintScriptingIndicatorOffset-0828.patch
First place is
PrintScriptingIndicatorOffset(styler, styler.GetStartSegment() + 2, i + 6);
It can be fixed by changing PrintScriptingIndicatorOffset() to the following:
segIsScriptingIndicator-0828.patch
Second place is
scriptLanguage = segIsScriptingIndicator(styler, i + 2, i + 6, isXml ? eScriptXML : eScriptPHP);
As it only makes sense to handle <?php and <?xml at this position (the rest of segIsScriptingIndicator() is used to detect the script language from the language or type attribute value), it seems better to move the Contains(s, "php") and Contains(s, "xml") cases to a new function, e.g.:
I don't know the purpose of checking for space after xml, nor of any XML processing instruction with an xml prefix, so I removed IsASpace().
<script language="php"></script> was removed in PHP 7 (see https://wiki.php.net/rfc/remove_alternative_php_tags), so Contains(s, "php") can be removed from the old segIsScriptingIndicator().