FrankKai / FrankKai.github.io

FE blog
https://frankkai.github.io/

All About Strings on the Front End #163

Open FrankKai opened 4 years ago

FrankKai commented 4 years ago

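A String in JS is actually more than a literal string like "foo". The array argument of the Blob constructor can contain USVString elements, and I was quite confused about what exactly a USVString is.

Besides String, there are in fact the following kinds of string as well. In day-to-day work, apart from the handy methods on String.prototype, ES6 template literals, and so on, there don't seem to be many other places where strings come up, so I won't go over those again here.

Drawing on the MDN docs and the ECMAScript specification, combined with experience from real-world development, this is a short, focused study of the topic.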

FrankKai commented 4 years ago

USVString

How to understand charset="utf-8" on <script>

The specification says:

If the script element has a charset attribute, then let encoding be the result of getting an encoding from the value of the charset attribute. If the script element does not have a charset attribute, or if getting an encoding failed, let encoding be the same as the encoding of the script element's node document.

To get an encoding from a string label, run these steps:

1. Remove any leading and trailing ASCII whitespace from label.
2. If label is an ASCII case-insensitive match for any of the labels listed in the table below, return the corresponding encoding, and failure otherwise.

Name Labels
UTF-8 "unicode-1-1-utf-8","utf-8","utf8"
UTF-16LE "utf-16","utf-16le"
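
The two quoted steps can be sketched in a few lines. This is only an illustration: `getEncoding` and `LABELS` are hypothetical names, the map holds just the two table rows shown above (the real table is much longer), and `null` stands in for the spec's "failure".

```javascript
// Hypothetical lookup table: only the rows quoted in this post.
const LABELS = new Map([
  ["unicode-1-1-utf-8", "UTF-8"],
  ["utf-8", "UTF-8"],
  ["utf8", "UTF-8"],
  ["utf-16", "UTF-16LE"],
  ["utf-16le", "UTF-16LE"],
]);

// Hypothetical helper that follows the two quoted steps.
function getEncoding(label) {
  // Step 1: strip leading and trailing ASCII whitespace (tab, LF, FF, CR, space).
  const trimmed = label.replace(/^[\t\n\f\r ]+|[\t\n\f\r ]+$/g, "");
  // Step 2: ASCII case-insensitive match, so only A-Z are lowercased.
  const key = trimmed.replace(/[A-Z]/g, (c) => c.toLowerCase());
  return LABELS.has(key) ? LABELS.get(key) : null; // null stands in for "failure"
}

console.log(getEncoding("  UTF-8 ")); // "UTF-8"
console.log(getEncoding("utf-16"));   // "UTF-16LE"
console.log(getEncoding("no-such-label")); // null
```

So `charset=" UTF-8 "` and `charset="utf8"` would both resolve to UTF-8, while an unknown label falls back to the document's encoding.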

JS has a UTF-8 encoder and a UTF-8 decoder dedicated to UTF-8 encoding and decoding.
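
In browsers and Node.js, this encoder and decoder are exposed as `TextEncoder` and `TextDecoder`. A minimal sketch, assuming a runtime that provides both as globals; the sample text "foo中" is just an illustration:

```javascript
const utf8Encoder = new TextEncoder();        // TextEncoder always produces UTF-8
const utf8Decoder = new TextDecoder("utf-8"); // label resolved to the UTF-8 decoder

const sampleText = "foo\u4e2d"; // "foo中"
const utf8Bytes = utf8Encoder.encode(sampleText); // Uint8Array of UTF-8 bytes

// "f", "o", "o" take one byte each; "中" (U+4E2D) takes three bytes: E4 B8 AD.
console.log(Array.from(utf8Bytes)); // [102, 111, 111, 228, 184, 173]

// Decoding the bytes gives back the original JS string.
console.log(utf8Decoder.decode(utf8Bytes) === sampleText); // true
```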

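Doesn't a JS String being UTF-16 encoded conflict with charset="utf-8" on <script>?

No, there is no conflict. Roughly speaking, UTF-16 is the human-friendly side and UTF-8 is the machine-friendly side, and they apply at different layers. When writing JS, string values are held as UTF-16 code units and read as ordinary, recognizable text. The script file itself is encoded as UTF-8, and end-to-end communication also uses UTF-8, which machines store and transmit efficiently.

Doesn't the script encoding also cover the UTF-16 JS strings? It does: string literals are part of the source text, so they are encoded along with it. But JS code contains more than strings; there are Boolean, Number, and many other kinds of values too, and the script encoding applies to the whole source. No conflict!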

4.3.17 String value

primitive value that is a finite ordered sequence of zero or more 16-bit unsigned integer values

NOTE A String value is a member of the String type. Each integer value in the sequence usually represents a single 16-bit unit of UTF-16 text. However, ECMAScript does not place any restrictions or requirements on the values except that they must be 16-bit unsigned integers.
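
The definition above can be observed directly from code. The sketch below uses an emoji (U+1F600) as an arbitrary example of a character outside the 16-bit range. The last part is my own reading of how this ties back to USVString, under the assumption that `TextEncoder.encode` takes its input as a USVString: a lone surrogate is a legal String value, but it is replaced with U+FFFD on the way in.

```javascript
const withEmoji = "a\u{1F600}"; // "a" followed by the emoji U+1F600

// length counts 16-bit units: 1 for "a", 2 for the emoji's surrogate pair.
console.log(withEmoji.length); // 3
console.log(withEmoji.charCodeAt(1).toString(16)); // "d83d" (high surrogate)
console.log(withEmoji.charCodeAt(2).toString(16)); // "de00" (low surrogate)

// Iterating by code point sees two characters instead of three units.
console.log([...withEmoji].length); // 2
console.log(withEmoji.codePointAt(1).toString(16)); // "1f600"

// "No restrictions on the values": a lone surrogate is still a valid String.
const loneSurrogate = "\ud83d";
console.log(loneSurrogate.length); // 1

// Assumption: encode() receives a USVString, so the lone surrogate
// becomes U+FFFD, whose UTF-8 bytes are EF BF BD.
const loneBytes = Array.from(new TextEncoder().encode(loneSurrogate));
console.log(loneBytes); // [239, 191, 189]
```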

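First clues

encodeURIComponent and decodeURIComponent give a first clue. One thing to make clear up front:

Percent-encoded URL, with the parameter values escaped as UTF-8 bytes (machine-friendly): "http://foo.test.go.com/index.html#/?from=http%3A%2F%2Fbar.crm.test.go.com%2F&redirectUrl=http%3A%2F%2Fbaz.test.go.com%2Fuser%2FgetCASUser&platformCode=10004"

URL as an ordinary UTF-16 JS string (human-friendly): "http://foo.test.go.com/index.html#/?from=http://bar.crm.test.go.com/&redirectUrl=http://baz.test.go.com/user/getCASUser&platformCode=10004"

encodeURIComponent(uriComponent) takes a UTF-16 encoded URL (that is, an ordinary JS URL string such as "https://www.foo.com?foo=123") and encodes it into the percent-encoded UTF-8 form "https%3A%2F%2Fwww.foo.com%3Ffoo%3D123". decodeURIComponent(encodedURIComponent) decodes the percent-encoded UTF-8 form back into the UTF-16 string "https://www.foo.com?foo=123". Note that decodeURI does not work for this, because it leaves the escapes of reserved characters such as %3A and %2F untouched.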
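
A quick round trip with the same URL, as a minimal sketch. The "中" line is an extra illustration of my own: it shows that the escapes really are the character's UTF-8 bytes.

```javascript
const plainUrl = "https://www.foo.com?foo=123";

// Reserved characters (":", "/", "?", "=") are percent-encoded.
const encodedUrl = encodeURIComponent(plainUrl);
console.log(encodedUrl); // "https%3A%2F%2Fwww.foo.com%3Ffoo%3D123"

// decodeURIComponent restores the original JS string.
console.log(decodeURIComponent(encodedUrl) === plainUrl); // true

// decodeURI keeps the escapes of reserved characters, so nothing changes here.
console.log(decodeURI(encodedUrl) === encodedUrl); // true

// Non-ASCII text is escaped as its UTF-8 bytes: U+4E2D is E4 B8 AD.
console.log(encodeURIComponent("\u4e2d")); // "%E4%B8%AD"
```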

Why not use encodeURI?

Because:

Note that encodeURI by itself cannot form proper HTTP GET and POST requests, such as for XMLHTTPRequests, because "&", "+", and "=" are not encoded, which are treated as special characters in GET and POST requests. encodeURIComponent, however, does encode these characters.

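In other words, encodeURI cannot produce a URL that is safe to use in an HTTP GET or POST request, because it leaves the following characters unescaped: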

encodeURI Not Escaped: A-Z a-z 0-9 ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #
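
The difference shows up as soon as a parameter value is itself a URL with its own query string. In this sketch the `?a=1&b=2` part is a made-up addition to the redirect URL from the example earlier in this post, and the built-in `URL` class is used only to check how the result gets parsed.

```javascript
// Hypothetical parameter value: the redirect URL with an invented query string.
const redirectTarget = "http://baz.test.go.com/user/getCASUser?a=1&b=2";
const pageBase = "http://foo.test.go.com/index.html?redirectUrl=";

// encodeURI leaves ":", "/", "?", "&" and "=" alone, so the value is unchanged.
console.log(encodeURI(redirectTarget) === redirectTarget); // true

// encodeURIComponent escapes all of them.
const safeValue = encodeURIComponent(redirectTarget);
console.log(safeValue);
// "http%3A%2F%2Fbaz.test.go.com%2Fuser%2FgetCASUser%3Fa%3D1%26b%3D2"

// With encodeURI, the inner "&" splits the value into separate parameters.
const brokenPage = new URL(pageBase + encodeURI(redirectTarget) + "&platformCode=10004");
console.log(brokenPage.searchParams.get("redirectUrl"));
// "http://baz.test.go.com/user/getCASUser?a=1"

// With encodeURIComponent, the whole value survives the round trip.
const workingPage = new URL(pageBase + safeValue + "&platformCode=10004");
console.log(workingPage.searchParams.get("redirectUrl") === redirectTarget); // true
```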

Recap

FrankKai commented 4 years ago

Summary