nginx / njs

A subset of JavaScript language to use in nginx
http://nginx.org/en/docs/njs/
BSD 2-Clause "Simplified" License
1.24k stars 155 forks source link

Segmentation with NJS_UNICODE_ERROR = 0x1FFFFF #435

Closed crasyangel closed 3 years ago

crasyangel commented 3 years ago

njs version: 0.7.0

#0  0x00000000008ee5b0 in njs_utf8_encode ()
#1  0x000000000089d8d0 in njs_string_prototype_to_lower_case ()
#2  0x00000000008b37b6 in njs_function_native_call ()
#3  0x00000000008b3df8 in njs_function_frame_invoke ()
#4  0x0000000000899648 in njs_vmcode_interpreter ()
#5  0x00000000008b3d72 in njs_function_lambda_call ()
#6  0x00000000008b3df1 in njs_function_frame_invoke ()
#7  0x0000000000895840 in njs_vm_invoke ()
#8  0x000000000089586d in njs_vm_call ()
#9  0x00000000005ba9d4 in ngx_js_call ()
stringA is 'kpg,\367ebt',  note '\367' is just a oct, not unicode
decodeURIComponent(stringA).toLowerCase()

Should fail when decodeURIComponent with error: "malformed URI", but actually not. njs_string_decode_uri->njs_string_decode_uri_cp(expect_percent is zero)->njs_utf8_decode->NJS_UNICODE_ERROR, not 0xFFFFFFFF, and cannot goto uri_error

In previous versions, njs_utf8_decode2/njs_utf8_decode return 0xFFFFFFFF when error. Why change to NJS_UNICODE_ERROR/0x1FFFFF

xeioex commented 3 years ago

@crasyangel ,

Thanks for the report.

Feel free to test the following patch: https://gist.github.com/xeioex/e62bc5667f10bf4ca27282cb2e94ac50

Please, also note that the problem may occur only with byte strings. We are in the process of deprecating them. Byte strings may contain invalid UTF-8 sequences. And in JS, according the spec, this is invalid string. This is why some methods may fail, because they expect valid UTF-8 sequences.

At this moment (0.7.0) the only way to create them is through njs_vm_value_string_set(), njs_vm_value_string_alloc() APIs (this API is used in r.variables, r.requestText).

If you need a type for arbitrary byte sequences use Buffer or Uint8Array. There are also Request Buffer properties like r.rawVariables or r.requestBuffer.

crasyangel commented 3 years ago

@crasyangel ,

Thanks for the report.

Feel free to test the following patch: https://gist.github.com/xeioex/e62bc5667f10bf4ca27282cb2e94ac50

Please, also note that the problem may occur only with byte strings. We are in the process of deprecating them. Byte strings may contain invalid UTF-8 sequences. And in JS, according the spec, this is invalid string. This is why some methods may fail, because they expect valid UTF-8 sequences.

At this moment (0.7.0) the only way to create them is through njs_vm_value_string_set(), njs_vm_value_string_alloc() APIs (this API is used in r.variables, r.requestText).

If you need a type for arbitrary byte sequences use Buffer or Uint8Array. There are also Request Buffer properties like r.rawVariables or r.requestBuffer.

Good Patch. Problem Solved