denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License
94.05k stars, 5.23k forks

BUG: `deno doc --lint` crash with non-ASCII ts code #23875

Open secext2022 opened 4 months ago

secext2022 commented 4 months ago

Version: Deno 1.43.4

Steps to reproduce:

Source code (t.ts):

// 插件描述文件 `pmimp.json`
export interface 插件描述 {
  pmim_version: string;
  插件信息: {
    名称: string;
    描述: string;
    版本: string;
    URL: string;
  };
  默认启用?: number;

  双拼方案?: string;
  键盘布局?: string;

  皮肤?: {
    入口: string;
    名称: string;
    能力: Array<string>;
  };

  服务?: {
    入口: string;
  };
}

Command and crash log:

> env RUST_BACKTRACE=1 ./deno doc --lint t.ts
error[missing-jsdoc]: exported symbol is missing JSDoc documentation
 --> /home/tmp/deno-crash-20240518/t.ts:2:1
  | 
2 | export interface 插件描述 {
  | ^

error[missing-jsdoc]: exported symbol is missing JSDoc documentation
 --> /home/tmp/deno-crash-20240518/t.ts:3:3
  | 
3 |   pmim_version: string;
  |   ^

============================================================
Deno has panicked. This is a bug in Deno. Please report this
at https://github.com/denoland/deno/issues/new.
If you can reliably reproduce this panic, include the
reproduction steps and re-run with the RUST_BACKTRACE=1 env
var set and include the backtrace in your report.

Platform: linux x86_64
Version: 1.43.4
Args: ["./deno", "doc", "--lint", "t.ts"]

thread 'main' panicked at /home/runner/.cargo/registry/src/index.crates.io-6f17d22bba15001f/dprint-swc-ext-0.16.0/src/common/text_info.rs:190:21:
byte index 94 is not a char boundary; it is inside '插' (bytes 93..96) of `// 插件描述文件 `pmimp.json`
export interface 插件描述 {
  pmim_version: string;
  插件信息: {
    名称: string;
    描述: string;
    版本: string;
    URL: string;
  };
  默认启用?: number;

  双拼方案?: string;
  键盘布局`[...]
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::str::slice_error_fail_rt
   3: core::str::slice_error_fail
   4: dprint_swc_ext::common::text_info::SourceTextInfo::range_text
   5: deno_ast::diagnostics::print_snippet
   6: <deno_ast::diagnostics::DiagnosticDisplay<T> as core::fmt::Display>::fmt
   7: core::fmt::write
   8: core::fmt::write
   9: std::io::Write::write_fmt
  10: deno::util::logger::init::{{closure}}
  11: <env_logger::Logger as log::Log>::log::{{closure}}
  12: <env_logger::Logger as log::Log>::log
  13: log::__private_api::log_impl
  14: deno::tools::doc::doc::{{closure}}
  15: deno::spawn_subcommand::{{closure}}
  16: <deno_unsync::task::MaskFutureAsSend<F> as core::future::future::Future>::poll
  17: tokio::runtime::task::raw::poll
  18: deno::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

MujahedSafaa commented 2 months ago

Hello @marvinhagemeister, could you please confirm whether this issue should be resolved by allowing non-ASCII characters?

marvinhagemeister commented 2 months ago

Deno should never panic, so yes.

MujahedSafaa commented 2 months ago

Hi @marvinhagemeister,

I've investigated this issue. The code in question (screenshot attached) extracts a substring from self.text_str() using the start and end positions of a SourceRange. The panic occurs because the string is sliced at byte indices that are not valid UTF-8 character boundaries. Rust strings are UTF-8 encoded, and slicing inside a multi-byte character panics. The issue originates in the external Rust library dprint-swc-ext 0.16.0.
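To make the boundary problem concrete, here is a minimal sketch using only the Rust standard library. `SRC` reproduces the first four lines of t.ts from the report; `checked_range_text` is a hypothetical name chosen for illustration and is not part of dprint-swc-ext or deno_ast. It shows that byte 94 falls inside the three-byte character '插' (bytes 93..96), and that `str::get` reports a bad range as `None` where indexing would panic.

```rust
/// First four lines of `t.ts` from the report, enough to reach byte 94.
const SRC: &str = "// 插件描述文件 `pmimp.json`\nexport interface 插件描述 {\n  pmim_version: string;\n  插件信息: {\n";

/// Hypothetical helper (not a dprint-swc-ext API): returns the text in
/// `start..end`, or `None` if the range is out of bounds or either offset
/// falls inside a multi-byte character.
fn checked_range_text(text: &str, start: usize, end: usize) -> Option<&str> {
    // `str::get` performs the same boundary checks as `&text[start..end]`
    // but returns `None` instead of panicking.
    text.get(start..end)
}

/// True when `index` is a valid place to cut `text`.
fn is_valid_cut(text: &str, index: usize) -> bool {
    text.is_char_boundary(index)
}
```

With these definitions, `is_valid_cut(SRC, 93)` holds while `is_valid_cut(SRC, 94)` does not, which matches the offsets quoted in the panic message.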

My suggestions are as follows:

Replace the range_text method: The problematic code is called from the diagnostics.rs file in the deno-ast repository. I suggest replacing the range_text call with a new method that correctly handles non-ASCII characters.

Or add error messages: Report that non-ASCII characters are not supported, so that Deno exits with an error rather than panicking.
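As one possible shape for the first suggestion, the sketch below snaps both offsets to the nearest character boundaries before slicing, so the lookup cannot panic. All three function names are hypothetical and the code is not taken from deno_ast or dprint-swc-ext. It also only avoids the crash: if the offsets passed in are computed incorrectly upstream, the printed snippet may still be misaligned.

```rust
/// Hypothetical helper: moves `index` down to the closest char boundary.
fn snap_down(text: &str, index: usize) -> usize {
    let mut i = index.min(text.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(i) {
        i -= 1;
    }
    i
}

/// Hypothetical helper: moves `index` up to the closest char boundary.
fn snap_up(text: &str, index: usize) -> usize {
    let mut i = index.min(text.len());
    // `text.len()` is always a boundary, so this loop terminates.
    while !text.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Hypothetical panic-free range lookup: widens the requested byte range
/// to whole characters and clamps it to the text length.
fn lossy_range_text(text: &str, start: usize, end: usize) -> &str {
    let start = snap_down(text, start);
    // `max(start)` keeps the range well-formed if `end` precedes `start`.
    let end = snap_up(text, end).max(start);
    &text[start..end]
}
```

For example, with the text "ab插cd" a request for bytes 3..4, which lies entirely inside '插', yields the whole character instead of panicking.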

Please let me know if you have any other ideas or suggestions.

marvinhagemeister commented 2 months ago

Looping in @dsherret, who has more context on dprint.