swc-project / swc

Rust-based platform for the Web
https://swc.rs
Apache License 2.0

Comments around jsx omitted while printing #9561

Open oosawy opened 2 weeks ago

oosawy commented 2 weeks ago

Describe the bug

Some comments around JSX are omitted when printing a program that has just been parsed. This does not seem to happen in the playground, which transforms the JSX.

Input:

function C() {
    return /*#FOO*/<>{/*#BAR*/}hello world</> /*#BAZ*/ ;
}

Printed:

function C() {
    return <>{}hello world</> /*#BAZ*/ ;
}

Reproduction:

use swc::config::SourceMapsConfig;
use swc::{Compiler, PrintArgs};
use swc_common::comments::SingleThreadedComments;
use swc_common::sync::Lrc;
use swc_common::FileName;
use swc_common::{
    errors::{ColorConfig, Handler},
    SourceMap,
};
use swc_ecma_ast::EsVersion;
use swc_ecma_parser::TsSyntax;
use swc_ecma_parser::{lexer::Lexer, Parser, StringInput, Syntax};

fn main() {
    let cm: Lrc<SourceMap> = Default::default();
    let handler = Handler::with_tty_emitter(ColorConfig::Auto, true, false, Some(cm.clone()));

    let fm = cm.new_source_file(
        FileName::Real("input.js".into()).into(),
        // "function C(){ return /*#FOO*/<>{/*#BAR*/}hello world</>/*#BAZ*/; }".into(),
        "function C() {\n    return /*#FOO*/<>{/*#BAR*/}hello world</> /*#BAZ*/ ;\n}".into(),
    );

    let comments = SingleThreadedComments::default();

    let lexer = Lexer::new(
        Syntax::Typescript(TsSyntax {
            tsx: true,
            ..Default::default()
        }),
        EsVersion::latest(),
        StringInput::from(&*fm),
        Some(&comments),
    );

    let mut parser = Parser::new_from(lexer);

    // Parse first: recoverable errors are only collected while parsing.
    let program = parser.parse_program().unwrap();

    for e in parser.take_errors() {
        e.into_diagnostic(&handler).emit();
    }

    let compiler = Compiler::new(cm);

    let result = compiler
        .print(
            &program,
            PrintArgs {
                comments: Some(&comments),
                source_map: SourceMapsConfig::Bool(false),
                ..Default::default()
            },
        )
        .expect("failed to print program");

    println!("{}", result.code);

    assert_eq!(
        result.code,
        "function C() {\n    return <>{}hello world</> /*#BAZ*/ ;\n}\n"
    );

    println!("{:#?}", comments);

    println!("{:#?}", program);
}

Input code

No response

Config

N/A

Playground link (or link to the minimal reproduction)

https://play.swc.rs/?version=1.7.26&code=H4sIAAAAAAAAA0srzUsuyczPU3DW0FSo5uVSAIKi1JLSojwFfS1lN39%2FLX0bu2og08kxSEu%2FNiM1JydfoTy%2FKCfFRt9OASwepaWvYM3LVQsANOpalksAAAA%3D&config=H4sIAAAAAAAAA1WPMQ7CMAxFd05ReWYAJsSGmBg4hBVcFNTEke0gqqp3J4FSyma%2Fl2%2F9DKumgbs6ODRDGcuSUJRk3gvRPho%2BCwHrE6kTnwzWX2talUmmGV3JsaCxaDEtdrpQfcTg3TkkFptybzd%2BnoAJRm1ZwrKBEDpbgIpyNB%2BotsJsHNC8g0mP%2FxdRblTTQLrbbPdTdUhC5aMPOnbdiUOgaPor9A6DchZHF0yzGV%2FQfYRPMQEAAA%3D%3D

SWC Info output

No response

Expected behavior

Comments are printed / preserved.

Actual behavior

Some comments around jsx are omitted.

Version

swc = "0.287.0"

Additional context

versions:

swc = "0.287.0"
swc_common = { version = "0.38.0", features = ["tty-emitter"] }
swc_core = "0.104.2"
swc_ecma_ast = "0.119.0"
swc_ecma_codegen = "0.156.1"
swc_ecma_parser = "0.150.0"
swc_ecma_transforms = "0.241.0"
swc_ecma_visit = "0.105.0"
swc_visit = "0.6.2"

CPunisher commented 1 week ago

preserveAllComments matters.

oosawy commented 1 week ago

@CPunisher I tried preserve_all_comments: true with compiler.process_js_with_custom_pass(), but it doesn't work...

Code:

use swc::config::{Config, JscConfig, Options};
use swc::Compiler;
use swc_common::comments::SingleThreadedComments;
use swc_common::sync::Lrc;
use swc_common::{
    errors::{ColorConfig, Handler},
    SourceMap,
};
use swc_common::{FileName, Globals};
use swc_ecma_ast::EsVersion;
use swc_ecma_parser::TsSyntax;
use swc_ecma_parser::{lexer::Lexer, Parser, StringInput, Syntax};
use swc_ecma_transforms::pass::noop;

fn main() {
    let cm: Lrc<SourceMap> = Default::default();
    let handler = Handler::with_tty_emitter(ColorConfig::Auto, true, false, Some(cm.clone()));

    let fm = cm.new_source_file(
        FileName::Real("input.js".into()).into(),
        // "function C(){ return /*#FOO*/<>{/*#BAR*/}hello world</>/*#BAZ*/; }".into(),
        "function C() {\n    return /*#FOO*/<>{/*#BAR*/}hello world</> /*#BAZ*/ ;\n}".into(),
    );

    let comments = SingleThreadedComments::default();

    let lexer = Lexer::new(
        Syntax::Typescript(TsSyntax {
            tsx: true,
            ..Default::default()
        }),
        EsVersion::latest(),
        StringInput::from(&*fm),
        Some(&comments),
    );

    let mut parser = Parser::new_from(lexer);

    // Parse first: recoverable errors are only collected while parsing.
    let program = parser.parse_program().unwrap();

    for e in parser.take_errors() {
        e.into_diagnostic(&handler).emit();
    }

    let compiler = Compiler::new(cm);

    let result = swc_common::GLOBALS.set(&Globals::new(), || {
        compiler
            .process_js_with_custom_pass(
                fm,
                Some(program.clone()),
                &handler,
                &Options {
                    config: Config {
                        jsc: JscConfig {
                            preserve_all_comments: true.into(),
                            ..Default::default()
                        },
                        ..Default::default()
                    },
                    ..Default::default()
                },
                comments.clone(),
                |_| noop(),
                |_| noop(),
            )
            .expect("failed to print program")
    });

    println!("{}", result.code);

    assert_eq!(
        result.code,
        "function C() {\n    return <>{}hello world</>;\n} /*#BAZ*/ \n"
    );

    println!("{:#?}", comments);

    println!("{:#?}", program);
}

Output:

function C() {
    return <>{}hello world</>;
} /*#BAZ*/ 

CPunisher commented 1 week ago

You also need to configure the JSX syntax:

                    config: Config {
                        jsc: JscConfig {
                            syntax: Some(Syntax::Typescript(TsSyntax {
                                tsx: true,
                                ..Default::default()
                            })),
                            preserve_all_comments: true.into(),
                            ..Default::default()
                        },
                        ..Default::default()
                    },

😂 It's true there are many issues in handling comments with swc.

kdy1 commented 1 week ago

I really agree. I hope someone comes up with an idea that works with Rust without making the AST types too big.

oosawy commented 1 week ago

I noticed that with the tsx option enabled the result is similar to the playground: comments are mostly kept in their original positions. But I want to create a code fixer or codemod that modifies only specific parts of the AST while leaving the rest of the AST and the comments unchanged.

Output:

function C() {
    return /*#FOO*/ /*#__PURE__*/ React.createElement(React.Fragment, null, /*#BAR*/ \"hello world\");
} /*#BAZ*/

I am wondering why comments are not output in their original positions when you print a program that has just been parsed. Is there a good starting point or guide for looking into it?

CPunisher commented 1 week ago

Here is my explanation, though I'm not sure it's 100% right. When parsing, swc records all comments and their positions in a separate data structure rather than in the AST nodes. When generating code, swc re-attaches as many comments as possible by position. However, token positions may change after transformations are applied to the AST. For example, JSX is transformed into function calls to React.createElement. When preserve_all_comments is enabled, swc uses a heuristic algorithm to compensate for the position shifts. https://github.com/swc-project/swc/blob/14cfd70ee00938497ce6b59f68332f9daa17378b/crates/swc/src/dropped_comments_preserver.rs#L8
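
To make the position-keyed idea above concrete, here is a minimal toy sketch using only the Rust standard library. It is not swc's implementation: the Token and Comments types, the print and demo functions, and the use of offset 0 as a "dummy" position for synthesized tokens are all assumptions made for illustration. It only shows how a comment stored under a source offset can fail to be printed once the token at that offset is replaced. It does not claim to reproduce the exact cause of the behavior reported in this issue.

```rust
use std::collections::BTreeMap;

/// A token together with the byte offset where it started in the source.
/// Hypothetical convention: offset 0 marks a synthesized token.
#[derive(Debug, Clone)]
struct Token {
    lo: u32,
    text: String,
}

/// Leading comments kept apart from the tokens, keyed by the offset of
/// the token they precede.
#[derive(Debug, Default)]
struct Comments {
    leading: BTreeMap<u32, Vec<String>>,
}

impl Comments {
    fn add_leading(&mut self, pos: u32, text: &str) {
        self.leading.entry(pos).or_default().push(text.to_string());
    }
}

/// Print the tokens, re-attaching every comment whose key matches the
/// offset of a token that is actually printed.
fn print(tokens: &[Token], comments: &Comments) -> String {
    let mut out = Vec::new();
    for tok in tokens {
        if let Some(list) = comments.leading.get(&tok.lo) {
            for c in list {
                out.push(format!("/*{}*/", c));
            }
        }
        out.push(tok.text.clone());
    }
    out.join(" ")
}

/// Returns the printed output before and after a toy "transform" that
/// replaces the JSX token with a synthesized call at the dummy offset.
fn demo() -> (String, String) {
    let tokens = vec![
        Token { lo: 1, text: "return".into() },
        Token { lo: 8, text: "<jsx/>".into() },
    ];
    let mut comments = Comments::default();
    comments.add_leading(8, "#FOO");

    let before = print(&tokens, &comments);

    // The comment is still stored under offset 8, but no printed token
    // has that offset any more, so it is never looked up.
    let mut transformed = tokens.clone();
    transformed[1] = Token { lo: 0, text: "createElement()".into() };
    let after = print(&transformed, &comments);

    (before, after)
}
```

Calling demo() yields one string with the comment in place and one without it. A pass that wants to keep such comments would have to copy the original offset onto the replacement token or re-key the comment, which is roughly the kind of fix-up the heuristic mentioned above is described as doing.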