Nykakin / chompjs

Parsing JavaScript objects into Python data structures
MIT License

ValueError parsing comments with open brackets #64

Open clayadavis opened 1 week ago

clayadavis commented 1 week ago

Using version 1.3.0 installed from pip.

Both of the following fail to parse:

chompjs.parse_js_object('// [ \n{"a": 1}')

chompjs.parse_js_object('/* [ */ {"a": 1}')

The error is as follows:

ValueError: Error parsing input near character 14
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File <command-2221575987808166>, line 3
      1 import chompjs
----> 3 chompjs.parse_js_object('// [\n{"a": 1}')

File <snip>/python3.10/site-packages/chompjs/chompjs.py:113, in parse_js_object(string, unicode_escape, loader, loader_args, loader_kwargs, json_params)
    108 loader_args, loader_kwargs = _process_loader_arguments(
    109     loader_args, loader_kwargs, json_params
    110 )
    112 string = _preprocess(string, unicode_escape)
--> 113 parsed_data = parse(string)
    114 return loader(parsed_data, *loader_args, **loader_kwargs)

ValueError: Error parsing input near character 14

This comes up in the wild when JS is stored as CDATA like this:

/* <![CDATA[ */
...
/* ]]> */
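
On a version where these inputs fail, one possible workaround is to strip comments before handing the text to the parser. The snippet below is only a rough sketch: `strip_js_comments` is a hypothetical helper, not part of chompjs. It assumes the input contains only `//` and `/* */` comments plus ordinary quoted strings, so template literals and regex literals are not handled.

```python
import json
import re

# String literals are matched first so that comment markers inside them
# (for example "http://...") are left alone. Only the two comment
# alternatives are captured in groups.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'      # double-quoted string (kept)
    r"|'(?:\\.|[^'\\])*'"     # single-quoted string (kept)
    r"|(/\*.*?\*/)"           # block comment (dropped)
    r"|(//[^\n]*)",           # line comment (dropped)
    re.DOTALL,
)


def strip_js_comments(source: str) -> str:
    """Hypothetical helper: replace each JS comment with a single space."""

    def _replace(match: re.Match) -> str:
        is_comment = match.group(1) is not None or match.group(2) is not None
        return " " if is_comment else match.group(0)

    return _TOKEN.sub(_replace, source)


# The inputs from this issue, cleaned and then parsed as plain JSON.
line_case = json.loads(strip_js_comments('// [ \n{"a": 1}'))
block_case = json.loads(strip_js_comments('/* [ */ {"a": 1}'))
```

The cleaned string could equally be passed to `chompjs.parse_js_object` when the payload is a JavaScript object rather than strict JSON.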

Nykakin commented 1 week ago

I implemented a solution in https://github.com/Nykakin/chompjs/pull/65. Could you test this branch?

Also note that this library is not meant to fully parse the JavaScript language: for quick convenience, it simply starts parsing at the first opening JSON character (either [ or {). If your script contains a function, its { will be interpreted as the opening of a JSON object and will break the parsing. Sometimes there is no other way but to clean up the input manually or use regular expressions to isolate the relevant fragment of the script. And if you do that, in many cases using json.loads directly can suffice.
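
To illustrate the last point, here is a minimal sketch of isolating a fragment with a regular expression and decoding it with the standard `json` module. The script text, the variable name `settings`, and the helper `extract_assigned_json` are all made up for the example, and it assumes the assigned value is strict JSON (double-quoted keys, no trailing commas).

```python
import json
import re

# Hypothetical script body: a function appears before the data we want,
# so its "{" would be picked up first by a naive "first bracket" search.
SCRIPT = '''
function init() { console.log("ready"); }
var settings = {"theme": "dark", "items": [1, 2, 3]};
'''


def extract_assigned_json(script: str, variable: str):
    """Hypothetical helper: decode the JSON value assigned to `variable`."""
    # Locate "<variable> = " and remember where the value starts.
    match = re.search(r"\b" + re.escape(variable) + r"\s*=\s*", script)
    if match is None:
        raise ValueError(f"no assignment to {variable!r} found")
    # raw_decode reads exactly one JSON value starting at the given index
    # and ignores whatever follows it (here, the trailing ";").
    value, _end = json.JSONDecoder().raw_decode(script, match.end())
    return value


settings = extract_assigned_json(SCRIPT, "settings")
```

Using `raw_decode` avoids having to write a regular expression that matches the closing bracket, which is hard to get right for nested objects.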