terryyin / lizard

A simple code complexity analyser without caring about the C/C++ header files or Java imports, supports most of the popular languages.

Python language does not support multi-line quotes #335

Open dmctavish opened 2 years ago

dmctavish commented 2 years ago

hoping for some thoughts on this from @terryyin

Python inherits its comment detection from ScriptLanguageMixIn, which only supports the hash character. The below still doesn't support multi-line comments. I'm still trying to figure out how to associate multiple tokens with a single comment; perhaps that should be done in the reader.

It would be nice if the comment fields were promoted to first-level members on the language-specific readers, so that extensions could access them easily. For instance, the PythonReader could have these member variables:

    comment_delimiters = ["#"]
    multi_line_comment_delimiters = [(r"'''", r"'''")]

whereas Java would have something like:

    comment_delimiters = [r"//"]
    multi_line_comment_delimiters = [(r"/*", r"*/")]

Then the reader could check whether a comment has been started, and only start yielding tokens again once the multi-line comment is closed. I'm guessing the reader would look something like this:

def __call__(self, tokens, reader):
    # Map each multi-line opening delimiter to its closing delimiter.
    closers = dict(reader.multi_line_comment_delimiters)
    closing = None  # closing delimiter we are waiting for, if any

    for token in tokens:
        function = reader.context.current_function
        if not hasattr(function, "comment_count"):
            function.comment_count = 0

        if closing is not None:
            # Inside a multi-line comment: count the token, don't yield it.
            function.comment_count += 1
            if token == closing:
                closing = None
        elif token in closers:
            function.comment_count += 1
            closing = closers[token]
        elif token in reader.comment_delimiters:
            function.comment_count += 1
        else:
            yield token
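To make the idea concrete, here is a minimal, self-contained sketch of that filtering logic run against stand-in readers. The classes FakePythonReader and FakeJavaReader and the function strip_comment_tokens are hypothetical names made up for illustration; they are not part of lizard. The sketch also assumes that comment delimiters arrive as separate tokens, which may not match how lizard's tokenizer actually splits its input.

```python
from types import SimpleNamespace


class FakeReader:
    """Hypothetical stand-in for a reader exposing the proposed members."""
    comment_delimiters = []
    multi_line_comment_delimiters = []

    def __init__(self):
        # Mimics the reader.context.current_function chain used above.
        self.context = SimpleNamespace(current_function=SimpleNamespace())


class FakePythonReader(FakeReader):
    comment_delimiters = ["#"]
    multi_line_comment_delimiters = [("'''", "'''")]


class FakeJavaReader(FakeReader):
    comment_delimiters = ["//"]
    multi_line_comment_delimiters = [("/*", "*/")]


def strip_comment_tokens(tokens, reader):
    """Yield non-comment tokens; count the comment tokens that were dropped."""
    closers = dict(reader.multi_line_comment_delimiters)
    closing = None  # closing delimiter of the currently open comment, if any
    for token in tokens:
        function = reader.context.current_function
        is_comment = True
        if closing is not None:
            if token == closing:
                closing = None
        elif token in closers:
            closing = closers[token]
        elif token not in reader.comment_delimiters:
            is_comment = False
        if is_comment:
            function.comment_count = getattr(function, "comment_count", 0) + 1
        else:
            yield token


py_reader = FakePythonReader()
py_tokens = ["x", "=", "1", "'''", "some", "notes", "'''", "y", "=", "2"]
py_kept = list(strip_comment_tokens(py_tokens, py_reader))

java_reader = FakeJavaReader()
java_kept = list(strip_comment_tokens(["a", "/*", "note", "*/", "b"], java_reader))
```

Because the delimiters live on the reader, the same function handles both languages without any language-specific branches.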

In the short term, I think the Python file should override the get_comment_from_token function as follows:

class PythonReader(CodeReader, ScriptLanguageMixIn):
    @staticmethod
    def get_comment_from_token(token):
        if token.startswith("#"):
            return token[1:]
        elif token.startswith(r"'''"):
            return token[3:]
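One thing to watch with the override above: token[3:] keeps the closing quotes, and strings written with double quotes are not matched. The standalone sketch below is one possible variant, written as a plain function so it can be tried outside lizard. It assumes that a whole triple-quoted string arrives as a single token, and the TRIPLE_QUOTES constant is a name introduced here for illustration. It would also treat ordinary multi-line string literals as comments, which may or may not be what is wanted.

```python
TRIPLE_QUOTES = ("'''", '"""')  # hypothetical constant, not part of lizard


def get_comment_from_token(token):
    """Return the comment text carried by a token, or None if it is not a comment."""
    if token.startswith("#"):
        return token[1:]
    for quote in TRIPLE_QUOTES:
        if token.startswith(quote):
            body = token[len(quote):]
            # Drop the closing delimiter when the token contains it.
            if body.endswith(quote):
                body = body[:-len(quote)]
            return body
    return None


hash_comment = get_comment_from_token("# note")
single_quoted = get_comment_from_token("'''doc'''")
double_quoted = get_comment_from_token('"""doc"""')
not_a_comment = get_comment_from_token("x = 1")
```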