koknat / callGraph

A multi-language tool which parses source code for function definitions and calls
GNU General Public License v3.0

Does not recognise all R function definitions #2

jamesscottbrown closed 3 years ago

jamesscottbrown commented 3 years ago

When scanning code in R files, callGraph searches for function definitions using the regex:

 ^(\s*)()(\w+)\s+<-\s+function\s*\(.*\)

Newlines between arguments

The . wildcard doesn't match newlines, so this fails to detect function definitions like:

example_function <- function(a_long_argument_name,
                             another_rather_long_argument_name,
                             a_third_argument)...

An alternative would be to replace . with a character class matching everything except a closing parenthesis ([^\)]):

 ^(\s*)()(\w+)\s+<-\s+function\s*\([^\)]*\) 
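The difference between the two patterns can be checked with a short Python sketch. It assumes the regex is applied to the whole file text in multi-line mode, which may not be how callGraph (a Perl tool) actually runs it; the `{ NULL }` body is hypothetical filler standing in for the elided part of the R example.

```python
import re

# R definition with newlines between arguments, as in the example above.
# The body "{ NULL }" is a hypothetical placeholder.
r_source = (
    "example_function <- function(a_long_argument_name,\n"
    "                             another_rather_long_argument_name,\n"
    "                             a_third_argument) {\n"
    "  NULL\n"
    "}\n"
)

# Original pattern: ".*" cannot cross a newline, so "\)" is never reached.
original = re.compile(r'^(\s*)()(\w+)\s+<-\s+function\s*\(.*\)', re.MULTILINE)

# Suggested pattern: "[^\)]*" matches any character except ")", newlines included.
suggested = re.compile(r'^(\s*)()(\w+)\s+<-\s+function\s*\([^\)]*\)', re.MULTILINE)

print(original.search(r_source))            # no match
print(suggested.search(r_source).group(3))  # name of the detected function
```

One caveat: `[^\)]*` stops at the first closing parenthesis, so a default argument such as `x = c(1, 2)` ends the match early. The definition is still detected, but the matched text does not cover the full argument list.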

Use of = instead of <-

Whilst using <- for assignment is preferred, = may be used. Any functions defined in this way will not be recognised.

This could be fixed by replacing the <- with (<-|=).
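A small Python sketch of this change follows. One detail to watch: written literally, `(<-|=)` adds a fourth capture group. Whether that matters depends on how callGraph reads the groups, which this thread does not show, so the sketch also includes a non-capturing variant `(?:<-|=)` that keeps the original three groups. The function names `square` and `cube` are hypothetical.

```python
import re

# One R definition per assignment style.
arrow_def = "square <- function(x) x^2"
equals_def = "cube = function(x) x^3"

# Original pattern: only "<-" is accepted.
original = re.compile(r'^(\s*)()(\w+)\s+<-\s+function\s*\(.*\)')

# Alternation written literally: "(<-|=)" becomes capture group 4.
capturing = re.compile(r'^(\s*)()(\w+)\s+(<-|=)\s+function\s*\(.*\)')

# Same alternation, non-capturing: the pattern still has three groups.
non_capturing = re.compile(r'^(\s*)()(\w+)\s+(?:<-|=)\s+function\s*\(.*\)')

print(original.match(arrow_def).group(3))        # found with "<-"
print(original.match(equals_def))                # missed with "="
print(capturing.match(equals_def).groups())      # four groups
print(non_capturing.match(equals_def).groups())  # three groups
```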

koknat commented 3 years ago

Thanks for your feedback.

I'm thinking of changing the regex to: ^(\s*)()(\w+)\s+(?:<-|=)\s+function\s*\(

This would allow = for assignment.

Also, it will require the opening parenthesis, but not the closing parenthesis. This should match more definitions, since the regex is applied per line, and some functions have newlines between arguments.
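To illustrate the per-line behaviour, here is a minimal Python sketch. It assumes the proposed pattern reads `^(\s*)()(\w+)\s+(?:<-|=)\s+function\s*\(`, that is, with `\s*` and an escaped opening parenthesis. The helper `find_r_function_definitions` and the sample R source are hypothetical and are not taken from callGraph itself.

```python
import re

# Assumed form of the proposed pattern: opening "(" required, closing ")" not.
R_FUNCTION_DEF = re.compile(r'^(\s*)()(\w+)\s+(?:<-|=)\s+function\s*\(')


def find_r_function_definitions(source):
    """Return (line_number, name) for each line that opens a function definition."""
    found = []
    # The pattern is tested against one line at a time, as described above.
    for line_number, line in enumerate(source.splitlines(), start=1):
        match = R_FUNCTION_DEF.match(line)
        if match:
            found.append((line_number, match.group(3)))
    return found


# Hypothetical R source: one multi-line "<-" definition, one "=" definition.
r_source = (
    "example_function <- function(a_long_argument_name,\n"
    "                             another_rather_long_argument_name,\n"
    "                             a_third_argument) {\n"
    "  a_long_argument_name\n"
    "}\n"
    "\n"
    "short = function(x) x + 1\n"
)

print(find_r_function_definitions(r_source))
```

Because only the first line of a definition has to match, the argument list can continue over any number of following lines without affecting detection.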

Your thoughts?

koknat commented 3 years ago

I have made this change for R.

I've also changed the regexes for several other languages, so that only the opening parenthesis '(' is required.