terryyin / lizard

A simple code complexity analyser without caring about the C/C++ header files or Java imports, supports most of the popular languages.

Henry's & Kafura's Fan-In Fan-Out Complexity metric #102

Open mehrdad89 opened 8 years ago

mehrdad89 commented 8 years ago

I was wondering whether it would be interesting to add this metric as another feature for Lizard. In short, fan-in is the number of links coming into a node, whereas fan-out is the number of links going out of it. I'd be happy to contribute if you are also interested, @terryyin.
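
To make the definition concrete, here is a minimal sketch that derives fan-in and fan-out from a call graph. It is a simplified, call-only reading of the metric (Henry and Kafura's original definition also counts information flow through data structures), and the `calls` mapping and function names are made up for illustration. The helper `henry_kafura` applies their information-flow complexity formula, length * (fan_in * fan_out) ** 2.

```python
from collections import defaultdict


def fan_in_fan_out(calls):
    """Return {name: (fan_in, fan_out)} for a caller -> callees mapping."""
    fan_in = defaultdict(int)
    fan_out = {}
    for caller, callees in calls.items():
        # Count each distinct callee once and ignore direct recursion.
        unique = set(callees) - {caller}
        fan_out[caller] = len(unique)
        for callee in unique:
            fan_in[callee] += 1
    names = set(calls) | set(fan_in)
    return {n: (fan_in.get(n, 0), fan_out.get(n, 0)) for n in sorted(names)}


def henry_kafura(length, fan_in, fan_out):
    """Information-flow complexity: length * (fan_in * fan_out) ** 2."""
    return length * (fan_in * fan_out) ** 2


# Hypothetical call graph, for illustration only.
calls = {
    "main": ["parse", "render"],
    "parse": ["log"],
    "render": ["log", "log"],
    "log": [],
}
metrics = fan_in_fan_out(calls)
```

Here `metrics["log"]` is `(2, 0)`: two distinct callers and no outgoing calls.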

Thanks in advance.

terryyin commented 8 years ago

Yes, I have an ambition of implementing the Internal Relationship Diagram as recommended in @michaelfeathers's book (Working Effectively with Legacy Code).

Having fan-in and fan-out is a good start in that direction.

mehrdad89 commented 8 years ago

Great! I will look into this metric in more detail. However, I was wondering if you are fine with including an option for the relationship between classes and functions at project scope. Unfortunately, this may lead to caring about headers and imports in the source code. In the meantime, I will also continue working on adding support for the nesting depth metric.

terryyin commented 8 years ago

Right, scoping is the biggest problem. So having fan-in and fan-out is a more pragmatic first step.

I've moved a bit of the Nesting Depth code to ext already. I will continue to move more and make the ext interface friendlier to both users and us developers.

mehrdad89 commented 8 years ago

Great! Thanks for that. I will continue working on both metrics and how to implement them as of today then.

tobias-klein commented 8 years ago

I'd be interested in the fan-in fan-out metrics as well! :)

mehrdad89 commented 8 years ago

I am working on it. Hopefully, it will be finished next week.

mehrdad89 commented 8 years ago

@terryyin I was wondering what your opinion is on how print_fan_in_fan_out should be handled. I just finished writing a working prototype of the fan-in fan-out function in lizard.py. You can check it out in lizard.py in my forked repository. I am open to suggestions if you are interested in any aspect of the code or the way I calculate fan_in_fan_out.

mehrdad89 commented 8 years ago

@terryyin I just used an example from your repository, i.e. mahjong\src\html_ui\html_game.c. This is a summary of the output:

C:\Python27\python.exe C:/Users/emehmeh/Documents/GitHub/lizard/lizard.py C:\mahjong-master\src\html_ui\html_game.c . . .

1 file analyzed.

  NLOC  Avg.NLOC  AvgCCN  Avg.token  function_cnt  file
   202      14.6     3.8       90.1            12  C:\mahjong-master\src\html_ui\html_game.c

fan-in fan-out file

  1      1        start_new_player@26-36@C:\mahjong-master\src\html_ui\html_game.c
  3      0        do_user_do_not_exist_error@38-41@C:\mahjong-master\src\html_ui\html_game.c
  1      2        script_to_update_all_holdings@43-54@C:\mahjong-master\src\html_ui\html_game.c
  1      3        generate_ui_event_script@56-106@C:\mahjong-master\src\html_ui\html_game.c
  7      2        html_game_do_action@108-117@C:\mahjong-master\src\html_ui\html_game.c
  1      1        script_to_bye@119-123@C:\mahjong-master\src\html_ui\html_game.c
  1      0        show_byebye@126-131@C:\mahjong-master\src\html_ui\html_game.c
  0     11        execute_game_command@133-186@C:\mahjong-master\src\html_ui\html_game.c
  3      0        ui_adaptor_pool_get_item_by_id@200-207@C:\mahjong-master\src\html_ui\html_game.c
  1      1        ui_adaptor_pool_add_ui_adaptor@209-218@C:\mahjong-master\src\html_ui\html_game.c
  3      1        ui_adaptor_pool_get_ui_adaptor_by_id@220-227@C:\mahjong-master\src\html_ui\html_game.c
  1      1        ui_adaptor_pool_remove@228-236@C:\mahjong-master\src\html_ui\html_game.c

. . . .

Process finished with exit code 0

terryyin commented 8 years ago

@mehrdad89 "mahjong" was the example code for refactoring many years ago. Now I realise it actually hurt my reputation:-)

Take a look at the current lizard_ext/lizardnd.py:

class LizardExtension(object):  # pylint: disable=R0903
    FUNCTION_CAPTION = "  ND  "
    FUNCTION_INFO_PART = "max_nesting_depth"
    AVERAGE_CAPTION = " Avg.ND "

    def __call__(self, tokens, reader, l_depth=0):  # pylint: disable=R0912
        pass

That defines a lizard extension. FUNCTION_INFO_PART defines the name of the member variable in FunctionInfo that this extension is going to add. FUNCTION_CAPTION is the caption that appears in the output. It also defines the column width. If there is also an AVERAGE_CAPTION, lizard will put the average in the file summary and the final summary.

__call__ will be called by lizard with all the tokens. You can do all your work there, including adding the variable defined in FUNCTION_INFO_PART.

When lizard is called with

lizard -E nd

The above extension will be included in the execution.
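
As a rough illustration of that interface, the sketch below defines a made-up extension that counts `return` tokens per function. It rests on several assumptions that are not confirmed above: that `__call__` passes the tokens through as a generator, that the current function is reached via `reader.context.current_function` (the accessor mentioned later in this thread), and that the `reader` and `func` objects can be stood in for by simple stubs. The metric and the attribute name `return_count` are hypothetical.

```python
from types import SimpleNamespace


class LizardExtension(object):
    """Hypothetical extension: counts 'return' tokens in each function."""
    FUNCTION_CAPTION = " returns "       # column caption; its width sets the column width
    FUNCTION_INFO_PART = "return_count"  # attribute this extension adds to the function
    AVERAGE_CAPTION = " Avg.returns "    # optional: ask for an average in the summaries

    def __call__(self, tokens, reader):
        # Assumption: every token is yielded back unchanged.
        for token in tokens:
            func = reader.context.current_function
            if token == "return":
                count = getattr(func, self.FUNCTION_INFO_PART, 0)
                setattr(func, self.FUNCTION_INFO_PART, count + 1)
            yield token


# Stubs standing in for lizard's reader and function objects (illustration only).
func = SimpleNamespace(name="demo")
reader = SimpleNamespace(context=SimpleNamespace(current_function=func))
tokens = ["if", "(", "x", ")", "return", "1", ";", "return", "0", ";"]
passed_through = list(LizardExtension()(tokens, reader))
```

After the run, `func.return_count` holds the per-function value that a column captioned by FUNCTION_CAPTION would display.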

The current problem for fan_in and fan_out is that there are two variables. I will update the extension framework to make it accept a list.

I hope you can do this incrementally by sending small yet functioning pull requests, so that I can give feedback early as well. What do you think?

mehrdad89 commented 8 years ago

Thanks for the detailed explanation. I'll be fine with your approach. I will try to follow your code, and find a way to add them. I will also try to write some tests for both of them.

mehrdad89 commented 8 years ago

I totally forgot to explain my approach to extracting structural fan-in and fan-out. The function needs to be aware of fun.location, i.e. the location of each method/function that has been processed in the repository. Can I access that information in the lizard_ext part of lizard, or is this going to be an issue in the long run?

terryyin commented 8 years ago

Yes, you can get the current function by reader.context.current_function.

mehrdad89 commented 8 years ago

Yes, but I still need to be aware of the other functions, including those I haven't looked at yet, to find fan-in and fan-out. Therefore I need all of them before I can start the operation.

terryyin commented 8 years ago

In that case, if you have a method named "reduce" in your extension class, it will be called at the end with the whole FileInformation structure. You can find all the functions within that file there.

def reduce(self, file_info):
    # do the fan_in fan_out counting here
    pass

You probably want to keep every file_info passed to your reduce function so that the fan_in fan_out counting is aware of all the functions.
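
One way this could look is sketched below. It is only an illustration under stated assumptions: `function_list` is assumed to be the list of functions on each file_info, `body_tokens` is a hypothetical attribute that the extension would have to record itself, and any token matching a known function name is treated as a call. The class name `FanInFanOutSketch` is made up.

```python
from types import SimpleNamespace


class FanInFanOutSketch(object):
    """Keeps every file_info and recounts fan-in/fan-out across all of them."""

    def __init__(self):
        self.all_files = []

    def reduce(self, file_info):
        self.all_files.append(file_info)
        functions = [f for info in self.all_files for f in info.function_list]
        by_name = {f.name: f for f in functions}  # duplicate names: last one wins
        for f in functions:
            f.fan_in = 0
        for f in functions:
            # Any token naming another known function counts as one call.
            callees = (set(f.body_tokens) & set(by_name)) - {f.name}
            f.fan_out = len(callees)
            for name in callees:
                by_name[name].fan_in += 1


# Stub file_info objects (illustration only): a() in one file calls b() in another.
a = SimpleNamespace(name="a", body_tokens=["b", "(", ")", ";"])
b = SimpleNamespace(name="b", body_tokens=["return", ";"])
ext = FanInFanOutSketch()
ext.reduce(SimpleNamespace(function_list=[a]))
fan_out_before = a.fan_out  # b is not known yet
ext.reduce(SimpleNamespace(function_list=[b]))
```

Recounting from scratch on every call is wasteful but keeps the sketch short. It also shows why the counts are only final once the last file has been reduced.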

mehrdad89 commented 8 years ago

Thanks for that. I'll look into it tomorrow. In the meantime, I will wait for your fix for the printing issue. Keep me posted.

KenLau commented 8 years ago

Hi, I found some problems with fan-in and fan-out in Objective-C and Swift:

  NLOC  CCN  token  PARAM  length  fan_in  fan_out  location
     9    2     37      0      10       0        0  sharedInstance@21-30
    10    3     78      0      11       0        0  trackStartAppNetworkStatusWithOnline:@28-38

I do call these functions from other classes, but fan-in and fan-out do not show the correct counts. Has anyone else encountered this problem? Thanks.

mehrdad89 commented 8 years ago

Thank you for your input. At this point, fan-in and fan-out are implemented at a local scope, so functions have to be explicitly called in order to be counted. Of course, more updates are coming to fix some of these issues.

terryyin commented 8 years ago

Let’s focus on getting it to work with C first. Swift and ObjC are very similar to C.

mehrdad89 commented 8 years ago

Of course, that is the primary focus.

KenLau commented 8 years ago

Alright, thanks for your replies, :D

tobias-klein commented 8 years ago

@mehrdad89 Is the fan-in / fan-out going to be calculated both on a function and on a module/file level?

terryyin commented 8 years ago

@mehrdad89 I'm at Shanghai for a few days and will try this feature again with my friend @erizhang . Do you want a short video chat tomorrow morning your time?

mehrdad89 commented 8 years ago

Sure, let's do that! email me the time when you can make it!

mehrdad89 commented 8 years ago

@tobias-klein We are trying to add this feature, but at the moment it works only at the function level.

terryyin commented 8 years ago

Sorry, we are both at a conference and might be a bit late (6pm).

terryyin commented 8 years ago

Guys, let me present the new fan-in and fan-out implementation:

lizard -Eio

It calculates fan-in/fan-out only for known functions within all the code being analysed. At the moment I ignore namespaces. A function gets all the fan-in until another function with the same name appears in the process.

It would be great if you could try this out and give me some feedback:-)

tobias-klein commented 8 years ago

Thanks for doing this, @terryyin Will try it out next week. Have you thought about adding a file/module dimension to the measurement? fan-in/fan-out on the file/module level would be even more interesting I think.

rakhimov commented 8 years ago

Hello folks, I don't have a deep understanding of this metric, but I get that it relates more to the software as a whole (coupling/cohesion). After running a couple of checks for this metric on C++ code, I came to the following conclusion.

In order to be aware of which functions are being called (for the fan-in metric, I guess), Lizard must be aware of such pretty nasty language complexities as qualified/unqualified/argument-dependent name lookup, function call overload resolution, types and the type system, templates and template type deduction. I am likely forgetting some more.

In short, I am afraid Lizard needs a fully featured C++ compiler front-end to get this metric implemented properly for C++. I bet Lizard hasn't been envisioned to implement one. This is a hard problem.

The current popular solution is to leverage clang, which is what oclint does for its metrics. However, I am not aware of any tool that took this approach for Fan-In/Out metrics.

My guess is that fan-in/out metrics are of a very different nature than the metrics currently implemented in Lizard for C++, so the work may be substantially different and language-specific.

This is my two cents as someone uninitiated in the development of this metric, so what are your thoughts/plans on C++?

terryyin commented 8 years ago

@rakhimov I think you have a very good point about the difficulty of getting fan-in/out. I'm looking for a cost-efficient way of approximating the actual fan-in/out that is still useful.

@mehrdad89 and I also came up with an idea of "general fan-out", which counts anything in a function that "looks like a fan-out."
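
A rough sketch of what a "general fan-out" could mean, purely as an illustration and not the actual implementation: count the distinct identifiers that are directly followed by an opening parenthesis and are not keywords. The `KEYWORDS` set is an assumed, deliberately incomplete C-like subset, and the regex tokenizer is a stand-in for lizard's own token stream.

```python
import re

# Assumed C-like keywords that can precede "(" without being a call.
KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}


def general_fan_out(body):
    """Count distinct names that look like calls: identifier followed by '('."""
    tokens = re.findall(r"[A-Za-z_]\w*|\S", body)
    names = set()
    for tok, nxt in zip(tokens, tokens[1:]):
        if nxt == "(" and re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            names.add(tok)
    return len(names)


body = "if (ready(x)) { send(x); send(y); } return sizeof(buf);"
count = general_fan_out(body)
```

For this body the sketch counts `ready` and `send` once each. It needs no knowledge of where those functions are defined, which is the point of the approximation, but it will also count function-like macros and casts written as calls.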

I'm not sure whether the current way of doing the metrics is useful at all, but there are many aspects in which it can be improved. Which simple change would add the most to the usefulness of our "approximate fan-in/out"?

terryyin commented 8 years ago

@tobias-klein yes, I think file level fan-in/out is useful. Let me stabilise/improve the current metrics first.

mehrdad89 commented 8 years ago

@terryyin I am open to suggestions, Terry, so that we stay on the same track on this matter. Of course, you can email me or Skype me.

mehrdad89 commented 8 years ago

@terryyin I am still interested in how you think we should approach improving the fan-out by counting the unfamiliar commands (meaning functions, structs, etc.). Unfortunately, I think the problem is that these constructs are language-specific. However, it is a step in the right direction. What do you think?

terryyin commented 8 years ago

Let’s step backward a bit. What’s the purpose of the fan-out?

mehrdad89 commented 8 years ago

From what I understood, it means counting the number of files or procedures that are called by a given file or procedure, at the file or procedure level.

rakhimov commented 7 years ago

I discovered another metric called 'Cumulative Component Dependency' (CCD, with average and normalized variants), introduced by John Lakos in his book "Large Scale C++ Software Design". Honestly, as a developer or designer, I do not see much value in 'fan-in/fan-out' and can't internalize it. CCD, on the other hand, is a higher-level metric and somewhat orthogonal to 'fan-in/fan-out'. It deals with software components instead of low-level functions/classes/etc. CCD has solid theoretical backing, and its analysis was born from actual software development practice (i.e., it is not just academic but practical). Even though the analysis is done on C++ components/packages/package groups, I believe the metric can be made language-agnostic as long as the language has 'module/package' abstractions, for example, Python modules. Theoretically, anything that can be described by a graph can be analyzed with CCD.
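
For readers unfamiliar with CCD, here is a minimal sketch of the idea as I understand it: for each component, count the components needed to test it (itself plus everything it depends on, directly or transitively), then sum those counts. The average (ACD) divides by the number of components. The `deps` graph and component names are made up, and this is not the code from the book or from my fork.

```python
def ccd(deps):
    """Sum, over all components, of the size of each one's dependency closure."""
    total = 0
    for component in deps:
        seen = set()
        stack = [component]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(deps.get(node, ()))
        total += len(seen)  # the component itself plus everything it reaches
    return total


# Hypothetical component graph: component -> direct dependencies.
deps = {
    "app": ["net", "ui"],
    "net": ["core"],
    "ui": ["core"],
    "core": [],
}
total = ccd(deps)
acd = total / len(deps)
```

Here `app` needs all four components, `net` and `ui` need two each, and `core` needs only itself, so the CCD is 9 and the ACD is 2.25.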

John Lakos provided C tools to calculate the metric with his book (20 years ago). Zhichang Yu rewrote the C code in Python 6 years ago. The project looked stale to me, so I forked it into my repo and have been refactoring it ever since. I have been contemplating integrating the script as a Lizard extension, but it is hard to provide the CCD analysis configuration via the command line. If we could create a .lizard.yml config file (like other linter/analysis tools: pylint, pep8, etc.), CCD integration could be smoother.

I could use some help if anyone is interested in hacking on this metric.