Mosallamy / refdiff-python

Python refactoring extraction tool for GitHub commit history
6 stars 4 forks source link

Extract and Inline Function Refactorings #14

Open rodrigo-brito opened 3 years ago

rodrigo-brito commented 3 years ago

I found some issues with function calls The parser identifies the function calls very well, but this relationship is not defined in nodes.

To extract and inline function works, we need to create relationships with the nodes:

root.getRelationships().add(new CstNodeRelationship(CstNodeRelationshipType.USE, caller.getId(),
                        nodeByAddress.get(functionCall).getId()));

In RefDiff we have two relationships to setup up (USE for call functions and SUBTYPE for inheritance)

We have the information to construct the USE relationship. But, it is difficult to identify nodes. The snippet above uses a map that contains all nodes found in the parser step, in which the KEY of the map should be a unique identifier for each node. For the following example https://github.com/rodrigo-brito/refactoring-python-example/blob/master/main.py, it should look like:

{
   /main.py: <CST Node of main.py>,
   /main.py/isEven: <CST Node of isEven function>,
   /main.py/printIsEven: <CST Node of printIsEven function>,
   /calculator.py: <CST Node of calculator.py>,
   /calculator.py/Calculator: <CST Node of Calculator Class>,
   /calculator.py/Calculator/summary: <CST Node of summary Function>,
   /calculator.py/Calculator/multiply: <CST Node of multiply Function>,
}

The KEY format is free, but it should be equal between commits. Then, a good way to construct this key could be joining Parent KEY + Node name.

And functionCalls map, should contain the relationship between these unique IDs. For example, in the example repository. The CST Node /main.py should contain function calls for summary, multiply, and printIsEven. Also, printIsEven should contains a function call for isEven function.

For this example, functionCalls must be something like this:

{
   /main.py: [
      /calculator.py/Calculator/summary,
      /calculator.py/Calculator/multiply,
      /main.py/printIsEven,
   ],
   /main.py/printIsEven: [
      /main.py/isEven
   ]
}

If it is done, I think all other refactorings will work.

rodrigo-brito commented 3 years ago

I will try to work on it during the week, but if you found some solution for this, please comment here.

Mosallamy commented 3 years ago

I ran the code in your repo https://github.com/rodrigo-brito/refactoring-python-example/blob/master/main.py and I got the below results. The relationships in root are identified correctly and of type USE. Should it not work now and identify the Extract_Method correctly?

image

rodrigo-brito commented 3 years ago

No, in this example, we have four relationships: /main.py USE /calculator.py/Calculator/summary /main.py USE /calculator.py/Calculator/multiply, /main.py USE /main.py/printIsEven, /main.py/printIsEven USE /main.py/isEven

Also, The Address is not unique, we have: image

rodrigo-brito commented 3 years ago

@Mosallamy, In my computer I got a single relationship. Does your printscreen represent the last commit? image

Mosallamy commented 3 years ago

@rodrigo-brito yes it is up-to date

QassemNa commented 3 years ago

Hi @rodrigo-brito, Currently I think there are multiple things that I have identified and should be fixed

rodrigo-brito commented 3 years ago

Hi @QassemNa

should Class also have one too?

I think that in Python we are unable to make function calls in a Class outside a function body, right? So I don't think you need to do it.

Just to make sure when you said uniquely identified between different commits

A string with location/name is ok (actual format). A numeric ID that can be problematic. It is difficult to keep it the same for each CST Node between different commits. This is used only in that part of building function calls

Do you have the code for the function getFilesFromPath?

I think it is an internal function of RefDiff, I will take a look on this.

QassemNa commented 3 years ago

Ok thanks, I will start working on that.

QassemNa commented 3 years ago

Hello @rodrigo-brito , I have been able to extract the refactoring inlineand extractbut now I have been facing a problem with rename, it still shows rename but it is not reliable, like if I run it in your repo for python example it would show me all refactoring except rename. But at times it would show them, from what I have seen it works great with class rename If you want to take a look and try it out yourself you can see the code in my account.

rodrigo-brito commented 3 years ago

That is great! The rename refactoring just checks the simpleName field. I will try to check with your example today and report here. Maybe, I will take look on it at night.