Fraunhofer-AISEC / cpg

A library to extract Code Property Graphs from C/C++, Java, Go, Python, Ruby and every other language through LLVM-IR.
https://fraunhofer-aisec.github.io/cpg/
Apache License 2.0

Duplicate problem on quantum-cpg branch #1541

Open carlosgpal opened 3 weeks ago

carlosgpal commented 3 weeks ago

I don't understand why the code duplicates many nodes that should be the same. They have the same fullname and identical properties except for the id, and they relate sometimes to different nodes and sometimes to the same nodes.

Here is a screenshot of how it behaves with, for example, a cx circuit (I have omitted some nodes so that the issue is easier to see): [image]

This is with a file with a single cx circuit. This is the file:

include "qelib1.inc";
qreg q[2];
h q[0];
cx q[0],q[1];

The same happens with other quantum gates, such as the h gate in that example, and sometimes with quantum nodes: [image]

Non-quantum-specific nodes do not behave this way.

The same behaviour occurs when analysing files that use Qiskit:

from qiskit import QuantumCircuit
circuito = QuantumCircuit(2)
circuito.h(0)
circuito.cx(0, 1)

KuechA commented 3 weeks ago

Hi @carlosgpal,

I agree that the first graph seems to have too many occurrences of cx (in particular the pink nodes I'd say) and we'll look into this.

However, I think the nodes shown in the second graph are as I would expect them to be. The main reason there are "duplicate" nodes here is that we distinguish between the different references (e.g., when the same variable is used in 5 expressions, you will encounter 5 references) and the variable declaration (e.g., in C, you'd write int x;). In the Q-CPG, we use a similar approach for qubits: we differentiate between the qubit as a physical memory location (the pink node) and its usages in the different operations (e.g., lines of code).
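
To illustrate the declaration/reference distinction described above, here is a minimal Python sketch. It is a toy model and not the cpg library's API: the `Node` class, the `kind` values, the id counter, and `build_graph` are hypothetical names chosen only for this example.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical id generator; in a real graph the ids come from the graph store.
_ids = count(1)


@dataclass
class Node:
    kind: str  # "Declaration" or "Reference" (illustrative labels only)
    name: str
    id: int = field(default_factory=lambda: next(_ids))


def build_graph(uses):
    """Create one declaration per distinct name and one reference per use."""
    declarations = {}
    references = []
    for name in uses:
        if name not in declarations:
            declarations[name] = Node("Declaration", name)
        # Every use gets its own reference node, even for a repeated name.
        references.append(Node("Reference", name))
    return declarations, references


# Qubit uses in the example circuit: h q[0]; cx q[0],q[1];
decls, refs = build_graph(["q[0]", "q[0]", "q[1]"])
print(len(decls), "declarations,", len(refs), "references")
```

In this model the two uses of `q[0]` produce two reference nodes with the same name but different ids, while the declaration exists only once. Nodes that look like duplicates can therefore be expected whenever a qubit is used in more than one operation.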

carlosgpal commented 3 weeks ago

I understand.

Some extra information about the gates (it doesn't matter whether they are cx or h): among the duplicate Statement nodes (the blue ones), only one carries the QuantumGate label. The rest have the same labels except for that one. For Declaration nodes this does not happen, even though there are the same number of duplicates.
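
One way to pin down this observation is to group exported nodes by full name and compare their label sets. The sketch below rests on assumptions: the dictionary layout, the `fullname` and `labels` keys, and the sample records are made up for illustration and are not data taken from the actual graph.

```python
from collections import defaultdict


def label_differences(nodes):
    """For each fullname shared by 2+ nodes, list the labels each node has
    beyond the labels common to the whole group."""
    groups = defaultdict(list)
    for node in nodes:
        groups[node["fullname"]].append(node)

    report = {}
    for fullname, group in groups.items():
        if len(group) < 2:
            continue  # not a duplicate candidate
        common = set.intersection(*(set(n["labels"]) for n in group))
        report[fullname] = {
            n["id"]: sorted(set(n["labels"]) - common) for n in group
        }
    return report


# Hypothetical sample: three "cx" statements, only one labelled QuantumGate.
sample = [
    {"id": 1, "fullname": "cx", "labels": ["Node", "Statement", "QuantumGate"]},
    {"id": 2, "fullname": "cx", "labels": ["Node", "Statement"]},
    {"id": 3, "fullname": "cx", "labels": ["Node", "Statement"]},
    {"id": 4, "fullname": "q", "labels": ["Node", "Declaration"]},
]
report = label_differences(sample)
print(report)
```

A non-empty list for exactly one id in a group would match the behaviour described above, where a single duplicate carries the extra QuantumGate label.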