Open HiragiChi opened 8 months ago
non-function objects can be holding the value-flows of pointers which point to a function object. It is not easy to separately handle them.
Hi, Thanks for the reply. I think there are some non-function objects (like void pointers) that can actually point to functions or function pointers, but there are some objects and pointers (like int or char*) that cannot point to functions ( we can use bitcast analysis to confirm that). We can conservatively drop objects with these types in Anderson analysis.
With these restrictions, do you think that sounds a valid idea?
How about a pointer points to a struct containing a function pointer. Similar for pointers indirectly affected
Yeah, and a union type is also something that might be related to function pointers. I plan to mark these four types: void*, function pointer types, structs that contain function pointers, and unions that contain function pointers, as types that may be related to call graph construction. By pruning the other types of nodes I think we will reach higher efficiency.
Do you think this sounds a little bit more reasonable now? Any suggestions are appreciated!
In theory, any pointer types matter, as they could indirectly point to an object (not only struct/array/union) which may store pointers that indirectly point to a function object.
Thanks for the timely reply! That is true. What I am thinking is can this process be tracked by bitcast analysis? (say fptr1 is assigned to int, so there must be a bitcast in the program that casts int to fptr, or there exists a casting chain that casts int to fptr.) In this case, we can incorporate all the int objects into Anderson analysis. I think this claim sounds solid currently. Do you think that would still miss something? Thanks!
You can have a try. It is hard to answer this question as this is a challenge research. Note that llvm-16 onwards does not have typed pointers which means castings will be gone for many IR instructions.
Thanks for the guidance! I will try my idea with LLVM-14 and I will let you know if I have any progress :)
Hi, Thanks for providing such a useful tool. I am interested in getting a call graph (indirect call included) for large-scale programs (like Linux kernel) using Anderson analysis. However, using the raw command will exhaust the memory, resulting in a timeout.
As I am only interested in point-to results for function pointers to get the call graph, I am thinking of ways to make the icall points-to result-solving procedure more effective. believe there are many corner cases I need to handle, but the rough idea will be like this
Do you think this idea sounds valid, or is there a better way to do this?