org-jdraft / jdraft

Meta Representation for building Java programs to analyze, generate, refactor & run Java source code

Succinct Data Structure Representation of _type #1

Open edefazio opened 4 years ago

edefazio commented 4 years ago

This is a long-term goal: a tool that takes existing _types and _members and converts them into a succinct data structure, or takes a succinct data structure and turns it back into a _type or _member.

The conversion should be bidirectional, with the succinct data structure able to represent _types (_class, _enum, _interface, _annotation) AND their underlying _members (_initBlock, _method, _constructor, etc.).

Internally I imagine it'll be similar to bytecode, with bytes representing opcodes and linking to names of things in a lookup table.
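To make the bytecode-like idea concrete, here is a minimal sketch of how members could be written as opcode bytes that point into a name lookup table, and read back again. Everything in it is an assumption made for illustration: the class name SuccinctSketch, the NameTable helper, the three opcodes, and the one-byte name index are hypothetical and are not part of jdraft.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not jdraft API: members become (opcode, name-index) byte pairs.
class SuccinctSketch {
    // Made-up opcodes for illustration only.
    static final byte OP_CLASS = 1, OP_FIELD = 2, OP_METHOD = 3;

    /** Lookup table giving each distinct name a small integer index. */
    static final class NameTable {
        final List<String> names = new ArrayList<>();
        private final Map<String, Integer> index = new HashMap<>();

        int intern(String name) {
            Integer i = index.get(name);
            if (i == null) {
                i = names.size();
                names.add(name);
                index.put(name, i);
            }
            return i;
        }
    }

    static byte opcodeOf(String kind) {
        switch (kind) {
            case "class": return OP_CLASS;
            case "field": return OP_FIELD;
            case "method": return OP_METHOD;
            default: throw new IllegalArgumentException("unknown kind: " + kind);
        }
    }

    static String kindOf(byte opcode) {
        switch (opcode) {
            case OP_CLASS: return "class";
            case OP_FIELD: return "field";
            case OP_METHOD: return "method";
            default: throw new IllegalArgumentException("unknown opcode: " + opcode);
        }
    }

    /** Encodes each {kind, name} pair as two bytes: the opcode, then the name index. */
    static byte[] encode(String[][] members, NameTable table) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String[] member : members) {
            int nameIndex = table.intern(member[1]);
            if (nameIndex > 255) {
                throw new IllegalStateException("this sketch supports at most 256 names");
            }
            out.write(opcodeOf(member[0]));
            out.write(nameIndex);
        }
        return out.toByteArray();
    }

    /** Reverses encode(), using the same name table. */
    static String[][] decode(byte[] bytes, NameTable table) {
        String[][] members = new String[bytes.length / 2][];
        for (int i = 0; i + 1 < bytes.length; i += 2) {
            String name = table.names.get(bytes[i + 1] & 0xFF);
            members[i / 2] = new String[] { kindOf(bytes[i]), name };
        }
        return members;
    }

    public static void main(String[] args) {
        NameTable table = new NameTable();
        String[][] members = { {"class", "Point"}, {"field", "x"}, {"method", "x"} };
        byte[] encoded = encode(members, table);
        System.out.println(Arrays.toString(encoded));
        System.out.println(Arrays.deepToString(decode(encoded, table)));
    }
}
```

Because repeated names share one table entry, the field x and the method x above cost one byte each for the name. A real format would need a variable-length index and opcodes for nesting, modifiers, and statement bodies.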

The purpose of this is to make looking through code more memory efficient (i.e. I should be able to take TONS of code, like the source code of Linux, and query it easily).

Looking through ALL code in a project should be fast & memory efficient. We'll probably have MULTIPLE INDEXES outside of these types that provide information about the class internals to speed up queries (e.g. feature hashing and/or Bloom filters), and internally we'll be able to load and sequentially walk the data structure, performing analysis and transformations.

More info on succinct data structures: Succinct Data Structure, Feature Hashing, Bloom Filter
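As a rough illustration of the kind of external index mentioned above, the sketch below keeps a small Bloom filter of the identifiers used inside one type, so a query can skip types that definitely do not mention a name. The class name IdentifierBloomFilter, the sizing parameters, and the FNV-style second hash are assumptions for this example and not part of jdraft.

```java
import java.util.BitSet;

// Hypothetical sketch of a per-type identifier index; not jdraft API.
class IdentifierBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    IdentifierBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Records an identifier by setting hashCount bit positions. */
    void add(String identifier) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(identifier, i));
        }
    }

    /** False means definitely absent; true means possibly present (false positives can occur). */
    boolean mightContain(String identifier) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(identifier, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: the i-th position is derived from two base hashes.
    private int position(String s, int i) {
        int h1 = s.hashCode();
        int h2 = fnvStyleHash(s) | 1; // force odd so the step is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    // FNV-1a style hash over the string's chars, used as the second hash.
    private static int fnvStyleHash(String s) {
        int hash = 0x811C9DC5;
        for (int k = 0; k < s.length(); k++) {
            hash ^= s.charAt(k);
            hash *= 16777619;
        }
        return hash;
    }

    public static void main(String[] args) {
        IdentifierBloomFilter filter = new IdentifierBloomFilter(1024, 3);
        filter.add("println");
        filter.add("URL");
        System.out.println(filter.mightContain("println"));
        System.out.println(filter.mightContain("HashMap"));
    }
}
```

A query for all types that call println would first test each type's filter and only load and walk the serialized form of the types that answer true.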

edefazio commented 4 years ago

Generally speaking, I should be able to achieve this by just using the existing infrastructure (for JavaParser/jdraft) to walk and create a serialized form.

Also, I should consider "fully qualifying everything without imports", i.e. directly scoping all static method calls, object creations (new), and static field accesses, so there is less ambiguity and the code is more easily usable (we don't need to use the Java Symbol Solver, but rather just store the relationship directly in the AST via scope).

IF we have the classes available, it'd be nice to just use something like ClassGraph to build the call graph, so we wouldn't have to manually resolve the symbols or use the JavaSymbolSolver.

https://github.com/classgraph/classgraph/wiki

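i.e. before:

import java.net.URL;
import static java.lang.System.out;

String s = "Hello";
URL url = new URL("https://example.com");
out.println("hey");

after:

java.lang.String s = "Hello";
java.net.URL url = new java.net.URL("https://example.com");
java.lang.System.out.println("hey");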

Here are some more (related) ideas about storage/querying/indexing (Finite State Automata/Bitap): https://pvk.ca/Blog/2013/06/23/bitsets-match-regular-expressions/
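For reference, the basic exact-match form of Bitap (the shift-and variant) can be sketched in a few lines. This is a textbook version under stated assumptions, not code from the linked post and not jdraft API: the class name BitapSketch is hypothetical, and the pattern is limited to 64 characters so the whole matcher state fits in one long.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of shift-and Bitap for exact substring search.
class BitapSketch {
    /** Returns the index of the first occurrence of pattern in text, or -1 if there is none. */
    static int indexOf(String text, String pattern) {
        int m = pattern.length();
        if (m == 0) {
            return 0;
        }
        if (m > 64) {
            throw new IllegalArgumentException("this sketch supports patterns up to 64 chars");
        }
        // For each character, bit i is set when pattern.charAt(i) equals that character.
        Map<Character, Long> masks = new HashMap<>();
        for (int i = 0; i < m; i++) {
            masks.merge(pattern.charAt(i), 1L << i, (a, b) -> a | b);
        }
        long accept = 1L << (m - 1);
        long state = 0L; // bit i set: pattern[0..i] matches the text ending at the current char
        for (int j = 0; j < text.length(); j++) {
            long mask = masks.getOrDefault(text.charAt(j), 0L);
            state = ((state << 1) | 1L) & mask;
            if ((state & accept) != 0) {
                return j - m + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("System.out.println", "println"));
    }
}
```

Each text character costs one shift, one OR, and one AND, which is what makes the approach attractive for scanning large amounts of serialized code.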