AlexeySoshin / smali2java

Recreate Java code from Smali

Enhancement - use ANTLR to build out parser from grammar / lexer file #11

Open 8secz-johndpope opened 5 years ago

8secz-johndpope commented 5 years ago

To stabilise the parser, I suggest rebuilding some of the code to leverage the ANTLR grammar / g4 files here

If you download this wget

you can then run

java -cp antlr-4.7.2-complete.jar org.antlr.v4.Tool -Dlanguage=Go -visitor -o gen SmaliLexer.g4
java -cp antlr-4.7.2-complete.jar org.antlr.v4.Tool -Dlanguage=Go -visitor -o gen SmaliParser.g4

This will generate the following files / code (screenshot: Screen Shot 2019-06-12 at 11 27 49 pm)

You should then be able to walk through the smali file, which may reduce the out-of-bounds crashes that people (including myself) have been experiencing.

For illustration: I have successfully used grammar files to build out parsers / lexers for hundreds of languages with Swift.

I forget the entry point into the parser class (the start rule); it changes for each grammar.

Here is the Swift code to read a Java file, which you can find in the above repo.

    let textFileName = ""

    if let textFilePath = Bundle.main.path(forResource: textFileName, ofType: nil) {
        // Lex the file into tokens, then parse the tokens into a tree.
        let lexer = Java8Lexer(ANTLRFileStream(textFilePath))
        let tokens = CommonTokenStream(lexer)
        let parser = try Java8Parser(tokens)

        // compilationUnit is the entry point for this grammar.
        let tree = try parser.compilationUnit()

        // Walk the tree; Java8Walker receives the callbacks.
        let walker = ParseTreeWalker()
        let java8walker = Java8Walker()
        try walker.walk(java8walker, tree)
    } else {
        print("error occurred: cannot open \(textFileName)")
    }

The pseudocode would be

    let textFilePath = "/path/Test.smali"

    let lexer = NewSmaliLexer(ANTLRFileStream(textFilePath)) // this NewSmaliLexer exists
    let tokens = CommonTokenStream(lexer) // ?? there should be a method to do this
    let parser = try NewSmaliParser(tokens)

    let tree = try parser.compilationUnit() // the entry point will differ for smali; maybe ToStringTree?

    let walker = ParseTreeWalker() // here, as the lexer / parser reads, you can hook in to translate stuff
    let java8walker = Java8Walker()
    try walker.walk(java8walker, tree)
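
To make the "hook in to translate" idea concrete, here is a minimal, dependency-free sketch in Go of the listener pattern that a parse-tree walker follows. It does not use the ANTLR runtime or the generated Smali parser. The `Node` type, the `Listener` interface, the rule names ("class", "method", "return-void") and the emitted Java text are all hypothetical, chosen only for illustration. With real generated code you would implement the generated listener interface instead.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical stand-in for a parse-tree node.
// It is NOT the ANTLR runtime type.
type Node struct {
	Rule     string // hypothetical rule name, e.g. "class" or "method"
	Text     string // name carried by the node, if any
	Children []*Node
}

// Listener receives a callback when the walker enters and leaves a node.
type Listener interface {
	Enter(n *Node)
	Exit(n *Node)
}

// Walk visits the tree depth-first: Enter, then children, then Exit.
func Walk(l Listener, n *Node) {
	if n == nil {
		return
	}
	l.Enter(n)
	for _, c := range n.Children {
		Walk(l, c)
	}
	l.Exit(n)
}

// JavaEmitter is a listener that writes Java-like text as nodes are visited.
type JavaEmitter struct {
	out   strings.Builder
	depth int
}

// line writes s at the current indentation level.
func (e *JavaEmitter) line(s string) {
	e.out.WriteString(strings.Repeat("    ", e.depth))
	e.out.WriteString(s)
	e.out.WriteString("\n")
}

// Enter opens a block for classes and methods and emits simple statements.
// Unknown rules are ignored rather than causing a crash.
func (e *JavaEmitter) Enter(n *Node) {
	switch n.Rule {
	case "class":
		e.line("class " + n.Text + " {")
		e.depth++
	case "method":
		e.line("void " + n.Text + "() {")
		e.depth++
	case "return-void":
		e.line("return;")
	}
}

// Exit closes the block opened in Enter.
func (e *JavaEmitter) Exit(n *Node) {
	switch n.Rule {
	case "class", "method":
		e.depth--
		e.line("}")
	}
}

// Translate walks the tree with a fresh emitter and returns the emitted text.
func Translate(root *Node) string {
	e := &JavaEmitter{}
	Walk(e, root)
	return e.out.String()
}

func main() {
	// A hand-built tree standing in for what a parser would produce.
	tree := &Node{Rule: "class", Text: "Test", Children: []*Node{
		{Rule: "method", Text: "run", Children: []*Node{
			{Rule: "return-void"},
		}},
	}}
	fmt.Print(Translate(tree))
}
```

The point of the design is that the walker owns the traversal, so the translation code only reacts to nodes it understands and never indexes into raw lines itself.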

Other people have built translators using ANTLR to do this. You may need some help; when I have more time I will circle back.

AlexeySoshin commented 5 years ago

You're right, that approach would be much better, as currently I support only a very limited number of instructions. Will look into it.

8secz-johndpope commented 5 years ago

VS Code has smali syntax highlighting. Could this help?

If you surface any work in a new feature branch, I'm happy to take a look.

AlexeySoshin commented 5 years ago

@8secz-johndpope Thanks for getting back with this issue :) Took a look at it, but it's actually more confusing, since it's based on regexes. Planning to make another branch for antlr this week, per your suggestions.