das-labor / panopticon

A libre cross-platform disassembler.
https://panopticon.re
GNU General Public License v3.0

Embed a scripting language #103

Open flanfly opened 8 years ago

flanfly commented 8 years ago

Embedding a scripting language to allow structure definitions as well as ad-hoc ISA experimentation is on my list for the future (tm). There are multiple unsolved issues with this:

  1. Which language. See below.
  2. Do we want to serialize the scripts in the on-disk format or just save the results?
  3. Security. In case we save the scripts, how do we prevent them from attacking the user?
  4. Where does the interpreter live? Is it a pure frontend issue or will it be part of the Panopticon library?

Languages I looked at:

Javascript: Available through the already present Qt dependency. Would work seamlessly with the QML frontend. Security isn't much of an issue as JS can't access the system. On the downside, this would fuse front- and backend (or at least make QtScript a dependency of the library part). Also, there are no libraries/modules around reverse engineering we could build upon.

Ruby: There are multiple Ruby interpreters available. Some, like mruby, are fairly simple and easy to embed in existing applications. Ruby has lots of libraries that could help make Panopticon more powerful, like Metasploit and BinData. It's an open question how to allow communication between front- and backend. Also, I don't know how Ruby does sandboxing. This is mandatory as Ruby can open files and sockets.

Pure and Smalltalk: Great little languages that combine the downsides of Ruby and Javascript with none of the upsides.

Python: I haven't had the time yet to look into this. What's interesting is that many other reverse engineering projects like IDAPython and Medusa use it. Probably similar to Ruby.

hardliner66 commented 8 years ago

Just some random input: If you plan to use the script engine only for the disassembly, you should definitely go for something like JS/Python.

If you plan to add a debugging engine, maybe something like Jancy could be used, because it has the ability to work with pointers.

If you want to make it more usable for reversing challenges (e.g. CTFs), an embedded constraint solver would definitely be nice to have (binary analysis with a constraint solver).

But I think you shouldn't try to solve all of these problems at once. It would definitely be a better choice to create the necessary abstractions and provide a plugin system, so that the needed scripting environments could be created on top of that.

jeandudey commented 7 years ago

What about using cretonne and having something like this:

Some script language -> cretonne -> JIT compilation

sphinxc0re commented 7 years ago

How about Lua? Lua's ability to be easily embedded into arbitrary projects, its speed, and its flexibility regarding language paradigms make it perfect for this.

flanfly commented 7 years ago

Hey @jeandudey, I had a look at cretonne, but it says "This is a work in progress that is not yet functional.". That's not really confidence-inducing. Also, I'm mostly concerned about the language itself instead of the implementation.

I do not have much experience w/ Lua. I know it's meant for embedding, so that's good. I do not know the state of its Rust bindings or what the library front looks like.

hardliner66 commented 7 years ago

Just start with a plugin API. Create multiple APIs for disassembly, GUI, debugging, etc. Export the needed functions with extern "C" so other languages can be used.
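To make the extern "C" suggestion concrete, here is a minimal sketch of what a C-ABI surface over an opaque handle could look like in Rust. Everything in it is an assumption made for illustration: the `Program` type, its contents, and the `pan_*` function names are hypothetical and are not taken from Panopticon's actual code.

```rust
/// Hypothetical opaque handle; not Panopticon's real program type.
pub struct Program {
    functions: Vec<u64>, // entry-point addresses recorded so far
}

// Each function below uses the C calling convention. To export them as
// unmangled symbols from a cdylib, one would additionally mark each with
// `#[no_mangle]` (spelled `#[unsafe(no_mangle)]` in the 2024 edition).

/// Allocate a new, empty program and hand ownership to the caller.
pub extern "C" fn pan_program_new() -> *mut Program {
    Box::into_raw(Box::new(Program { functions: Vec::new() }))
}

/// Record a function entry point. Returns false if `prog` is null.
pub extern "C" fn pan_program_add_function(prog: *mut Program, addr: u64) -> bool {
    // SAFETY: the caller must pass null or a pointer obtained from
    // `pan_program_new` that has not been freed yet.
    match unsafe { prog.as_mut() } {
        Some(p) => {
            p.functions.push(addr);
            true
        }
        None => false,
    }
}

/// Number of recorded functions; 0 for a null handle.
pub extern "C" fn pan_program_function_count(prog: *const Program) -> usize {
    // SAFETY: same contract as `pan_program_add_function`.
    match unsafe { prog.as_ref() } {
        Some(p) => p.functions.len(),
        None => 0,
    }
}

/// Release a handle created by `pan_program_new`. Null is ignored.
pub extern "C" fn pan_program_free(prog: *mut Program) {
    if !prog.is_null() {
        // SAFETY: the pointer came from `Box::into_raw` and is freed only once.
        drop(unsafe { Box::from_raw(prog) });
    }
}
```

Keeping the handle opaque means a binding for language X only needs to pass a pointer around and never depends on the Rust-side layout.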

After that, implement some simple plugin to get a basic script language (e.g.: https://github.com/jonathandturner/rhai).

If someone now wants language X to be supported, they are free to write a plugin which exposes the API to language X.

Now the 4 points mentioned become obsolete:

1) Any language can be supported. Users can use their favorite if a plugin exists, or create their own plugin to support it.

2) I'm not entirely sure what you mean by that. A script engine doesn't serialize scripts; normally they are just stored on disk. Anyway, with a plugin system, this becomes an implementation detail.

3) Same as 2: what do you mean by "attacking the user"? I think you mean something like: is it possible for a script to delete a user's files? In that case, this is not the responsibility of the script engine. It is the responsibility of the users to verify their scripts. You won't get good security without sacrificing much of the usability anyway (e.g. file I/O to create temp files or read configs, etc.).

4) Inside the plugin for the corresponding API. This shouldn't be a concern of the host application.

I can help you design a basic API if you want.

flanfly commented 7 years ago

I don't want a plugin infrastructure. A project that does this already exists: radare2. There are good reasons to have plugins but I have decided to leave this feature out in favor of better integration.

The purpose of the scripting language is to allow ad-hoc adaptations for specific binaries. For example, writing a disassembler/lifter for virtualized code, reversing custom-made obfuscations, or adding a parser for a proprietary file format. For this, the scripting language must be well integrated into the rest of the application via a REPL inside the GUI, and scripts must be saved alongside the disassembly. When you support 10 languages, suddenly Panopticon save files only work when you have the favorite scripting language of the file's author installed. Also, this is why scripts must be serializable and (ideally) sandboxed. I'm willing to ignore the sandboxing, but there must be a way for the user to disable execution of untrusted scripts. Otherwise we end up in a situation like we have with Word macros.
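A rough sketch of the "scripts travel with the save file, but do not run unless the user allows it" requirement could look like the following. The `Project`, `SavedScript`, and `Trust` types are hypothetical names invented for this illustration, and the actual script execution is left to a caller-supplied closure, since no interpreter has been chosen.

```rust
/// Hypothetical script record stored alongside the disassembly.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SavedScript {
    pub name: String,
    pub source: String,
}

/// Whether the user has approved running the scripts in this file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trust {
    Untrusted,
    Trusted,
}

/// Error returned when scripts are present but execution is disabled.
#[derive(Debug, PartialEq, Eq)]
pub struct ScriptsDisabled;

/// Hypothetical in-memory view of a save file.
pub struct Project {
    pub scripts: Vec<SavedScript>,
    trust: Trust,
}

impl Project {
    /// Anything loaded from disk starts out untrusted, whatever it contains.
    pub fn from_disk(scripts: Vec<SavedScript>) -> Self {
        Project { scripts, trust: Trust::Untrusted }
    }

    /// Meant to be called only after explicit user confirmation,
    /// for example from a prompt in the GUI.
    pub fn mark_trusted(&mut self) {
        self.trust = Trust::Trusted;
    }

    /// Pass every script to `run`, or refuse if the file is untrusted.
    /// On success, returns the number of scripts that were handed to `run`.
    pub fn run_scripts<F: FnMut(&SavedScript)>(&self, mut run: F) -> Result<usize, ScriptsDisabled> {
        if self.trust != Trust::Trusted {
            return Err(ScriptsDisabled);
        }
        for script in &self.scripts {
            run(script);
        }
        Ok(self.scripts.len())
    }
}
```

The scripts remain readable as plain data while untrusted, so a user can inspect them before opting in.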

hardliner66 commented 7 years ago

You can (and should) separate the save files from the rest. Just provide an API for that.

I understand the purpose of the scripting language, but there are already libraries which can do some of these things. And with "only" scripting support, you are limiting the usage of these libraries.

Same goes for performance-critical code. If I want to add some functionality which is CPU intensive, I might want to precompile something in C to get the job done. If I have a good API, I can do this. If not, I'm limited by the capabilities of the script engine.

> For this the scripting language must be well integrated into the rest of the application via a REPL inside the GUI and scripts must be saved alongside the disassembly.

No, it doesn't. If you have a good interface for the plugins, then that's all that is needed, and the plugins can take care of the integration of the engine.

> I'm willing to ignore the sandboxing, but there must be a way for the user to disable execution of untrusted scripts.

This is a disassembler. Anyone who uses it should know not to execute arbitrary scripts without looking into them. There is not really a need for this.

> Otherwise we end up in a situation like we have with Word macros.

Except that Word provides an interface, so that it can use and be used from other programming languages. You can even write your own custom script engine for it if you want. It isn't a really good system (and slow as hell because of COM), but I can use it as I like and integrate features that the developers didn't even think were possible.

sphinxc0re commented 7 years ago

Lua makes it easy to blacklist or whitelist functions, so you get a kind of sandboxing.
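The whitelisting idea is not specific to Lua. As a language-agnostic sketch, written in plain Rust without any Lua bindings, the host can keep a table of functions it is able to expose and consult an allowlist before dispatching a call from a script. The `Sandbox` type and the `add` and `open_file` host functions below are hypothetical stand-ins, not part of any real binding or of Panopticon.

```rust
use std::collections::{HashMap, HashSet};

/// Signature of a function the host can expose to scripts (an assumption
/// made for this sketch; a real binding would pass richer values).
pub type HostFn = fn(&[i64]) -> i64;

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    /// No host function with that name exists.
    Unknown,
    /// The function exists but is not on the allowlist.
    Denied,
}

pub struct Sandbox {
    functions: HashMap<String, HostFn>,
    allowed: HashSet<String>,
}

impl Sandbox {
    pub fn new() -> Self {
        Sandbox { functions: HashMap::new(), allowed: HashSet::new() }
    }

    /// Make a host function known. It stays uncallable until allowed.
    pub fn register(&mut self, name: &str, f: HostFn) {
        self.functions.insert(name.to_string(), f);
    }

    /// Put a function name on the allowlist.
    pub fn allow(&mut self, name: &str) {
        self.allowed.insert(name.to_string());
    }

    /// Dispatch a call coming from a script, enforcing the allowlist.
    pub fn call(&self, name: &str, args: &[i64]) -> Result<i64, CallError> {
        let f = *self.functions.get(name).ok_or(CallError::Unknown)?;
        if !self.allowed.contains(name) {
            return Err(CallError::Denied);
        }
        Ok(f(args))
    }
}

/// Harmless helper: sums its arguments.
fn host_add(args: &[i64]) -> i64 {
    args.iter().sum()
}

/// Stand-in for a dangerous capability; it does not touch the file system.
fn host_open_file(_args: &[i64]) -> i64 {
    -1
}

/// Example configuration: both functions exist, only `add` is allowed.
pub fn default_sandbox() -> Sandbox {
    let mut sb = Sandbox::new();
    sb.register("add", host_add);
    sb.register("open_file", host_open_file);
    sb.allow("add");
    sb
}
```

An allowlist fails closed: a newly registered host function is unreachable from scripts until someone explicitly allows it, which is usually the safer default compared to a blacklist.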