Open martijnthe opened 7 years ago
Yes, these features are definitely needed. To make things simple probably the debugger itself should serialize the data rather than adding new code paths. Perhaps we could create a serializer debugger client.
I think the most difficult aspect will be losing the byte code execution from ROM since enabling a breakpoint requires writable memory.
@zherczeg To make things simple probably the debugger itself should serialize the data rather than adding new code paths
Sorry, I don't understand what "adding new code paths" means here.
The current debugger design sends everything to the client through a connection, and the client organizes the data. If we wanted to serialize data into a file, we would need a separate implementation, since the file obviously needs some kind of structure (format). Simply dumping data would probably cause issues. If we would have a serializer client (a python script) it could organize the data, and write it into a file.
Consider the following source code:
function f()
{
  function g() { return 1; }
  function g() { return 2; }
  function g() { return 3; }
  function g() { return 4; }
  return g(); // result is 4
}
Currently the debugger (parser) sends all g functions to the client, and also sends byte-code-free for the first three g functions, since they are unused and the memory can be freed. The parser is not optimized for supporting such badly written code or for detecting unused functions early.
In case of a file output we probably would need to construct a tree in memory with nodes, and serialize this tree after the parsing is done. This would require a lot of new code (managing the tree, inserting/ deleting nodes, etc.).
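To illustrate the bookkeeping this implies, here is a minimal sketch of a serializer that records every function the parser reports and drops the ones that are later freed, so that only the live functions would be written out. It uses a flat table rather than the tree mentioned above, which is a simplification. All names (`func_record_t`, `debug_info_add`, `debug_info_free`, `debug_info_live_count`) and the fixed table size are assumptions made for this example, not existing JerryScript code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECORDS 16 /* arbitrary limit for this sketch */

/* Hypothetical record for one parsed function. */
typedef struct
{
  uintptr_t byte_code_id; /* identifier the engine reported for the function */
  bool live;              /* false once a "byte code free" message arrives */
} func_record_t;

typedef struct
{
  func_record_t records[MAX_RECORDS];
  size_t count;
} debug_info_t;

/* Called when the parser reports a new function. Returns false when full. */
static bool
debug_info_add (debug_info_t *info_p, uintptr_t byte_code_id)
{
  if (info_p->count >= MAX_RECORDS)
  {
    return false;
  }

  info_p->records[info_p->count].byte_code_id = byte_code_id;
  info_p->records[info_p->count].live = true;
  info_p->count++;
  return true;
}

/* Called when the parser reports that a function's byte code was freed. */
static void
debug_info_free (debug_info_t *info_p, uintptr_t byte_code_id)
{
  for (size_t i = 0; i < info_p->count; i++)
  {
    if (info_p->records[i].live && info_p->records[i].byte_code_id == byte_code_id)
    {
      info_p->records[i].live = false;
      return;
    }
  }
}

/* Number of functions that would actually be written to the file. */
static size_t
debug_info_live_count (const debug_info_t *info_p)
{
  size_t live = 0;

  for (size_t i = 0; i < info_p->count; i++)
  {
    live += info_p->records[i].live ? 1 : 0;
  }
  return live;
}
```

For the source above, the serializer would see five functions (four definitions of g plus f) and three free messages, leaving two records to write.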
@zherczeg I see: we don't change the debugger code, but let a specific client generate the serialized debug_info. Am I right?
If we would have a serializer client (a python script) it could organize the data, and write it into a file.
I don't really like the idea of having a separate client/python script for this. Separating the parsing and gathering of debug info makes things complicated to use and build upon. Also, I don't like the idea of having an additional dependency on Python to be able to generate debug info for snapshots.
For context, a story from the past: at Pebble, to generate snapshots, we used Emscripten to cross-compile a version of the JerryScript CLI to a stand-alone JavaScript program. This was very useful because it made this "snapshot compiler" self-contained and more or less platform-agnostic. Therefore it was very easy to run the "snapshot compiler" in a variety of environments (Node.js, browser, iOS JavaScriptCore, etc.).
Adding a separate python script only to generate the debugging info would block the use case I just described.
Idea: a serialization "client" could also be written in C and have alternative implementations of the jerry-debugger-ws.h interfaces. This alternative implementation would do the gathering, organizing of the data and finally the serialization. (Renaming the interfaces and perhaps adding a thin abstraction layer there would probably make sense at that point.)
Re. redefining functions: nice edge case indeed.
@martijnthe yes, that is a good idea. A special debugger server port could do this, one which processes the data rather than transmitting it.
I think the most difficult aspect will be losing the byte code execution from ROM since enabling a breakpoint requires writable memory.
I understand that the current implementation requires this, because enabling a breakpoint is implemented by flipping a "disabled breakpoint" opcode to an "enabled breakpoint" opcode.
That said, this could be implemented differently, no? I can imagine an implementation where the bytecode is read-only, a list of breakpoints info is kept in RAM and the VM compares the program counter against that list.
Stepping line-by-line is probably a bit more involved, but not impossible to do. For comparison, I know that gdb has an implementation for "step line" where, under the hood, the gdb client will repeatedly send "step instruction" to the target, until the program counter matches what the client thinks is the program counter at the beginning of the next line.
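The "step line" approach described above can be sketched in a few lines. This is a toy model, not JerryScript or gdb code: the target is simulated by a line table with one entry per instruction, and `toy_target_t`, `toy_step_instruction` and `toy_step_line` are hypothetical names. In a real client, each call to the single-step function would be a request to the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy target: line_of_offset[i] is the source line of instruction i. */
typedef struct
{
  const uint16_t *line_of_offset;
  size_t length; /* number of instructions */
  size_t pc;     /* index of the next instruction to execute */
} toy_target_t;

/* Stand-in for a "step instruction" request to the target. */
static void
toy_step_instruction (toy_target_t *target_p)
{
  if (target_p->pc < target_p->length)
  {
    target_p->pc++;
  }
}

/* Single-step until the program counter maps to a different source line
 * or the program ends. Returns the number of single steps issued. */
static size_t
toy_step_line (toy_target_t *target_p)
{
  if (target_p->pc >= target_p->length)
  {
    return 0;
  }

  const uint16_t start_line = target_p->line_of_offset[target_p->pc];
  size_t steps = 0;

  do
  {
    toy_step_instruction (target_p);
    steps++;
  }
  while (target_p->pc < target_p->length
         && target_p->line_of_offset[target_p->pc] == start_line);

  return steps;
}
```

The returned step count makes the cost visible: one "step line" takes as many single steps as the line has instructions.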
Hm, yes, maintaining a list of virtually enabled breakpoints could be possible. The number of such enabled breakpoints would obviously be limited.
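A minimal sketch of such a list, assuming a small fixed-capacity table in RAM that stores the addresses of enabled breakpoints while the byte code itself stays read-only. The type and function names and the capacity of 8 are made up for illustration. The VM would call `breakpoint_is_enabled` only when it reaches a (disabled) breakpoint opcode, so ordinary instructions pay nothing.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENABLED_BREAKPOINTS 8 /* arbitrary limit for this sketch */

/* Hypothetical table of enabled breakpoints, kept in RAM. */
typedef struct
{
  const uint8_t *addresses[MAX_ENABLED_BREAKPOINTS];
  size_t count;
} breakpoint_table_t;

/* Lookup done by the VM when it executes a breakpoint opcode. */
static bool
breakpoint_is_enabled (const breakpoint_table_t *table_p, const uint8_t *byte_code_p)
{
  for (size_t i = 0; i < table_p->count; i++)
  {
    if (table_p->addresses[i] == byte_code_p)
    {
      return true;
    }
  }
  return false;
}

/* Enables a breakpoint. Returns false if the table is already full. */
static bool
breakpoint_enable (breakpoint_table_t *table_p, const uint8_t *byte_code_p)
{
  if (breakpoint_is_enabled (table_p, byte_code_p))
  {
    return true;
  }
  if (table_p->count >= MAX_ENABLED_BREAKPOINTS)
  {
    return false;
  }

  table_p->addresses[table_p->count++] = byte_code_p;
  return true;
}

/* Disables a breakpoint by moving the last entry into its slot. */
static void
breakpoint_disable (breakpoint_table_t *table_p, const uint8_t *byte_code_p)
{
  for (size_t i = 0; i < table_p->count; i++)
  {
    if (table_p->addresses[i] == byte_code_p)
    {
      table_p->addresses[i] = table_p->addresses[--table_p->count];
      return;
    }
  }
}
```

A linear scan is enough for a handful of entries; a larger limit would call for a sorted array or a hash set.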
Waiting for a network round trip after each byte code is very slow. Also, the debugger should not slow down the execution when it is part of the binary but not enabled at runtime.
Waiting for a network round trip after each byte code is very slow.
This would only happen for "step line". But sure, I mentioned it just to illustrate there are existing solutions that have proven to be usable in practice.
Also, the debugger should not slow down the execution when it is part of the binary but not enabled at runtime.
Agree. Disabling the VM's debugger capabilities can be done in the same way that it is done right now. I don't think the idea of a list of breakpoints implies a change to that.
One more thing: when debugging is enabled, certain optimizations are disabled (memory consumption is bigger). Hence, ideally there should be two byte code instances: one with debugging support and one without it.
@zherczeg what are your plans w.r.t. these requests? Are you working on any of this? If not, @jiangzidong and I will start working on it soon.
You can work on this.
cc @HBehrens can you share your thoughts about pros/cons using the sourcemap format?
@zherczeg
I think the most difficult aspect will be losing the byte code execution from ROM since enabling a breakpoint requires writable memory.
The current implementation of snapshot_load_compiled_code (called by jerry_exec_snapshot) will copy the snapshot in ROM into the jerry heap, so the bytecode is still writable in the VM. Am I right?
No. If you don't pass the copy flag, it only copies the header and creates a special byte code, which points to the start of the actual byte code in ROM. The key feature of snapshots is that the byte code itself runs from ROM.
Oh, sorry that I missed that.
jerry-snapshot.c L394
memcpy (instructions_p + code_size + 1, &real_bytecode_p, sizeof (uint8_t *));
vm.c L678
memcpy (&byte_code_p, byte_code_p + 1, sizeof (uint8_t *));
Thanks for your explanation.
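The pattern in the two memcpy lines quoted above can be shown in isolation: a small stub in RAM holds a marker opcode followed by the raw bytes of a pointer to the real byte code in ROM, and the interpreter follows that pointer when it sees the marker. This is a toy reconstruction, not the actual snapshot loader; `TOY_OP_EXTERNAL`, `toy_write_redirect` and `toy_follow_redirect` are invented names. memcpy is used for the pointer because the bytes after the opcode are not necessarily aligned for a pointer-sized access.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical opcode meaning "continue at the pointer stored right after me". */
#define TOY_OP_EXTERNAL 0xFF

/* Writes the redirect stub into a RAM buffer and returns its size in bytes.
 * The buffer must hold at least 1 + sizeof (const uint8_t *) bytes. */
static size_t
toy_write_redirect (uint8_t *ram_p, const uint8_t *rom_byte_code_p)
{
  ram_p[0] = TOY_OP_EXTERNAL;
  memcpy (ram_p + 1, &rom_byte_code_p, sizeof (const uint8_t *));
  return 1 + sizeof (const uint8_t *);
}

/* Interpreter side: if the current opcode is the redirect marker,
 * load the embedded pointer and continue from there. */
static const uint8_t *
toy_follow_redirect (const uint8_t *byte_code_p)
{
  if (*byte_code_p == TOY_OP_EXTERNAL)
  {
    memcpy (&byte_code_p, byte_code_p + 1, sizeof (const uint8_t *));
  }
  return byte_code_p;
}
```

Only the stub lives in RAM; the instructions it points to are never copied, which is why they cannot be patched to enable a breakpoint.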
@jiangzidong and I just had a quick chat to identify pieces of work to enable debugging snapshots (most likely not the complete list...;) ):
Dump to file: implement a new jerry_debugger_send in jerry-debugger-dump. This alternative implementation will write the messages to a local file instead of sending them to a client. The format of this file will closely follow the protocol messages that are sent during parsing.
As a work-around for bytecode needing to be in writable memory in order for it to be debugged, we'll pass copy_bytecode=true to jerry_exec_snapshot() before running. See the breakpoints item below for the actual fix.
Add a way to load the debug info from a dump file into the debugger client. Q: should this file loading capability be added to the client(s)? I'm leaning to "yes", but having 2 clients is 2x the work... Thoughts on reducing this duplication? @zherczeg @polaroi8d? I wonder if it makes sense to remove the Python one, extract the .js out of the HTML into a separate file and add a Node.js based CLI client (instead of the Python one) that uses the same .js that the HTML uses.
Add a version to the debugging protocol (and file format), so that we can at least detect a version mismatch and tell the user of the debugger to use a debugger client that supports version XYZ instead. Later we'll worry about backward compatibility of debugger clients for older JerryScript debug protocol versions. I think it's an important thing to tackle, but it's probably better to address this later, when the direction of the protocol is clearer.
Mapping: need to move away from using physical addresses of bytecode in the debugging information. This does not work when the physical addresses are not known when the debug info is generated. Solution direction: TBD.
Breakpoints requiring bytecode in writable memory. Solution: maintain a linked list of enabled breakpoints. When the VM encounters a disabled breakpoint instruction, check if it is in the list of enabled breakpoints.
Longer term: bridge to Chrome Debugger Protocol. Existing IDEs and tooling support connecting to debug servers that talk this protocol, so I think it makes sense to support it too.
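As a rough illustration of the "dump to file" and versioning items, the sketch below writes a small header (magic plus version byte) followed by length-prefixed messages, and shows how a client could reject a file with an unexpected version before reading any message. The file layout, the `JDBG` magic, the version number and all function names are assumptions for this example; the real jerry_debugger_send interface and message format are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DUMP_MAGIC "JDBG" /* hypothetical 4-byte file magic */
#define DUMP_VERSION 1u   /* hypothetical format/protocol version */

/* Writes the file header: 4 magic bytes followed by 1 version byte. */
static bool
dump_write_header (FILE *file_p)
{
  uint8_t header[5];

  memcpy (header, DUMP_MAGIC, 4);
  header[4] = DUMP_VERSION;
  return fwrite (header, 1, sizeof (header), file_p) == sizeof (header);
}

/* Stand-in for a file-backed "send": 1 length byte, then the message bytes. */
static bool
dump_send (FILE *file_p, const uint8_t *message_p, uint8_t size)
{
  return fputc (size, file_p) != EOF
         && fwrite (message_p, 1, size, file_p) == size;
}

/* Client side: returns false on a bad magic or a version mismatch. */
static bool
dump_check_header (FILE *file_p)
{
  uint8_t header[5];

  if (fread (header, 1, sizeof (header), file_p) != sizeof (header))
  {
    return false;
  }
  return memcmp (header, DUMP_MAGIC, 4) == 0 && header[4] == DUMP_VERSION;
}

/* Reads the next message into buffer_p (at least 255 bytes).
 * Returns its length, or -1 at end of file or on a truncated message. */
static int
dump_read_message (FILE *file_p, uint8_t *buffer_p)
{
  const int size = fgetc (file_p);

  if (size == EOF)
  {
    return -1;
  }
  if (fread (buffer_p, 1, (size_t) size, file_p) != (size_t) size)
  {
    return -1;
  }
  return size;
}
```

Keeping each record identical to a protocol message means a client could feed the file through the same message handlers it uses for a live connection.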
I would like to be able to use the debugger with code loaded from snapshots. Because of device limitations, it may not be possible to parse the source on the device itself. The current debugging approach assumes that the device itself is capable of parsing the JS source.
To support debugging of snapshots, I think we need: