The parser design is generic, and it should be able to support any kind of type and event.
The current implementation is incomplete, with a focus on supporting the types and events generated by async-profiler. The implementation can be easily extended to support more types and events; see the Design section for details.
While the parser is built on top of the io.Reader interface, it doesn't process the input sequentially: the Metadata and Checkpoint events are processed before the rest of the events. This means that whole chunks are stored in memory. Chunks are currently processed sequentially, but this is an implementation detail; they may be processed concurrently in the future.
A reader package takes care of the wire-level details, like (un)compressed integers and different string encodings (not all of them are currently supported).
A parser package processes the chunks and returns the events of each of them. In order to do so, it processes the Metadata event and uses that information to parse the rest of the events. The constant pool is parsed in two passes: in the first pass the inline data is processed and the recursive constant pool references are left unprocessed. On the second pass the constant pool references are resolved.
Finally, the rest of the events are processed, using the resolved constant pool data wherever constant pool references appear.
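The two-pass constant pool strategy described above can be sketched as follows. The type and function names here (`poolEntry`, `ref`, `resolve`) are hypothetical, not the parser's actual types: pass 1 stores each entry's inline data and records references to other pool entries; pass 2 replaces those references with the already-parsed entries.

```go
package main

import "fmt"

// ref is a reference to another constant pool entry, left unprocessed
// during pass 1.
type ref struct{ id int64 }

type poolEntry struct {
	inline string       // data parsed directly in pass 1
	refs   []ref        // unresolved references recorded in pass 1
	deps   []*poolEntry // filled in by pass 2
}

// resolve implements pass 2: every recorded reference must point at an
// entry that pass 1 already parsed.
func resolve(pool map[int64]*poolEntry) error {
	for id, e := range pool {
		for _, r := range e.refs {
			dep, ok := pool[r.id]
			if !ok {
				return fmt.Errorf("entry %d references unknown constant %d", id, r.id)
			}
			e.deps = append(e.deps, dep)
		}
	}
	return nil
}

func main() {
	pool := map[int64]*poolEntry{
		1: {inline: "java.lang.String", refs: []ref{{id: 2}}},
		2: {inline: "java.lang"},
	}
	if err := resolve(pool); err != nil {
		panic(err)
	}
	fmt.Println(pool[1].deps[0].inline) // java.lang
}
```

Because all entries are in memory before pass 2 runs, resolution order doesn't matter, even for forward or mutually recursive references.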
The parser relies on finding an implementation for each type and event that needs to be parsed. These implementations differ slightly between types and events:

- Parseable: interface that needs to be implemented by types and events.
- Resolvable: interface that needs to be implemented by types.

To add support for new types and events, add a new data type that satisfies the corresponding interfaces, and then include it in either the types or events tables (see types.go and event_types.go).
The parser API is pretty straightforward:

```go
func Parse(r io.Reader) ([]Chunk, error)
```

Parse returns a slice of chunks; for each chunk, call chunk.Next and then read chunk.Event.
It should be used like this:

```go
chunks, err := parser.Parse(reader)
if err != nil {
	panic(err)
}
for _, chunk := range chunks {
	for chunk.Next() {
		// chunk.Event holds the current event; it may be reused, so
		// copy it if you need it after another call to Next.
		_ = chunk.Event
	}
	if err := chunk.Err(); err != nil {
		panic(err)
	}
}
```
Check the main package for further details. It can also be used to validate that the parser works with your data and to get some basic stats.
The parser is still at an early stage, and you should use it at your own risk (bugs are expected). The current (non-exhaustive) list of pending work includes:
Help with these pending tasks is more than welcome :)