Thanks for asking! You can find documentation of the data structures in the index in a big comment in include/libtarmac/index_ds.hh. See https://github.com/ARM-software/tarmac-trace-utilities/blob/main/include/libtarmac/index_ds.hh.
Yes, the index file really does directly store all the register and memory contents from the whole trace. It's done by a system of binary trees that share nodes with each other as much as possible: the "Node sharing" section of that comment goes into more detail.
The text of the Tarmac trace is not stored in the index file. The index will refer to line numbers and byte positions in the trace file, but doesn't make a complete copy of the file contents. So the browser needs to have access to both files in order to display the trace text on screen. But it doesn't need to refer to the trace file at all in order to figure out what the registers and memory looked like at a given time, or to identify when a particular memory location was last modified.
Hi Simon, thanks for the reply and sorry to bother you again! I'm looking at your implementation more deeply. Basically there are seqtree, bypctree, memtree and memsubtree in your code. It's a bit easier to understand seqtree and bypctree, which are sorted by trace event order and by PC. I tried the "--seqtree" option with tarmac-indextool, and the dumped physical structure of the seqtree helps me understand it pretty well. However, it's still a bit difficult for me to understand the following things:
I know it's pretty demanding, but if you're free, I'd really appreciate it if you could help clear up my confusion using the example below. I created a fake trace with a few events (register and memory accesses, including some overwriting). Could you kindly explain the whole flow: how the trees look during index generation, and what happens when I use the browser and jump to the last line (where the memory overwrite happens)? How are the latest register values and memory contents presented so quickly?
0 tic ES (PC=0x1000)
R X0 0010 //first time access X0
0 tic ES (PC=0x1004)
ST 0x8000 8000 //first time access mem[0x8000]
0 tic ES (PC=0x1008)
R X1 0011
0 tic ES (PC=0x100C)
R X2 0012
0 tic ES (PC=0x1010)
R X3 0013
0 tic ES (PC=0x1014)
R X4 0014
0 tic ES (PC=0x1018)
R X0 00AA //overwrite X0=0x00AA
0 tic ES (PC=0x101C)
R X5 0015
0 tic ES (PC=0x1020)
ST 0x8010 8010
0 tic ES (PC=0x1024)
ST 0x8020 8020
0 tic ES (PC=0x1028)
ST 0x8030 8030
0 tic ES (PC=0x102C)
ST 0x8040 8040
0 tic ES (PC=0x1030)
ST 0x8050 8050
0 tic ES (PC=0x1034)
ST 0x8000 BEEF //overwrite mem[0x8000]=0xBEEF
That's right, there are many memtrees: one for each point in the trace that the system has to be able to treat separately (i.e. one per seqtree node).
Imagine if you were iterating through the events in the trace one by one, and making a binary tree of the contents of memory, consisting of (start address, length, data) records, sorted by address. Every time you saw a memory write, you'd insert a new record into that tree (and maybe delete one that was already there, or break up an existing one into multiple pieces). So the tree would gradually evolve as you went along.
If you didn't need to rewind time, that would be all you'd have to do. When you'd processed the trace up to time T, your memtree would tell you everything about the memory at time T, and if the user asked "What was at address X?", you'd search for that address in the memtree, and find a record telling you what was there.
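To make that concrete, here's a minimal sketch in C++ of the no-rewind case (not the real libtarmac code; the class name and the per-byte map are just for illustration, whereas the real memtree stores (start address, length, data) records and splits/merges them as needed):

```cpp
// Minimal sketch of the "no rewind" case: one mutable map describing the
// latest known contents of memory, updated on every write event. This is
// an illustration only, not libtarmac's actual data structure.
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

class SimpleMemImage {
    std::map<uint64_t, uint8_t> bytes; // address -> last value written there
public:
    // Called for every memory-write event as we walk forward through the trace.
    void write(uint64_t addr, const std::vector<uint8_t> &data) {
        for (std::size_t i = 0; i < data.size(); ++i)
            bytes[addr + i] = data[i]; // overwrite any earlier record
    }
    // "What was at address X?" -- only valid for the time we've reached so far.
    std::optional<uint8_t> read(uint64_t addr) const {
        auto it = bytes.find(addr);
        if (it == bytes.end())
            return std::nullopt; // never seen a write to this byte
        return it->second;
    }
};
```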
But we do need to rewind time, so here's what we do.
At the point where the indexer has got the memtree into the right state to store in a particular seqtree node, we declare that all the memtree nodes we've so far created are locked: nothing must modify them ever again. So the next time we need to insert something into the memtree, whenever the tree insertion algorithm wants to modify a locked node, it instead makes a copy of it, and modifies the copy instead. And that means the node's parent must also be copied (so that the new version can refer to the modified child node instead of the old one), and so on, back to the root of the tree.
So you get a new tree root containing the modified data – but the old tree root is still valid, and still points at the old data.
And since each tree operation takes only O(log n) time, it must only modify O(log n) nodes. (It wouldn't have time to modify any more than that!) So most of the nodes of each version of the memtree will be shared with the previous memtree, and only a handful of them change. But one of those is always the tree root.
Since new nodes are always appended to the end of the index file, we don't need to implement this "locking" by having a flag in each tree node. We just do it by file offset: take a note of the current file size, and then memtree nodes before that point must be copied instead of modifying them in place.
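Here's a hedged, self-contained sketch of that copy-on-write ("path copying") mechanism, again not the actual libtarmac code: an array of nodes stands in for the index file, freeze() stands in for taking a note of the current file size, and the tree is left unbalanced to keep the toy short (the real ones aren't):

```cpp
// Sketch of path copying with "locking by offset". Nodes live in a growable
// array standing in for the index file; any node created before the last
// freeze() must be copied rather than modified in place.
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t NONE = SIZE_MAX; // "no child"

struct Node {
    uint64_t key;            // e.g. start address of a memory record
    uint64_t value;          // stand-in for the record's payload
    std::size_t left, right; // array indices of children, or NONE
};

struct PersistentTree {
    std::vector<Node> nodes; // the "index file": only ever appended to
    std::size_t locked = 0;  // nodes below this offset are frozen

    void freeze() { locked = nodes.size(); }

    // Insert or update a key, returning the root index of the new version
    // of the tree. Old roots remain valid and unchanged.
    std::size_t insert(std::size_t root, uint64_t key, uint64_t value) {
        if (root == NONE) {
            nodes.push_back({key, value, NONE, NONE});
            return nodes.size() - 1;
        }
        std::size_t n = root;
        if (root < locked) {
            // Frozen node: append a copy and modify that instead.
            nodes.push_back(nodes[root]);
            n = nodes.size() - 1;
        }
        if (key < nodes[n].key) {
            std::size_t child = insert(nodes[n].left, key, value);
            nodes[n].left = child;
        } else if (key > nodes[n].key) {
            std::size_t child = insert(nodes[n].right, key, value);
            nodes[n].right = child;
        } else {
            nodes[n].value = value;
        }
        return n;
    }
};
```

Each time the indexer reaches a point it wants to be able to rewind to, it would call something like freeze() and record the current root in the corresponding seqtree node; every old root stays valid and still describes memory as it was at that earlier time.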
The memsubtree system is there to handle memory reads.
If 0x1234 is written to a location in memory at time T, it means that the memtree must show 0x1234 in that location from time T onwards, and something else before time T. That's easy, in the scheme I've just described.
But what if the first reference to a particular address in the trace consists of a memory read at time T, which observes the value 0x1234 at that address? In that situation, it must have contained 0x1234 all along. So you'd like the memory index to show 0x1234 at that address even before time T, because surely the value was there already, you just hadn't found out yet.
We can't do that by going back and modifying the existing memtrees. There might be thousands of memtree roots that all need to be modified -- it would be far too slow, and also confusing.
So, instead, when a record in the memtree needs to say "we don't know what's in this address range yet", it does it by making a second-layer tree and pointing at the root of that. And that secondary tree is modifiable. So the indexer handles a memory read by checking to see whether the contents of the address refers to an unknown area in a memsubtree, and if so, it fills in that data in the memsubtree. The effect is that that change is "back-dated" to the point where the memsubtree was originally added to the main memtree, which is either at startup, or at the point of a semihosting operation that we expected to overwrite memory without logging it in the trace file.
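Purely as an illustration (the types and names below are invented, not libtarmac's), the key point is that the frozen memtree records for unknown ranges all point at one shared, mutable second-layer store, so a later read can be filled in without touching any frozen node:

```cpp
// Sketch of the memsubtree idea: frozen memtree versions share a mutable
// second-layer store for an "unknown" range, which later reads fill in.
#include <cstdint>
#include <map>
#include <memory>
#include <optional>

// Mutable second-layer store, shared by every memtree version whose
// "unknown" record points at it.
struct MemSubtree {
    std::map<uint64_t, uint8_t> known; // bytes discovered via reads
};

// What a frozen memtree record for an uninitialised region might hold.
struct UnknownRegion {
    uint64_t start, length;
    std::shared_ptr<MemSubtree> sub; // shared across all versions
};

// Indexer's handling of a memory read that observes `value` at `addr`
// inside an unknown region: record it in the subtree, which effectively
// back-dates it to when the region was added to the main memtree.
void observe_read(UnknownRegion &region, uint64_t addr, uint8_t value) {
    if (addr >= region.start && addr - region.start < region.length)
        region.sub->known[addr] = value;
}

// A query at any time falls through to the subtree for unknown bytes.
std::optional<uint8_t> lookup(const UnknownRegion &region, uint64_t addr) {
    auto it = region.sub->known.find(addr);
    if (it == region.sub->known.end())
        return std::nullopt; // still genuinely unknown
    return it->second;
}
```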
Hi Simon,
Thanks for the detailed description!! Now it's much clearer to me. Here is my understanding of node sharing; could you please help confirm whether it's correct?
Through experiments, I see that each node in the seqtree has an attribute "Root of memory tree".
Assuming there are three new register/memory events at T1, T2 and T3, I can see that three memtrees are generated individually and linked to the corresponding seqtree nodes. The memroot offset of each memtree looks incremental, so I think the storage of memtree info in the index file is like: ||-- T1 memtree --||-- T2 memtree with more info --||-- T3 memtree which is larger than T2 --||. I see some untouched addresses point to the same offset in all 3 memtrees; I think this is your "space saving", right? However, the space to store the memtree info itself is duplicated and getting larger, right? Then maybe near the end of the simulation, the memtree of the last seq node is the largest, right?
Therefore, and in addition, I basically have two questions:
BTW, can I get a large Tarmac file somewhere to try?
Thank you again!!
The whole memtree of the final seq node will likely be the largest, yes. But the new nodes added at that final step will still be small. Your example diagram looks more like
||-- T1 memtree --||-- nodes in the T2 memtree that differ from T1's --||-- nodes in T3 memtree that differ from T2's --||
and in each section, the number of nodes is logarithmic in the overall number of nodes in the tree.
Asymptotically, indexing performance should be O(n log n). So, slower than linear time, and getting gradually slower and slower on large files, but still not by very much.
The size of the index file is also O(n log n), because to a first approximation, the work done during indexing is all creating new nodes of various trees to append to the index file.
You're right that the index file is potentially quite large. O(n log n) space can get surprisingly big. I've never personally tried this system on a trace as large as a Linux kernel boot; my own uses of it have always been debugging standalone embedded programs much smaller than a kernel.
(So I don't have a really big trace file for you, I'm afraid. I can only suggest finding a large program of your choice, and running it on a model that can produce Tarmac output, such as an Arm Fast Model.)
I've occasionally thought that it would be nice to have a choice of index data structures, trading off space usage against query speed, by means of having multiple data-structure systems in the code and an identifier field in the index file saying which one was in use. Then for a really big trace you could fall back to a more compact indexing system, at the cost of queries taking longer (because maybe searching the simpler index takes O(n) time, or something). But that's a lot of work that I haven't found time to do!
Yet another possible indexing system for memory contents would be to build up a list of memory writes of fixed size, e.g. in the form (address, 4 bytes of data, timestamp). They would naturally be ordered by timestamp. But then, when the index file is finalized, you sort them into order by the key (address, timestamp). And then you can retrieve the contents of this address at that time by binary search (and the element just before the query key will be the most recent write to the address). That system is linear in space usage, and still permits log-time lookups, but has some downsides in the fine detail (e.g. the division of memory into fixed-size chunks), and also I'm not sure how it would support the retroactive effect of tracing a memory read.
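For what it's worth, here is a sketch of how that alternative scheme could look (nothing like this exists in the tool; the record layout is just an assumption for illustration):

```cpp
// Sketch of the "sorted list of fixed-size writes" scheme: keep every write
// as (address, timestamp, data), sort by (address, timestamp), and answer
// "contents of A at time T" with a binary search.
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <optional>
#include <utility>
#include <vector>

struct WriteRecord {
    uint64_t address;   // aligned start of a fixed-size chunk
    uint64_t timestamp; // trace time of the write
    uint32_t data;      // 4 bytes of data, as in the example above
};

// `sorted` must already be ordered by (address, timestamp).
std::optional<uint32_t> contents_at(const std::vector<WriteRecord> &sorted,
                                    uint64_t address, uint64_t time) {
    // Find the first record strictly after the query key (address, time)...
    auto it = std::upper_bound(
        sorted.begin(), sorted.end(),
        std::pair<uint64_t, uint64_t>{address, time},
        [](const std::pair<uint64_t, uint64_t> &q, const WriteRecord &r) {
            if (q.first != r.address)
                return q.first < r.address;
            return q.second < r.timestamp;
        });
    // ...then the record just before it, if it is for the same address,
    // is the most recent write at or before `time`.
    if (it == sorted.begin())
        return std::nullopt;
    const WriteRecord &prev = *std::prev(it);
    if (prev.address != address)
        return std::nullopt; // no write to this chunk at or before `time`
    return prev.data;
}
```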
I'm closing this inactive issue, since it was more of a query than a bug report, and I think it's all been answered.
Hi, sorry to bother you! I'm learning Tarmac-based software debugging and just found this project, which is quite useful. However, before fully understanding the implementation, I have a very basic question about the fundamental design of the index file and how it is used by the other tools. For example, when I tried tarmac-browser, it immediately shows register and memory info once I jump to a certain line of the original Tarmac file. It seems the register/memory info for a given Tarmac line is stored in the index file so that it can be presented in the browser immediately (as if the register/memory state at the moment of each line in the trace were stored). But in theory it seems almost impossible to store such a huge amount of info in the index file (and the size of the index file looks reasonable). I'm wondering: if the index file only contains intermediate info, do other tools like the browser still need some kind of processing to get the register/memory info for a certain line of the trace? If so, what if the original Tarmac trace is large (for example, GB level)? Apart from the index generation time, is there any potential performance risk when the browser presents the info? Thank you!