Add non-module support to offline traces

derekbruening commented 7 years ago

Split from #1729

What about DGC from a JIT, or vsyscall?

We need to store the type (read/write/prefetch*) and size for a memref, and the size of every single ifetch.

For memrefs, we may have to use double entries, with the first one having an escape type and the address and the second having the full type and size.
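A minimal sketch of the double-entry idea, for illustration only. The 64-bit entry layout, the kind tags, and all names here (`sketch_memref_t`, `KIND_MEMREF_ESCAPE`, `encode_memref`, and so on) are hypothetical and are not the tracer's actual definitions; the only thing taken from the discussion is that the first entry carries an escape type plus the address and the second carries the full type and size.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical layout: a 64-bit entry with a 3-bit kind tag in the top bits
// and a 61-bit payload. These widths and names are assumptions.
enum entry_kind_t : uint64_t { KIND_MEMREF_ESCAPE = 1, KIND_MEMREF_INFO = 2 };
enum memref_type_t : uint64_t { MEMREF_READ = 0, MEMREF_WRITE = 1, MEMREF_PREFETCH = 2 };

constexpr int PAYLOAD_BITS = 61;
constexpr uint64_t PAYLOAD_MASK = (uint64_t{ 1 } << PAYLOAD_BITS) - 1;

struct sketch_memref_t {
    uint64_t addr;
    memref_type_t type;
    uint16_t size;
};

inline uint64_t
make_entry(entry_kind_t kind, uint64_t payload)
{
    // The payload must fit below the kind tag.
    assert((payload & ~PAYLOAD_MASK) == 0);
    return (static_cast<uint64_t>(kind) << PAYLOAD_BITS) | payload;
}

// Emits two entries: escape kind + address, then type + size.
inline std::pair<uint64_t, uint64_t>
encode_memref(const sketch_memref_t &m)
{
    uint64_t info = (static_cast<uint64_t>(m.type) << 16) | m.size;
    return { make_entry(KIND_MEMREF_ESCAPE, m.addr), make_entry(KIND_MEMREF_INFO, info) };
}

// Returns false if the two entries are not an escape/info pair in that order.
inline bool
decode_memref(uint64_t first, uint64_t second, sketch_memref_t *out)
{
    if ((first >> PAYLOAD_BITS) != KIND_MEMREF_ESCAPE ||
        (second >> PAYLOAD_BITS) != KIND_MEMREF_INFO)
        return false;
    out->addr = first & PAYLOAD_MASK;
    uint64_t info = second & PAYLOAD_MASK;
    out->type = static_cast<memref_type_t>(info >> 16);
    out->size = static_cast<uint16_t>(info & 0xffff);
    return true;
}
```

The cost of this scheme is a doubling of the per-memref trace size for non-module code, which may be acceptable if such code is a small fraction of execution.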

For ifetch, perhaps we can fit multiple sizes in one entry. Modidx will be a sentinel, so we have the 45 modoffs bits. We will want 4 bits for each instr size (ok to assume 16-byte max instr, even though technically there can be 17 and an illegal instr could be even bigger) so we can fit the sizes of 11 instrs there. We'll zero the rest.
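The packing arithmetic can be sketched as follows. The 45-bit field width and 4 bits per size come from the paragraph above; the function names are hypothetical. This sketch also assumes that a zero nibble means "no instruction" (one reading of "We'll zero the rest"), so it stores lengths 1 through 15 directly; representing a 16-byte length would need a different encoding, such as storing length minus one together with an explicit count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int MODOFFS_BITS = 45; // width of the modoffs field per the discussion
constexpr int BITS_PER_SIZE = 4;
constexpr std::size_t MAX_SIZES_PER_ENTRY = MODOFFS_BITS / BITS_PER_SIZE; // 11

// Packs up to 11 instruction lengths, 4 bits each, lowest nibble first.
// Unused nibbles stay zero and act as a terminator (assumption of this sketch).
inline uint64_t
pack_instr_sizes(const std::vector<uint8_t> &sizes)
{
    assert(sizes.size() <= MAX_SIZES_PER_ENTRY);
    uint64_t packed = 0;
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        assert(sizes[i] >= 1 && sizes[i] <= 15);
        packed |= static_cast<uint64_t>(sizes[i]) << (i * BITS_PER_SIZE);
    }
    return packed;
}

// Recovers the lengths, stopping at the first zero nibble.
inline std::vector<uint8_t>
unpack_instr_sizes(uint64_t packed)
{
    std::vector<uint8_t> sizes;
    for (std::size_t i = 0; i < MAX_SIZES_PER_ENTRY; ++i) {
        uint8_t sz = static_cast<uint8_t>((packed >> (i * BITS_PER_SIZE)) & 0xf);
        if (sz == 0)
            break;
        sizes.push_back(sz);
    }
    return sizes;
}
```

Eleven nibbles use 44 of the 45 bits, leaving one bit spare.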

The vdso shows up as a module but is harder to decode. We have two choices for vdso: 1) Treat as non-module and record type info in the trace. The tracer will need to look for "[vdso]" which will require changing drmodtrack. 2) Assume raw2trace is run on a machine w/ the same vdso code and try to decode from its own vdso via special code in raw2trace.

derekbruening commented 7 years ago

3) Dump full contents of vdso somewhere and store that path in the module list

derekbruening commented 6 years ago

fd63caa7286f2f0dd239cbd8de07a264d5a4e82c from PR #2706 added vdso content dumping and post-processing support.

General DGC is still unsupported for offline.

prasun3 commented 2 years ago

Can you give me some pointers as to what would be needed to get this working?

Also, should we indicate (via warnings, etc.) that jitted code is not supported in offline traces? Right now the only indication I can see is this message in debug builds:

SYSLOG_INTERNAL_WARNING_ONCE("writing to executable region.");

derekbruening commented 2 years ago

That "writing to executable region" message is from the core, not the tracer. It does seem surprising that the tracer doesn't print any warning; all the post-processor has is a verbosity-3 log message:

$ git grep -A 5 '#2062'
clients/drcachesim/tracer/instru_offline.cpp:        // FIXME i#2062: add non-module support.  The plan for instrs is to have
clients/drcachesim/tracer/instru_offline.cpp-        // one entry w/ the start abs pc, and subsequent entries that pack the instr
clients/drcachesim/tracer/instru_offline.cpp-        // length for 10 instrs, 4 bits each, into a pc.modoffs field.  We will
clients/drcachesim/tracer/instru_offline.cpp-        // also need to store the type (read/write/prefetch*) and size for the
clients/drcachesim/tracer/instru_offline.cpp-        // memrefs.
clients/drcachesim/tracer/instru_offline.cpp-        modidx = 0;
--
clients/drcachesim/tracer/raw2trace.h:            // FIXME i#2062: add support for code not in a module (vsyscall, JIT, etc.).
clients/drcachesim/tracer/raw2trace.h-            // Once that support is in we can remove the bool return value and handle
clients/drcachesim/tracer/raw2trace.h-            // the memrefs up here.
clients/drcachesim/tracer/raw2trace.h-            impl()->log(
clients/drcachesim/tracer/raw2trace.h-                3, "Skipping ifetch for %u instrs not in a module (idx %d, +" PIFX ")\n",
clients/drcachesim/tracer/raw2trace.h-                instr_count, in_entry->pc.modidx, in_entry->pc.modoffs);

That instru_offline comment describes one approach. The memrefs would end up looking like the online ones with full info.

prasun3 commented 2 years ago

Looks like instru_offline.cpp does not have any logging at all. I have changed the verbosity in the post-processor log from 3 to 1.

The change suggested in the instru_offline comment would only provide the instruction length, and only for up to 10 instructions -- is that correct?

If we wanted to get the opcodes, would we have to do something similar to what is done for vdso i.e. save the code to disk and use that for post-processing?

derekbruening commented 2 years ago

> The change suggested in the instru_offline comment would only provide the instruction length, and only for up to 10 instructions -- is that correct?

I think it means there would be a separate entry for each set of 10 instructions. (If we wanted to only allow 10 that is also doable by splitting the blocks.)
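A rough sketch of the "separate entry for each set of 10 instructions" reading: one start PC followed by as many packed words as the block needs. The record type and function names are hypothetical. It assumes the instructions in a block are contiguous, so each PC is the previous PC plus the previous length, and that lengths fit in 1 to 15 with a zero nibble ending the list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

constexpr std::size_t SIZES_PER_ENTRY = 10; // per the instru_offline.cpp comment
constexpr int BITS_PER_SIZE = 4;

// Hypothetical record: the absolute start PC plus one packed word per
// group of up to 10 instruction lengths.
struct block_record_t {
    uint64_t start_pc;
    std::vector<uint64_t> packed_sizes;
};

inline block_record_t
encode_block(uint64_t start_pc, const std::vector<uint8_t> &lengths)
{
    block_record_t rec{ start_pc, {} };
    for (std::size_t i = 0; i < lengths.size(); ++i) {
        assert(lengths[i] >= 1 && lengths[i] <= 15);
        std::size_t slot = i % SIZES_PER_ENTRY;
        if (slot == 0)
            rec.packed_sizes.push_back(0); // start a new packed entry
        rec.packed_sizes.back() |= static_cast<uint64_t>(lengths[i])
            << (slot * BITS_PER_SIZE);
    }
    return rec;
}

// Rebuilds the (pc, length) pair of every instruction fetch in the block.
inline std::vector<std::pair<uint64_t, uint8_t>>
decode_block(const block_record_t &rec)
{
    std::vector<std::pair<uint64_t, uint8_t>> fetches;
    uint64_t pc = rec.start_pc;
    for (uint64_t word : rec.packed_sizes) {
        for (std::size_t i = 0; i < SIZES_PER_ENTRY; ++i) {
            uint8_t len = static_cast<uint8_t>((word >> (i * BITS_PER_SIZE)) & 0xf);
            if (len == 0)
                break;
            fetches.emplace_back(pc, len);
            pc += len;
        }
    }
    return fetches;
}
```

A 23-instruction block would produce three packed words under this scheme, which is enough for a cache simulator to replay each fetch address and size but carries no opcode information.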

> If we wanted to get the opcodes, would we have to do something similar to what is done for vdso i.e. save the code to disk and use that for post-processing?

Right, the proposal in the comment is only for enough info for cache simulation. For core simulation you would have to either include opcode info with each instruction (some simulators might want the full encoding) or, as you said, save the code separately. Saving the code gets tricky, though, if it's actively modified and different instructions occupy the same addresses at different times. A hybrid approach that saves the code separately just once for written-once code, combined with per-entry encoding info for multi-write code, might be worth the effort. Combining with hints from the JIT (as in https://dynamorio.org/page_jitopt.html) would be best where possible.

prasun3 commented 2 years ago

> saving the code gets tricky if it's actively modified and different instructions occupy the same addresses at different times

Do you have a sense of how common this is?

Regarding saving code, for the vdso case we already have a module, but for jitted code I think we will need to add a new module entry. What do you think?

I didn't see a drmodtrack API that we could use. Would we need to add this support?

derekbruening commented 2 years ago

> saving the code gets tricky if it's actively modified and different instructions occupy the same addresses at different times

> Do you have a sense of how common this is?

This can happen in multiple scenarios, such as:

- An application uses gcc nested functions which place short trampoline code sequences on the stack. As the stack pointer moves around, a new trampoline can easily overlap where an old one used to be.
- A JIT reclaims some memory and re-uses it for different code.
- Library code is modified. This happens on Windows all the time as various functions are hooked; we're kind of ignoring that today for memtraces.

An application unloading one library and loading a new one at an overlapping address is handled by drmodtrack via a new module entry (but core DR has to treat it like modified code).

As for how common it is: I don't know how common it is for, say, today's most popular Java VMs, as I have not looked at Java in a long time.

> Regarding saving code, for the vdso case we already have a module, but for jitted code I think we will need to add a new module entry. What do you think?

> I didn't see a drmodtrack API that we could use. Would we need to add this support?

If we had the annotations pointed at above and knew that a memory region was a JIT region and was only appended to and never modified in the middle, this makes sense: add an entry, save the contents at the end (using the assumption of no changes).
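The append-only assumption could be checked rather than trusted. The following is a minimal sketch of such a check, not an existing drmodtrack or DR facility: `append_only_region_t` and its methods are hypothetical names, and the caller is assumed to hand in the region's current bytes at points of its own choosing (for example, when an annotation reports new code).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper that tracks one JIT region and verifies that its
// contents only ever grow at the end.
class append_only_region_t {
public:
    explicit append_only_region_t(uint64_t base)
        : base_(base)
    {
    }

    // Records the current contents of [base, base + size). Returns false if
    // previously seen bytes changed or the region shrank; once that happens
    // the saved snapshot can no longer be used as a single code image.
    bool
    observe(const uint8_t *bytes, std::size_t size)
    {
        if (!valid_)
            return false;
        if (size < snapshot_.size() ||
            !std::equal(snapshot_.begin(), snapshot_.end(), bytes)) {
            valid_ = false;
            return false;
        }
        // Keep only the newly appended tail.
        snapshot_.insert(snapshot_.end(), bytes + snapshot_.size(), bytes + size);
        return true;
    }

    bool append_only() const { return valid_; }
    uint64_t base() const { return base_; }
    // The bytes that would be saved at the end of the run.
    const std::vector<uint8_t> &contents() const { return snapshot_; }

private:
    uint64_t base_;
    bool valid_ = true;
    std::vector<uint8_t> snapshot_;
};
```

If `append_only()` is still true at exit, `contents()` could be dumped once and referenced from a new module-list entry; otherwise the region would need per-entry encoding info or some other fallback.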

prasun3 commented 2 years ago

Is there a way to detect modified code using either DR or other tools?

We are trying the approach of saving code to a file. This code gets saved in a raw binary format but during post-processing I think the module loader expects an executable format like ELF, so dr_map_executable_file fails. Any suggestions on how to handle this?

derekbruening commented 2 years ago

> Is there a way to detect modified code using either DR or other tools?

Xref #409 on DR providing an event when it flushes code due to modifications. DR's two-pronged scheme of page protection and sandboxing is described in "Maintaining Consistency and Bounding Capacity of Software Code Caches" linked at https://dynamorio.org/page_publications.html.

> We are trying the approach of saving code to a file. This code gets saved in a raw binary format but during post-processing I think the module loader expects an executable format like ELF, so dr_map_executable_file fails. Any suggestions on how to handle this?

If the data were embedded like for VDSO it would be plain-mmapped with the rest of the file and would not need any new code to handle. If it's separate it could be written with executable file headers; or custom_module_data_t could be expanded; or maybe drmodtrack itself should get in the game and provide something, which would help other uses like drcov (although for drcov's use case, mapping it back to sources is murky). Not sure what the best thing is. There are various other issues as well, such as how often storing the code works out (i.e., how often generated code address ranges are unchanged after first execution across the whole run for typical JITs).
