mandiant / GoReSym

Go symbol recovery tool
MIT License

Refactor Moduledata and Type Version Parsing #55

Open stevemk14ebr opened 1 month ago

stevemk14ebr commented 1 month ago

The logic to parse the moduledata and types is split by version, because the underlying structures change every few Go versions. Originally this was done by creating a Go structure for each version and then unmarshalling the raw bytes into the appropriate structure by version. Over time this has become quite hard to maintain. Refactor this logic, and consider whether a generic function can be written to avoid the duplicated switch statements.

https://github.com/mandiant/GoReSym/blob/cc91ae744fe2fbab7ef9c0df2f59dac1f8b82643/objfile/objfile.go#L297
https://github.com/mandiant/GoReSym/blob/cc91ae744fe2fbab7ef9c0df2f59dac1f8b82643/objfile/objfile.go#L1176

brigadier-general commented 1 month ago

Potential lead -- "The Go runtime essentially just creates pointers with Go structure types to read the data when it is needed, removing the need for any decoding/unmarshalling." via discussion on pclntab (https://github.com/elastic/otel-profiling-agent/blob/main/docs/gopclntab.md) Maybe a method similar to how loaded PE files get their FirstThunk and OriginalFirstThunks filled out?

stevemk14ebr commented 1 month ago

I think that's probably a much easier approach to maintain. If you move forward with this, consider defining global const offsets into the structures, each named after the field it accesses, so it's clear what is being accessed through any pointer math.