malwarefrank / dnfile

Parse .NET executable files.
MIT License

research: are stream names guaranteed to be ASCII? #19

Closed: williballenthin closed this 2 years ago

williballenthin commented 2 years ago

As noted in https://github.com/malwarefrank/dnfile/pull/17#issuecomment-992040726, are stream names guaranteed to be ASCII? What does the spec say? What do implementations do?

williballenthin commented 2 years ago

ntcore says:

[screenshot] https://ntcore.com/files/dotnetformat.htm

williballenthin commented 2 years ago

ECMA-335 says: [screenshot]

Notably, it indicates there are exactly five types of streams, and the format doesn't seem intended to be extensible. We should check what implementations do when they encounter 1) non-standard streams, and 2) streams with non-ASCII names.
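To make the question concrete, here is a minimal sketch of walking the stream headers and flagging names outside the five that ECMA-335 describes. It is not dnfile's actual code: `parse_stream_headers`, `make_header`, and `STANDARD_STREAM_NAMES` are hypothetical names, and the layout assumed is a 4-byte offset, a 4-byte size, then a NUL-terminated name padded to a 4-byte boundary.

```python
import struct

# Stream names described by ECMA-335; anything else is treated as non-standard here.
STANDARD_STREAM_NAMES = frozenset({b"#~", b"#Strings", b"#US", b"#GUID", b"#Blob"})


def parse_stream_headers(data: bytes, offset: int, count: int) -> list:
    """Return (stream_offset, stream_size, name) tuples; name stays as bytes."""
    headers = []
    for _ in range(count):
        stream_offset, stream_size = struct.unpack_from("<II", data, offset)
        offset += 8
        end = data.index(b"\x00", offset)  # raises ValueError if no terminator
        name = data[offset:end]
        # The name plus its NUL terminator is padded to a 4-byte boundary.
        offset += (end - offset + 1 + 3) & ~3
        headers.append((stream_offset, stream_size, name))
    return headers


def make_header(stream_offset: int, stream_size: int, name: bytes) -> bytes:
    """Build one stream header for the demo below (hypothetical helper)."""
    raw = name + b"\x00"
    raw += b"\x00" * (-len(raw) % 4)
    return struct.pack("<II", stream_offset, stream_size) + raw


# Demo input: two standard names, one unknown name, one non-ASCII name.
blob = b"".join(
    make_header(0x6C + 0x10 * i, 0x10, name)
    for i, name in enumerate([b"#~", b"#Strings", b"#ZZ", b"#\x90\x90"])
)
for _, _, name in parse_stream_headers(blob, 0, 4):
    kind = "standard" if name in STANDARD_STREAM_NAMES else "non-standard"
    print(name, kind)
```

Keeping the name as `bytes` means the parser never has to decide on an encoding, so unknown or non-ASCII names pass through unchanged.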

williballenthin commented 2 years ago

I've constructed a .NET PE file that has a duplicate #US stream: duplicate-stream.exe

The first stream contains the string "AAAAAAAA" and the second stream contains "BBBBBBBB":

[screenshot]

When executed by both mono and Microsoft's runtime, the program prints "BBBBBBBB", indicating that the last stream with a given name takes priority:

[two screenshots]

dnSpy handles this correctly: [screenshot]

We should update the logic here (edit: done in #17):

https://github.com/malwarefrank/dnfile/blob/5c5c5e00e696ed6f714ea9fa50bb440f540f7342/src/dnfile/__init__.py#L346-L354

also here:

https://github.com/malwarefrank/dnfile/blob/5c5c5e00e696ed6f714ea9fa50bb440f540f7342/src/dnfile/__init__.py#L465-L474
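The last-one-wins behavior observed with the runtimes can be sketched as follows. This is only an illustration of the lookup rule, not the code at the links above; `index_streams_by_name` is a hypothetical helper, and the stream payloads are simplified to the raw strings from the test file.

```python
def index_streams_by_name(streams):
    """Map stream name (bytes) to payload, letting later duplicates win.

    `streams` is an iterable of (name, payload) pairs in file order.
    """
    by_name = {}
    for name, payload in streams:
        # Plain assignment overwrites any earlier stream with the same name,
        # matching what mono and Microsoft's runtime were observed to do.
        by_name[name] = payload
    return by_name


# Two #US streams, in the order they appear in duplicate-stream.exe.
streams = [
    (b"#US", b"AAAAAAAA"),
    (b"#US", b"BBBBBBBB"),
]
print(index_streams_by_name(streams)[b"#US"])
```

A parser that stops at the first match, or that only sets an attribute when it is still unset, would report "AAAAAAAA" and disagree with the runtimes.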

williballenthin commented 2 years ago

I've constructed a .NET PE file that has a non-standard stream: unknown-stream.exe. It has an additional stream named #ZZ:

[screenshot]

When executed by mono, the runtime prints a warning but executes correctly:

[screenshot]

When executed by Microsoft's runtime, there is no warning and it executes correctly:

[screenshot]

(sidebar: this confirms I have mono and Microsoft's runtime set up separately and correctly. I wasn't actually sure that they didn't use the same backend runtime or something.)

I'll add a test to demonstrate that dnfile can load this stream, too (it can, since dndump.py works ok above). edit: added in #17.

williballenthin commented 2 years ago

I've constructed a .NET PE file that has a non-standard stream with a non-ASCII name: invalid-stream-name.exe. It has an additional stream named #\x90\x90:

[screenshot]

Under mono, there's the warning, but the name is not normalized/sanitized during printing:

[screenshot]

Under Microsoft's runtime, there is no warning and the behavior is fine:

[screenshot]

This demonstrates that stream names can be non-ASCII and still be executed fine by the runtimes. Therefore, dnfile should continue to represent stream names as bytes rather than as strings.
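If names stay as bytes, they still need to be shown to users somehow. Here is a minimal sketch of one way to do that, escaping anything outside printable ASCII so a name like #\x90\x90 can be printed without being decoded or normalized. `display_stream_name` is a hypothetical helper, not part of dnfile.

```python
def display_stream_name(name: bytes) -> str:
    """Render a raw stream name for logs or UI without assuming an encoding.

    Printable ASCII is kept as-is; every other byte becomes a \\xNN escape.
    """
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else f"\\x{b:02x}"
        for b in name
    )


# Comparisons stay on bytes; only the display path produces a str.
name = b"#\x90\x90"
print(name == b"#US")             # False
print(display_stream_name(name))  # #\x90\x90
```

Escaping control bytes as well as high bytes avoids the unsanitized output seen in the mono warning.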

williballenthin commented 2 years ago

here's the mono handling of unknown stream names: https://github.com/mono/mono/blob/0339fe117122821856d94dcaa0b08ab966b7ecb2/mono/metadata/image.c#L551-L552

note the rather unsafe use of string routines on data that's not guaranteed to be a string.

and Microsoft's: https://github.com/dotnet/runtime/blob/110282c71b3f7e1f91ea339953f4a0eba362a62c/src/libraries/System.Reflection.Metadata/src/System/Reflection/Metadata/MetadataReader.cs#L239-L265
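On the concern about running string routines over data that may not be a well-formed string, a parser can bound the search for the terminator instead of trusting it to exist. This is a rough sketch only: `read_stream_name` is a hypothetical helper, and the 32-byte cap is an assumption based on the name length limit stated in ECMA-335.

```python
MAX_NAME_LEN = 32  # assumed cap, following the limit ECMA-335 gives for stream names


def read_stream_name(buf: bytes, start: int, limit: int = MAX_NAME_LEN) -> bytes:
    """Read a NUL-terminated name from buf without scanning past `limit` bytes.

    If no terminator is found inside the window, the whole window is returned
    rather than reading further into unrelated data.
    """
    window = buf[start:start + limit]
    end = window.find(b"\x00")
    return window if end == -1 else window[:end]


print(read_stream_name(b"#US\x00junk", 0))  # b'#US'
print(len(read_stream_name(b"A" * 100, 0)))  # 32
```

Slicing in Python never reads out of bounds, so the cap matters mainly for keeping a malformed header from swallowing the rest of the metadata as a "name".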