Closed by williballenthin 2 years ago
ntcore says:
ECMA-335 says:
notably, ECMA-335 indicates there are exactly five types of streams, and the set doesn't seem to be extensible. we should check what implementations do when they encounter 1) non-standard streams, and 2) streams with non-ASCII names.
I've constructed a .NET PE file that has a duplicate #US stream: duplicate-stream.exe
The first stream contains the string "AAAAAAAA" and the second stream contains "BBBBBBBB":
When executed by both mono and M$, the program prints "BBBBBBBB", indicating that the last stream with a given name takes priority:
dnSpy handles this correctly:
We should update the logic here (edit: done in #17):
also here:
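To illustrate the last-wins behavior observed above, here is a minimal sketch of resolving duplicate stream names. It is not dnfile's actual code: `index_streams` and the offsets and sizes in `headers` are hypothetical, made up for the example.

```python
from typing import Dict, Iterable, Tuple


def index_streams(
    headers: Iterable[Tuple[bytes, int, int]],
) -> Dict[bytes, Tuple[int, int]]:
    """Map stream name (bytes) to (offset, size).

    Headers are visited in file order, so a later header with the
    same name overwrites an earlier one (last stream wins).
    """
    streams: Dict[bytes, Tuple[int, int]] = {}
    for name, offset, size in headers:
        streams[name] = (offset, size)
    return streams


# Hypothetical headers mimicking duplicate-stream.exe: two #US streams.
headers = [
    (b"#~", 0x6C, 0x100),
    (b"#US", 0x200, 0x14),  # first #US, would hold "AAAAAAAA"
    (b"#US", 0x214, 0x14),  # second #US, would hold "BBBBBBBB"
]
streams = index_streams(headers)
```

With this approach `streams[b"#US"]` refers to the second header, matching what mono and M$ appear to do.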
I've constructed a .NET PE file that has a non-standard stream: unknown-stream.exe. It has an additional stream named #ZZ:
When executed by mono, the runtime prints a warning but executes correctly:
When executed by M$, no warning and it executes correctly:
(sidebar: this confirms I have mono and M$ set up separately and correctly. I wasn't actually sure that they didn't share the same backend runtime or something.)
I'll add a test to demonstrate that dnfile can load this stream, too (it can, since dndump.py works ok above). edit: added in #17.
I've constructed a .NET PE file that has a non-standard stream: invalid-stream-name.exe. It has an additional stream named #\x90\x90:
Under mono, there's the warning, but the name is not normalized/sanitized during printing:
Under M$, no warning and the behavior is fine:
This demonstrates that stream names can be non-ASCII and still be executed fine by the runtimes. Therefore, dnfile should continue to represent stream names as bytes rather than as strings.
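As a rough sketch of what keeping names as bytes looks like when parsing, the function below reads one stream header assuming the ECMA-335 layout as I understand it: a 4-byte offset, a 4-byte size, then a NUL-terminated name padded to a 4-byte boundary. `parse_stream_header` is a hypothetical helper, not dnfile's API, and the sample values are invented.

```python
import struct
from typing import Tuple


def parse_stream_header(buf: bytes, pos: int = 0) -> Tuple[int, int, bytes, int]:
    """Parse one stream header starting at `pos`.

    Returns (offset, size, name, next_pos). The name is returned as raw
    bytes and is never decoded, so non-ASCII names survive unchanged.
    Raises ValueError if no NUL terminator is found.
    """
    offset, size = struct.unpack_from("<II", buf, pos)
    name_start = pos + 8
    name_end = buf.index(b"\x00", name_start)
    name = buf[name_start:name_end]
    # The name plus its terminator is padded up to a multiple of 4 bytes.
    padded_len = (len(name) + 1 + 3) & ~3
    return offset, size, name, name_start + padded_len


# Hypothetical header for a stream named #\x90\x90.
raw = struct.pack("<II", 0x100, 0x20) + b"#\x90\x90\x00"
offset, size, name, next_pos = parse_stream_header(raw)
```

Here `name` comes back as `b"#\x90\x90"` with no decoding step that could fail or silently alter it. This sketch doesn't enforce a maximum name length.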
here's the mono handling of unknown stream names: https://github.com/mono/mono/blob/0339fe117122821856d94dcaa0b08ab966b7ecb2/mono/metadata/image.c#L551-L552
note the rather unsafe use of string routines on data that's not guaranteed to be a string.
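For contrast, a tool that stores names as bytes can still print them safely by escaping everything outside printable ASCII. This is only a sketch; `display_stream_name` is a hypothetical helper and not something mono or dnfile provides.

```python
def display_stream_name(name: bytes) -> str:
    """Render a raw stream name for logs or terminals.

    Printable ASCII is kept as-is; every other byte (and the backslash
    itself, to keep the output unambiguous) is shown as \\xNN.
    """
    return "".join(
        chr(b) if 0x20 <= b <= 0x7E and b != 0x5C else f"\\x{b:02x}"
        for b in name
    )


shown = display_stream_name(b"#\x90\x90")
```

The escaping happens only at display time, so the stored name stays byte-exact.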
As noted in https://github.com/malwarefrank/dnfile/pull/17#issuecomment-992040726: are stream names guaranteed to be ASCII? What does the spec say? What do implementations do?