Velocidex / go-ese

Go implementation of an Extensible Storage Engine parser
Apache License 2.0

Reading NTDS.DIT exhibits various problems #21

Open lkarlslund opened 7 months ago

lkarlslund commented 7 months ago

Reading a recent NTDS.DIT dump has surfaced several problems in the otherwise brilliant library you've created. Not sure how best to report this, but I'm attaching a lab dump of GOAD from Orange Cyberdefense, which doesn't contain any secrets, along with my observations from it.

The dump was made using NTDSUTIL / activate instance ntds / ifm / create full c:\temp - so there shouldn't be any DB corruption or similar problems with it.

When dumping sd_table, there are multiple rows where the actual "sd_value" is incorrect: it is returned as 4 bytes rather than the entire data. Here is an example using ESEDatabaseView to show records 72 and 78. Using go-ese, record 72 is returned correctly, but the sd_value of record 78 is returned as "24000000".

I also suspect that some records are returned with a corrupted sd_value, as I cannot parse them as security descriptors, but I haven't had time to dive deeper into this.

[Image: ESEDatabaseView showing records 72 and 78]

When dumping datatable, all ATTn fields are marked as multivalue (8), but almost everything go-ese returns for them is a single value, not a slice. I noticed this because the ATTc0 attribute should return multiple integer values in most cases.

ntds.zip

scudette commented 7 months ago

Thanks for reporting this. The reason it is not working well is that the sd_value column is of type LONG BINARY, which this parser does not support yet. Long values are quite complex to implement and not very well documented. Some libraries, such as libesedb, do support them, so it is possible to implement, but it is quite a large task.

Libesedb appears to work with this sample, but the code for that library is quite difficult to follow and poorly documented, so it might take a while to understand what it is actually doing to extract the long values.
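
For readers following along, the 4-byte "24000000" reported above is consistent with the record holding a reference to a separately stored long value rather than the data itself. The sketch below assumes those 4 bytes are a little-endian long value ID; that interpretation and the helper name decodeLongValueID are illustrative assumptions, not part of the go-ese API.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
)

// decodeLongValueID interprets a hex-encoded 4-byte column value as a
// little-endian long value ID. Treating the bytes this way is an
// assumption made for illustration; it is not taken from go-ese.
func decodeLongValueID(hexValue string) (uint32, error) {
	raw, err := hex.DecodeString(hexValue)
	if err != nil {
		return 0, err
	}
	if len(raw) != 4 {
		return 0, errors.New("expected exactly 4 bytes")
	}
	return binary.LittleEndian.Uint32(raw), nil
}

func main() {
	// "24000000" is the sd_value reported for record 78 in this issue.
	id, err := decodeLongValueID("24000000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("long value ID:", id)
}
```

Under that assumption the value for record 78 decodes to ID 36, which a parser would then have to look up elsewhere in the database to obtain the full security descriptor.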

scudette commented 7 months ago

Looking further into it, I discovered that Microsoft has recently published the source code for ESE here: https://github.com/microsoft/Extensible-Storage-Engine/ which makes it a lot easier to understand, as we don't need to reverse engineer the format any more.

While libesedb has come a long way in reversing the format, looking at the source code shows there are a number of things it missed. We should probably rebase this project on the Microsoft source code now that it is available (i.e. name the variables the same as in the MS source).

I will spend some time reading the source code to see if we can figure it out.

scudette commented 7 months ago

Should be fixed by #22 at head. Please test if you have the time.

lkarlslund commented 7 months ago

Brilliant, reading long values works as intended here now.

Do you want to keep this open about the multivalue fields only reading the first value, or should I make a new issue just for that?

Thanks for the fix so far - impressed by your speed :-)

scudette commented 7 months ago

So there are a couple of things to do still -

  1. Multi values are not implemented (we only get the first value). I don't think libesedb or NirSoft emit multiple values for any of the ATTn fields. Can you point to a record which has multiple values so we can use it as a sample?

  2. With long values it is possible to store very large objects (larger than the 32k page size) by dividing them into segments, but we only get the first segment. I think I know how to get all the segments, but I don't have a sample with values like that.

So it would be nice to implement those additional features, but I don't have a sample that I can test them with. I guess we could create a sample db with the API, but that seems pointless if we don't have solid forensic use cases for these features. Maybe the investment is not worth it?
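
As a rough illustration of the second point, reassembling a segmented long value could look like the sketch below. It assumes each segment is identified by its byte offset within the value and that segments are contiguous; the segment type and joinSegments function are hypothetical names and do not come from go-ese or the Microsoft source.

```go
package main

import (
	"fmt"
	"sort"
)

// segment is a hypothetical representation of one chunk of a long value.
// Offset is assumed to be the chunk's byte position in the full value.
type segment struct {
	Offset uint32
	Data   []byte
}

// joinSegments orders segments by offset and concatenates them,
// returning an error if the offsets leave a gap or overlap.
func joinSegments(segs []segment) ([]byte, error) {
	// Sort a copy so the caller's slice is left untouched.
	sorted := make([]segment, len(segs))
	copy(sorted, segs)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Offset < sorted[j].Offset
	})

	var out []byte
	for _, s := range sorted {
		if int(s.Offset) != len(out) {
			return nil, fmt.Errorf("segment at offset %d does not follow %d bytes read so far",
				s.Offset, len(out))
		}
		out = append(out, s.Data...)
	}
	return out, nil
}

func main() {
	// Segments deliberately supplied out of order.
	value, err := joinSegments([]segment{
		{Offset: 4, Data: []byte("efgh")},
		{Offset: 0, Data: []byte("abcd")},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s\n", value)
}
```

Reading only the first segment corresponds to stopping after the chunk at offset 0, which would go unnoticed for any value that fits in a single segment.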

lkarlslund commented 7 months ago

Most of the records in the NTDS.DIT I sent you should have multiple values in the ATTc0 field of the datatable table.
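
One way to locate candidate records is to flag every column that is marked multivalue but comes back as a scalar. The sketch below assumes the "(8)" mentioned earlier in the thread is a bit in the column flags; the constant flagMultiValued and both function names are hypothetical and are not part of go-ese.

```go
package main

import (
	"fmt"
	"reflect"
)

// flagMultiValued is assumed from the "(8)" reported in this issue;
// treating it as a bit mask is an assumption for illustration.
const flagMultiValued = 0x8

// isMultiValued reports whether the assumed multivalue bit is set.
func isMultiValued(flags uint32) bool {
	return flags&flagMultiValued != 0
}

// looksTruncated reports whether a column flagged as multivalue was
// returned as a single scalar instead of a slice or array. Note that a
// binary value returned as []byte counts as a slice here, so this check
// is only meaningful for non-binary column types.
func looksTruncated(flags uint32, value interface{}) bool {
	if !isMultiValued(flags) || value == nil {
		return false
	}
	kind := reflect.TypeOf(value).Kind()
	return kind != reflect.Slice && kind != reflect.Array
}

func main() {
	fmt.Println(looksTruncated(8, int32(5)))          // scalar from a multivalue column
	fmt.Println(looksTruncated(8, []int32{5, 6}))     // slice from a multivalue column
	fmt.Println(looksTruncated(0, int32(5)))          // scalar from a single-value column
}
```

Running a check like this over the ATTc0 column of datatable would only show where a scalar was returned; it cannot tell whether the underlying record actually holds more than one value.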