fox-it / dissect.target

The Dissect module tying all other Dissect modules together. It provides a programming API and command line tools which allow easy access to various data sources inside disk images or file collections (a.k.a. targets).
GNU Affero General Public License v3.0

target-dump is not able to dump all filesystem entries in ESXi disk images #427

Open mnrkbys opened 10 months ago

mnrkbys commented 10 months ago

I have tried to dump all filesystem entries in ESXi images with the following command.

target-dump -f walkfs -o X:\out X:\test_esxi.E01

However, filesystem_entry.jsonl has 606 lines. This number of entries is clearly too low. Also, there are only 13 files with "fstype": "vmfs".

Is there any other way to dump all file entries?
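To check counts like the ones above, a small script can tally the records in filesystem_entry.jsonl by their "fstype" field. This is a minimal sketch that assumes each line is one JSON object; the function name count_by_fstype is hypothetical, and lines without an "fstype" key (for example any non-entry lines the dump may contain) are grouped under "<unknown>".

```python
import json
from collections import Counter


def count_by_fstype(lines):
    """Count JSON-lines records per value of their "fstype" field."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: records without an "fstype" key are grouped together.
        counts[record.get("fstype", "<unknown>")] += 1
    return counts


# Tiny in-memory sample standing in for filesystem_entry.jsonl.
sample = [
    '{"path": "/bootbank/boot.cfg", "fstype": "fat"}',
    '{"path": "/vmfs/volumes/ds1/a.vmx", "fstype": "vmfs"}',
    '{"path": "/vmfs/volumes/ds1/a.vmdk", "fstype": "vmfs"}',
    "",
]
sample_counts = count_by_fstype(sample)
print(sum(sample_counts.values()), dict(sample_counts))
```

For a real dump, pass an open file handle instead of the sample list, e.g. count_by_fstype(open(r"X:\out\filesystem_entry.jsonl", encoding="utf-8")), adjusting the path to wherever target-dump wrote the file.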

mnrkbys commented 10 months ago

The walk subcommand of target-fs can output all filesystem entries, but it does not output MAC times. I would like to check the MAC times of all files.

target-fs X:\test_esxi.E01 walk /

pyrco commented 10 months ago

Hi @mnrkbys,

Thanks for reporting this issue.

It is interesting that the walkfs plugin reports fewer files than something like target-fs walk /. I would expect the opposite, because the walkfs plugin goes over all detected filesystems, while target-fs walk / just walks the abstracted root filesystem (which may not have all detected filesystems mounted).

Would it be possible for you to share the (redacted) output of the following command?

target-info X:\test_esxi.E01

Of most interest are the reported Disks and Volumes.

One thing you might try is:

target-query -f walkfs X:\test_esxi.E01 | rdump -m jsonlines > tq-walkfs.jsonl

Note that you should execute this in the classic cmd window, as pipes etc. don't work as expected in PowerShell. I don't expect it to give different results, though, as it should do the same as the target-dump -f walkfs command.

BTW we are in the process of changing this walkfs behaviour, so the walkfs plugin and things like target-fs walk / will have a similar view of the target including all detected filesystems.

Schamper commented 10 months ago

I probably know why this is happening. As part of loading an ESXi system, the "filesystem" is rebuilt and mapped into a "root" filesystem layer. But this layer is not added back into target.filesystems, which is what walkfs uses.

https://github.com/fox-it/dissect.target/blob/main/dissect/target/plugins/os/unix/linux/esxi/_os.py#L98

@pyrco I expect this to be resolved with the upcoming walkfs changes, but perhaps we could look into adding this filesystem to target.filesystems in the meantime?
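The mismatch described above can be pictured with a toy model. The sketch below is purely illustrative: ToyFilesystem, ToyTarget, toy_walkfs and the file paths are hypothetical stand-ins, not dissect.target classes or real ESXi contents. It only shows that a walk over a list of registered filesystems misses a rebuilt root layer until that layer is registered as well.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyFilesystem:
    """Hypothetical stand-in for a filesystem: a name plus the paths it holds."""
    name: str
    entries: List[str] = field(default_factory=list)


@dataclass
class ToyTarget:
    """Hypothetical stand-in for a target: detected filesystems and a root layer."""
    filesystems: List[ToyFilesystem] = field(default_factory=list)
    fs: Optional[ToyFilesystem] = None


def toy_walkfs(target):
    """Walk only what is registered in target.filesystems."""
    return [(fs.name, entry) for fs in target.filesystems for entry in fs.entries]


# Two detected filesystems and a rebuilt root layer that is NOT registered.
fat = ToyFilesystem("fat", ["/boot.cfg"])
vmfs = ToyFilesystem("vmfs", ["/datastore1/vm.vmx"])
root = ToyFilesystem("root", ["/etc/hosts", "/bin/sh", "/var/log/syslog.log"])
target = ToyTarget(filesystems=[fat, vmfs], fs=root)

before = toy_walkfs(target)      # the root layer's entries are missing
target.filesystems.append(root)  # register the root layer
after = toy_walkfs(target)       # now the root layer is walked too

print(len(before), len(after))
```

In this model, registering the root layer plays the role of the target.filesystems change suggested above.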

Schamper commented 10 months ago

@mnrkbys as a temporary workaround, you could use the following Python script (modified to the output of your liking):

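import sys
from dissect.target import Target

# Open the target given as the first command line argument
t = Target.open(sys.argv[1])
# Recursively walk the root filesystem and print each path with its stat result
for path in t.fs.path("/").rglob("**"):
    stat = path.stat()
    print(path, stat)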

Use as python script.py X:\test_esxi.E01. The path variable is a regular pathlib.Path object, and the stat variable is a regular stat_result, so you can format the output to your liking.

You could also add target.filesystems.add(target.fs) here to "fix" walkfs.
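Because the path objects are described above as regular pathlib.Path objects, the output formatting can be sketched against any Path. The example below is a minimal sketch that writes one JSON line per entry with UTC MAC times. The helper names iso_utc, entry_record and dump_entries are hypothetical, and it assumes the path objects support lstat() and rglob() the way pathlib.Path does. It is written for a local directory as a stand-in, not tested against a target.

```python
import json
import sys
from datetime import datetime, timezone
from pathlib import Path


def iso_utc(timestamp):
    """Convert a POSIX timestamp to a timezone-aware ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def entry_record(path):
    """Build a dict with the size and MAC times of one entry."""
    st = path.lstat()  # lstat() does not follow symlinks
    return {
        "path": str(path),
        "size": st.st_size,
        "mtime": iso_utc(st.st_mtime),
        "atime": iso_utc(st.st_atime),
        "ctime": iso_utc(st.st_ctime),  # meaning of ctime depends on the platform
    }


def dump_entries(root, out=sys.stdout):
    """Write one JSON line per entry below root and return the entry count."""
    count = 0
    for path in root.rglob("*"):
        out.write(json.dumps(entry_record(path)) + "\n")
        count += 1
    return count


if __name__ == "__main__" and len(sys.argv) > 1:
    # Local stand-in: walk the directory given on the command line.
    dump_entries(Path(sys.argv[1]))
```

To try this on a target, the root argument would be t.fs.path("/") from the script above instead of a local Path.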

mnrkbys commented 10 months ago

Hi @pyrco, thanks for your reply. I have tried the target-info command.

>target-info X:\test_esxi.E01
2023-10-31T01:19:16.664453Z [warning  ] <Target \test_esxi.E01>: Can't identify filesystem: <Volume name='part_1f804000' size=115326464 fs=None> [dissect.target.target]
2023-10-31T01:19:16.666453Z [warning  ] <Target \test_esxi.E01>: Can't identify filesystem: <Volume name='part_38400000' size=2684354048 fs=None> [dissect.target.target]
2023-10-31T01:19:16.776454Z [error    ] Unable to import dissect.target.plugins.filesystem.yara [dissect.target.plugin]
<Target \test_esxi.E01>

Disks
- <Disk type="EwfContainer" size="299966445568">

Volumes
- <Volume name="part_00008000" size="4161024" fs="FatFilesystem">
- <Volume name="part_d8400000" size="4293918208" fs="FatFilesystem">
- <Volume name="part_1d8300000" size="292044436480" fs="NoneType">
- <Volume name="part_00404000" size="262127104" fs="FatFilesystem">
- <Volume name="part_0fe04000" size="262127104" fs="FatFilesystem">
- <Volume name="part_1f804000" size="115326464" fs="NoneType">
- <Volume name="part_26604000" size="299875840" fs="FatFilesystem">
- <Volume name="part_38400000" size="2684354048" fs="NoneType">
- <Volume name="part_1d8300000" size="291789340672" fs="VmfsFilesystem">

Hostname       : XXXXXXXX
Domain         :
Ips            : xxx.xxx.xxx.xxx
Os family      : esxi
Os version     : VMware ESXi 6.7.0-3.89.15160138
Architecture   : None
Language       :
Timezone       : None
Install date   : 1970-01-01 00:00:00+00:00
Last activity  : None

Also, the result of target-query + rdump was the same as that of the target-dump -f walkfs command.

mnrkbys commented 10 months ago

Thanks for sharing the sample code, @Schamper. I've tried it. Unfortunately, it outputs only 758 file entries.

Ultimately, I modified the walk() function in fs.py as follows to get the output I want.

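import datetime  # needed at the top of fs.py for the timestamp conversion


def walk(t, path, args):
    # Print ctime, mtime and atime (UTC, ISO 8601) followed by the path of each entry
    for e in path.rglob("*"):
        name = str(e)
        entry = e.get()
        # lstat() reports on a symlink itself rather than on its destination
        lstat = entry.lstat()
        symlink = f" -> {entry.readlink()}" if entry.is_symlink() else ""
        # Note: utcfromtimestamp() is deprecated as of Python 3.12
        utc_ctime = datetime.datetime.utcfromtimestamp(lstat.st_ctime).isoformat()
        utc_mtime = datetime.datetime.utcfromtimestamp(lstat.st_mtime).isoformat()
        utc_atime = datetime.datetime.utcfromtimestamp(lstat.st_atime).isoformat()
        print(f"{utc_ctime} {utc_mtime} {utc_atime} {name}{symlink}")

pyrco commented 10 months ago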

> Hi @pyrco, thanks for your reply. I have tried the target-info command.
>
> ...
>
> Also, the result of target-query + rdump was the same as that of the target-dump -f walkfs command.

Thanks for the info. The identical results are what we expected. For the reason @Schamper mentioned, this is probably the one case where target-dump -f walkfs returns fewer results than target-fs walk. As mentioned, we are in the process of addressing this, so bear with us for a bit :-)

I see you submitted a PR with some target-shell improvements in #431, which I guess will help with your use case. Once you accept the CLA, we can review the PR.