py4n6 / pytsk

Python bindings for The Sleuth Kit (libtsk)
Apache License 2.0

multiple observations of the pytsk/libtsk attribute interface #79

Open joachimmetz opened 3 years ago

joachimmetz commented 3 years ago

While working on https://github.com/log2timeline/dfvfs/issues/504 I made some observations about the pytsk/libtsk attribute interface:

joachimmetz commented 3 years ago

It is unclear why pytsk does not expose at least 1 attribute for the ext4 test image (the second attribute is not shown due to https://github.com/sleuthkit/sleuthkit/issues/2487):

istat test_data/ext4.raw 13
inode: 13
Allocated
Group: 0
Generation Id: 1673495854
uid / gid: 1000 / 1000
mode: rrw-rw-r--
Flags: Extents, 
size: 53
num of links: 1

Extended Attributes  (Block: 1331)
user.myxattr=My extended attribute

Inode Times:
Accessed:   2021-07-22 16:07:32.841610817 (CEST)
File Modified:  2021-07-22 16:07:32.841610817 (CEST)
Inode Modified: 2021-07-22 16:07:32.846610831 (CEST)
File Created:   2021-07-22 16:07:32.841610817 (CEST)

Direct Blocks:
1332

joachimmetz commented 3 years ago

pytsk uses tsk_fs_file_attr_get_idx https://github.com/py4n6/pytsk/blob/eeb7b69845668e52390d1fdfe9d12806b43f302f/tsk3.cpp#L562

which calls tsk_fs_attrlist_get_idx https://github.com/sleuthkit/sleuthkit/blob/0239c5934e348699d0be38f694fb6320252a91fc/tsk/fs/fs_file.c#L268

https://github.com/sleuthkit/sleuthkit/blob/0239c5934e348699d0be38f694fb6320252a91fc/tsk/fs/fs_attrlist.c#L370

Looks like ext2fs_load_attrs is the main libtsk function for ext extended attributes https://github.com/sleuthkit/sleuthkit/blob/develop/tsk/fs/ext2fs.c#L1984

which looks like it is invoked from tsk_fs_file_attr_check https://github.com/sleuthkit/sleuthkit/blob/0239c5934e348699d0be38f694fb6320252a91fc/tsk/fs/fs_file.c#L235

which in turn is invoked by tsk_fs_file_attr_get_idx.

So it looks like pytsk is invoking the right API function.

joachimmetz commented 3 years ago

With some tweaking of the pytsk code to remove sanity checks, it looks like libtsk claims to have only 1 attribute

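import pytsk3

# Open the ext4 test image and the file system it contains.
img = pytsk3.Img_Info('dfvfs/test_data/ext4.raw')
fs = pytsk3.FS_Info(img)
# Open the file by inode number (13, as in the istat output above).
f = fs.open_meta(inode=13)
# Iterating the file object yields its attributes.
[a for a in f]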

ext2fs_load_attrs calls ext4_load_attrs_extents https://github.com/sleuthkit/sleuthkit/blob/develop/tsk/fs/ext2fs.c#L1989

which then ends up branching into https://github.com/sleuthkit/sleuthkit/blob/develop/tsk/fs/ext2fs.c#L1901

which does appear to only set the extents of the default data stream. It is unclear at this point what the libtsk API for getting ext[2-4] extended attributes is. Let's see if there is going to be any response from upstream.
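To make the "only 1 attribute" observation easier to reproduce, the attributes can be summarized rather than just listed. The following is a minimal sketch, not part of pytsk: the helper names `summarize_attributes` and `summarize_inode` are hypothetical, and it assumes each attribute object exposes `info.type`, `info.name` and `info.size`, mirroring the libtsk `TSK_FS_ATTR` fields.

```python
def summarize_attributes(tsk_file):
    """Return (type, name, size) for every attribute that iteration yields.

    tsk_file is expected to behave like a pytsk3 File object: iterating it
    yields attribute objects carrying an `info` member (assumption).
    """
    summary = []
    for attribute in tsk_file:
        info = getattr(attribute, "info", None)
        if info is None:
            # Skip attributes without usable metadata.
            continue
        name = getattr(info, "name", None)
        if isinstance(name, bytes):
            name = name.decode("utf-8", errors="replace")
        summary.append((info.type, name, info.size))
    return summary


def summarize_inode(image_path, inode):
    """Open image_path with pytsk3 and summarize the attributes of inode."""
    # Imported lazily so summarize_attributes() has no hard dependency.
    import pytsk3

    img = pytsk3.Img_Info(image_path)
    fs = pytsk3.FS_Info(img)
    return summarize_attributes(fs.open_meta(inode=inode))
```

Calling `summarize_inode('dfvfs/test_data/ext4.raw', 13)` should then show which attribute types libtsk reports for the inode. If the reading of the code above is right, only the default data attribute would be listed and no entry for user.myxattr.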

joachimmetz commented 2 years ago

istat ext2.raw 15
inode: 15
Allocated
Group: 0
Generation Id: 3892545622
uid / gid: 1000 / 1000
mode: rrw-rw-r--
size: 22
num of links: 1

Extended Attributes  (Block: 162)
security.selinux=unconfined_u:object_r:unlabeled_t:s0

Inode Times:
Accessed:   2021-07-22 16:07:32 (CEST)
File Modified:  2021-07-22 16:07:32 (CEST)
Inode Modified: 2021-07-22 16:07:32 (CEST)

Direct Blocks:
515 

But the attribute interface returns 2x TSK_ATTR_RUN (one with offset: 515, size: 1 and one sparse with size: 15)
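The runs reported by the attribute interface can be listed with a short script. This is a sketch under assumptions: `describe_runs` and `describe_inode_runs` are hypothetical helper names, and the run objects are assumed to expose `addr`, `len` and `flags` as the libtsk `TSK_FS_ATTR_RUN` structure does. The sparse flag is passed in as a parameter so the helper can be exercised without an image.

```python
def describe_runs(attribute, sparse_flag):
    """Return (kind, addr, length) for each run of a single attribute.

    Iterating a pytsk3 attribute is expected to yield its runs; addr and
    length are in file system blocks, not bytes.
    """
    runs = []
    for run in attribute:
        kind = "sparse" if run.flags & sparse_flag else "data"
        runs.append((kind, run.addr, run.len))
    return runs


def describe_inode_runs(image_path, inode):
    """Return one list of runs per attribute of the given inode."""
    # Imported lazily so describe_runs() can be used without pytsk3.
    import pytsk3

    img = pytsk3.Img_Info(image_path)
    fs = pytsk3.FS_Info(img)
    tsk_file = fs.open_meta(inode=inode)
    return [
        describe_runs(attribute, pytsk3.TSK_FS_ATTR_RUN_FLAG_SPARSE)
        for attribute in tsk_file
    ]
```

For `describe_inode_runs('ext2.raw', 15)` the observation above would correspond to one data run at block 515 with length 1, followed by one sparse run of length 15.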

libfsext_data_blocks_read_data: block data at depth: 0:
00000000: 03 02 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
00000010: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
00000020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........

libfsext_data_blocks_read_data: block number at depth: 0                : 515

libfsext_data_blocks_read_data: logical block number                    : 0
libfsext_data_blocks_read_data: physical block number                   : 515
libfsext_data_blocks_read_data: number of blocks                        : 1

Not sure why the attribute interface is adding an additional "run".
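As a cross-check, the block data from the libfsext debug output can be decoded by hand. The sketch below assumes the 48 dumped bytes are twelve 32-bit little-endian block numbers, which is how ext2 stores the direct block numbers in an inode. The variable names are only illustrative.

```python
import struct

# The 48 bytes of block data shown in the libfsext debug output.
DUMP = """
03 02 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
"""

block_data = bytes.fromhex("".join(DUMP.split()))

# Assumption: twelve unsigned 32-bit little-endian block numbers.
block_numbers = struct.unpack("<12I", block_data)

# Keep only the entries that actually point to a block.
populated = [number for number in block_numbers if number != 0]

print(block_numbers[0])  # 0x00000203 == 515
print(populated)         # [515]
```

Only the first entry is non-zero, which matches the single direct block (515) that istat reports and the single extent that libfsext reads.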