log2timeline remove partition offset options

joachimmetz commented 5 years ago

partition offset options for log2timeline.py are arcane and not user friendly, let's remove them

joachimmetz commented 5 years ago

Removed --offset, --ob, --sector-size

https://github.com/log2timeline/plaso/pull/2233

Keeping --partition besides --partitions for now.

joachimmetz commented 5 years ago

This has been completed

kero99 commented 5 years ago

oh no!!! @joachimmetz, the --offset option was so good for unattended scripting, now we will need to map the offset to a loop device first. The unfriendly options are not always bad :(, I don't understand this clean up :((((((((((((

joachimmetz commented 5 years ago

@kero99 you still have --partition or --partitions which is likely even easier to script, since you don't need to determine the offset any more and just indicate which partition you want to process.

The unfriendly options are not always bad :(, i don't understand this clean up :((((((((((((

So I don't really understand this strong reaction.

kero99 commented 5 years ago

when I analyze a disk with mmls for example, I know exactly what partition I want to analyze with plaso thanks to its offset ... the partition number is not as ... exact... i think

joachimmetz commented 5 years ago

the slot number you see in the mmls output relates to the partition. If you can determine the partition based on offset, you can also do this based on slot number (or partition number). The problem is that offset does not work with more complex scenarios, and it also leads to a lot of confusion because mmls shows the start sector number and not the byte offset.

kero99 commented 5 years ago

Thank you for the fast response @joachimmetz =),

In this example the slot is 002 (p2?), but for plaso I will set --partition 1.

      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000000062   0000000063   Unallocated
002:  000:000   0000000063   0020948759   0020948697   NTFS / exFAT (0x07)
003:  -------   0020948760   0020971519   0000022760   Unallocated

Sorry, I do not understand the correlation :(

For scripting it is very easy to get the sector size and the sector offset and calculate the byte offset, but I'm not sure about the partition number :S
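For illustration, the byte offset that options such as mount -o offset= expect is the mmls start sector multiplied by the sector size. A minimal sketch, assuming 512-byte sectors (the mmls header line that states the unit size is not shown in the listing above) and a hypothetical helper name:

```python
def sector_to_byte_offset(start_sector: int, sector_size: int = 512) -> int:
    """Convert an mmls start sector into a byte offset.

    sector_size=512 is an assumption; use the unit size mmls reports.
    """
    if start_sector < 0 or sector_size <= 0:
        raise ValueError("start_sector must be >= 0 and sector_size > 0")
    return start_sector * sector_size


# Start sector of the NTFS partition in the listing above.
ntfs_offset = sector_to_byte_offset(63)
print(ntfs_offset)  # 32256
```

Passing the sector number where a byte offset is expected (or the reverse) is exactly the mix-up discussed in this thread, so it may help to keep the conversion in one named function.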

joachimmetz commented 5 years ago

I've changed the formatting so you can see that 002: is not the slot number, but number of the "part info" (not partition) as sleuthkit calls this https://github.com/sleuthkit/sleuthkit/blob/develop/tsk/vs/mm_part.c . Note that it overloads the term partition as well.

The value "000:000" relates to the first primary partition, "000:001" to the second primary partition, "001:000" to the first extended partition, etc.

For log2timeline.py just count the number of slots that refer to actual partitions.

So the mmls slot terminology does not correspond with that of fdisk and gdisk. Which would refer to "000:000" as slot 1 and "001:000" slot 5.
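The counting rule described above could be sketched as follows. This is only an illustration under stated assumptions: the function name and regular expressions are hypothetical, the parser expects the column layout of the listing earlier in this thread, and whether the resulting count matches the partition number plaso uses for every layout (for example extended or hybrid tables) has not been verified here.

```python
import re

# Matches an mmls row such as:
# 002:  000:000   0000000063   0020948759   0020948697   NTFS / exFAT (0x07)
_ROW = re.compile(r"^\s*(\d+):\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(.*)$")
# Slot values of the form "000:000" refer to actual partitions;
# "Meta" and "-------" rows do not.
_PARTITION_SLOT = re.compile(r"^\d+:\d+$")


def partition_numbers(mmls_output: str) -> dict:
    """Map each partition's start sector to its 1-based position."""
    numbers = {}
    count = 0
    for line in mmls_output.splitlines():
        match = _ROW.match(line)
        if not match or not _PARTITION_SLOT.match(match.group(2)):
            continue  # header, Meta or unallocated row
        count += 1
        numbers[int(match.group(3))] = count
    return numbers


MMLS_OUTPUT = """\
      Slot      Start        End          Length       Description
000:  Meta      0000000000   0000000000   0000000001   Primary Table (#0)
001:  -------   0000000000   0000000062   0000000063   Unallocated
002:  000:000   0000000063   0020948759   0020948697   NTFS / exFAT (0x07)
003:  -------   0020948760   0020971519   0000022760   Unallocated
"""

print(partition_numbers(MMLS_OUTPUT))  # {63: 1}
```

For the listing above this yields partition 1 for start sector 63, which agrees with the --partition 1 value mentioned earlier.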

We've seen cases with mixed MBR/GPT partition tables where the output of mmls is plain wrong (offset and/or sector size), same with edge cases like images created with Norton Ghost. This means for us that the offset is not reliable in automation.

@kero99 could you give us some context why you have mmls list the partition table before running log2timeline?

kero99 commented 5 years ago

Of course. I use some scripts in forensic investigations to automate the most typical things that I do. For example:

1. Mount evidence (mount -o offset=XXX,....)
2. Filesystem timeline (fls)
3. Regripper
4. Extract files by type (icat)
5. Carving (foremost, scalpel, photorec...)
6. Parse some logs...
7. Plaso by artifacts with my custom filters (Yes! I love plaso)
8. Other things

In this example, I need the offset for steps 1, 2, 3 (mount evidence related), 4, 6 (mount evidence related), 7 (until the last version :( ) and 8.

As you can see, offset is required by many utilities.

joachimmetz commented 5 years ago

seeing you're using artifacts/custom filters why not run log2timeline.py with --partitions all? then you do not need to determine the partition or offset.

Also if you pick only the Windows volume you might miss out on the Windows boot partition which contains the Boot Configuration Data (BCD).
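As a rough sketch of how a script might assemble such a call: the --partitions option and the value "all" come from this thread, while the helper name, the file names and the positional order (storage file first, then source) are assumptions that should be checked against log2timeline.py --help for the installed version.

```python
def build_log2timeline_command(storage_file, source, partitions="all"):
    """Build an argument list for log2timeline.py without running it.

    The positional order (storage file, then source) is an assumption
    about the installed plaso version.
    """
    return [
        "log2timeline.py",
        "--partitions", str(partitions),
        storage_file,
        source,
    ]


# Hypothetical file names, for illustration only.
command = build_log2timeline_command("timeline.plaso", "evidence.raw")
print(" ".join(command))
```

Returning a list rather than a single string lets the caller hand it to subprocess.run without shell quoting issues.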

kero99 commented 5 years ago

Because in the scripts I have different executions and output directories per partition type or filesystem :(. For example, if I have 50 different pieces of evidence with different partitions and different OSes, I execute different things and use different output directories per evidence and per partition, because I don't need to execute the same thing on a Windows OS partition, an NTFS data-only partition, an HFS+ partition or an EXT4 partition.

Sorry for my English.

joachimmetz commented 5 years ago

but why does that matter for running plaso? I'm trying to understand your plaso use case here?

kero99 commented 5 years ago

Until the last version I only calculated the partition offset once and then ran every forensics tool (they all use the offset) per partition with that offset. Now, with the latest version of plaso, the offset is useless and I need to calculate another, less standard value: the "partition number". No problem adapting my script but... Why is it necessary to remove the offset? I think it is a standard and very useful value.

And the scary part of this is when you said: Keeping --partition besides --partitions for now.

joachimmetz commented 5 years ago

That still does not answer my question, why does that matter for running plaso for you? Couldn't you just change step 7 to do all partitions at once? What do you execute different for plaso on different file systems?

Why is it necessary to remove the offset?

Because of:

the number of times we get reports that plaso does not work because people mix up sector number and byte offset or get the sector size wrong, as I indicated from a UX perspective
we want plaso to support more complex layouts e.g. compressed images, fusion drives, multi disk LVMs, sw raids. In that context what does "offset" represent?

And the scary part of this is when you said: Keeping --partition besides --partitions for now.

These are currently aliases and we technically only need one, where --partitions is more descriptive of its purpose than --partition

kero99 commented 5 years ago

Couldn't you just change step 7 to do all partitions at once?

Because my scripts discriminate and analyze by partition, not by full disk. When I want to analyze 300 machines from a DMZ with scripting, sometimes I don't want to analyze D: E: F: G: H: L: M:... sometimes I only want the OS (usually C:). I can use fls, icat, mount, your vshadowinfo and all the other tools with an offset... but not plaso now, this is the point.

The number of times we get reports about people reporting plaso does not work because they mix up sector number and byte offset or get the sector size wrong, as I indicated from a UX perspective

This is not entirely fair, sacrifice the advanced user for the most novel

We want plaso to support more complex layouts e.g. compressed images, fusion drives, multi disk LVMs, sw raids. In that context what does "offset" represent?

Yes, offset is not relevant in some scenarios, but it is in others. Following that philosophy, should not you remove the vss support too?

Do not take this discussion as a criticism of the project, i love plaso and your work, but I think it is healthy to have this kind of discussions =)

Best regards!

joachimmetz commented 5 years ago

Yes, offset is not relevant in some scenarios, but it is in others. Following that philosophy, should not you remove the vss support too?

You sound a bit frustrated and I'm trying to determine how I can best help you. Frustration does not help with the discussion. I need to understand your use-case, but the only thing I hear is that you want to use an offset to select a specific partition because of how your script/automation currently works.

I'm not hearing how you use plaso or what data you extract. If you use artifact / parser / filter file selection, then only the files you specify are processed. And why is it relevant to have a timeline per file system / partition? Knowing this would help me decide whether to add --partition_offset as an alternative.

Following that philosophy, should not you remove the vss support too?

It might be replaced at some point, e.g. currently we cannot cleanly handle multiple volumes with VSS.

This is not entirely fair, sacrifice the advanced user for the most novel

There are many things that are not fair, but time spent explaining the same thing to "novel" users over and over is not time spent on improving plaso. Meanwhile, IMHO "advanced users" should be able to help debug, contribute PRs and describe use-cases.

kero99 commented 5 years ago

I'm sorry if I seem frustrated. It's not like that; it's just that I don't express myself very well in English.

I'm not hearing how you use plaso or what data you extract. If you use artifact / parser / filter file selection, then only the files you specify are processed. And why is it relevant to have a timeline per file system / partition?

OK, this is how I use plaso:

1) Run mmls to check the disk.
2) Use a simple mix of fls/fsstat/regripper/xxd on every partition to detect the filesystem type, whether it holds an OS, whether it is encrypted... Why? Because on a disk you can have several file systems, but in a partition you can only have one file system and only one operating system, correct?
3) Generate the plaso execution depending on the filesystem and OS detected. For example:

If Win7/8/10 is detected, I run webhist, winreg, prefetch and other presets and parsers in a loop (from fastest to slowest) with a dedicated filter. Why do I use a loop and not a full preset? Because after every loop step I have information to investigate and I don't need to wait for everything to finish. In IR time is very important, and plaso needs this tuning to be fast.
If EXT2/3/4 is detected, I run only Linux parsers (not all of them) and Linux filters (no artifact filters at the moment... because they fail in some cases and are not supported in others :( ).
If NTFS is detected but no OS... I run other parsers with no filters...

The main idea is to make plaso very fast for IR scenarios.
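The selection step in the workflow above could be sketched roughly like this. It is an illustration only: the names webhist, winreg and prefetch come from the comment, every other entry is a labeled placeholder, the listed order is not a measured fastest-to-slowest ranking, and the function and table names are hypothetical.

```python
from typing import List, Optional

# Parser/preset names to run per detected partition type. Entries in
# angle brackets are placeholders, not real plaso parser names.
PARSER_PLAN = {
    "windows_os": ["webhist", "winreg", "prefetch"],
    "ext": ["<selected linux parsers>"],
    "ntfs_data": ["<other parsers>"],
}


def plan_runs(filesystem: str, detected_os: Optional[str]) -> List[str]:
    """Return the parser selections to run, one plaso run per entry."""
    fs = filesystem.lower()
    if detected_os and detected_os.lower().startswith("win"):
        return list(PARSER_PLAN["windows_os"])
    if fs in ("ext2", "ext3", "ext4"):
        return list(PARSER_PLAN["ext"])
    if fs == "ntfs":
        return list(PARSER_PLAN["ntfs_data"])
    return []  # unknown layout: nothing planned


# Each entry becomes its own run, so earlier results can be reviewed
# while later runs are still in progress.
for parsers in plan_runs("NTFS", "Windows 10"):
    print("would run plaso with parsers:", parsers)
```

Keeping the plan in a plain dictionary makes it straightforward to reorder or extend the runs per file system without touching the selection logic.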

Best regards.

joachimmetz commented 5 years ago

Thx, this is useful context.

Know that there is OS detection in plaso that maps to a specific preset, with the recent version you can define your own presets for different versions of OSes by config file. We are working towards more artifact driven collection.

because on a disk you can have several file systems but in a partition you can only have one file system and only one operating system, correct?

You can have multiple OSes on the same file system, e.g. multiple versions of Windows, Windows/Linux hybrid.

I think what we could do in the meantime is add a --partition-offset option, which is a more specific name than --offset, but it sounds to me like you're duplicating functionality that is in plaso but does not fully match your needs. I'd opt to talk further about these and see if some of them could/should be handled by plaso.

kero99 commented 5 years ago

I think what we could do in the mean time is add a --partition-offset option, which is a more specific name than --offset

Yeah!!!! That would be perfect =)

Know that there is OS detection in plaso that maps to a specific preset, with the recent version you can define your own presets for different versions of OSes by config file. We are working towards more artifact driven collection.

Could you tell me where to find more information about this? I could not find anything about that configuration file in the manual.

I opt to further talk about these and see if some of these could/should be handled by plaso.

Thank you soo much for your amazing work =)

joachimmetz commented 5 years ago

Yeah!!!! That would be perfect =)

Don't pop the champagne just yet (I said "could"). I'll evaluate the partition offset against other ideas on the table, e.g. having a shorthand way of defining dfVFS path specs (which would be able to address some of the other shortcomings of the current command line arguments approach).

Could you tell me where to find more information about this? I could not find anything about that configuration file in the manual.

Not yet in the plaso documentation, we mentioned it here: http://blog.kiddaland.net/2019/02/plaso-20190131-released.html

joachimmetz commented 5 years ago

Note to self: seeing that mmls can return overlapping information for hybrid mbr/gpt I think offset alone is not going to cut it (also see: https://github.com/sleuthkit/sleuthkit/issues/1444)

joachimmetz commented 3 years ago

Additional context on why partition offset in combination with Sleuthkit can lead to errors: https://github.com/sleuthkit/sleuthkit/issues/2123

log2timeline / plaso

log2timeline remove partition offset options #2232