log2timeline / plaso

Super timeline all the things
https://plaso.readthedocs.io
Apache License 2.0

Cannot reproduce filenames with "odd characters" from memory dump in plaso #323

Closed joachimmetz closed 9 years ago

joachimmetz commented 9 years ago

From @pettai in https://github.com/log2timeline/plaso/issues/316#issuecomment-136850968

I don't know if it's a separate bug/issue or if it relates to any of the improvements here, but I noticed that plaso doesn't seem to create an event for filenames with odd characters, like "svchost.exeÂ" or "svchost.exe?" (when extracted from a memory dump)

joachimmetz commented 9 years ago

Let's get a couple of things clear first:

pettai commented 9 years ago

Perhaps I wasn't clear enough: I mentioned this in the "improve NTFS support" issue because it's on an NTFS filesystem that a new file named "svchost.exeÂ" is created, and this event never shows up when log2timeline.py processes the disk image. That's the issue I noticed while using plaso from the gift/stable and gift/testing PPAs. I haven't tried Bleeding edge yet; perhaps that issue was already fixed there?

(And when looking at a memory dump, the filename looks like "svchost.exe?", but that's just a secondary observation. I haven't tried using plaso on that dump.)

Onager commented 9 years ago

@pettai this is still a little confusing. Let me see if I understand:

  1. You ran log2timeline on an NTFS file system that contained a file named "svchost.exeÂ"
  2. You then ran psort on the storage file from log2timeline
  3. When you looked at the output from psort, there were no events for "svchost.exeÂ"

A couple of questions then:

  a. Is this summary accurate?
  b. How were you accessing the file system from (1)? Is it in a disk image? If so, what format is the image: raw, split, ewf, vmdk? Alternatively, is the file system mounted instead?
  c. How do you know there's a file called "svchost.exeÂ" on the file system? Which tool is telling you this?

joachimmetz commented 9 years ago

Also, where is this file name stored? In memory? In $UsnJrnl:$J?

pettai commented 9 years ago

a) Yes, points 1-3 are how I did it.

b) The first test run was on a .vmdk file (Windows Server 2008). I've now done a second test on another .vmdk image (also Windows Server 2008). Here, log2timeline creates an event for this file (named "svchost.exe "), so for the other image it worked.

c) Regarding the filename, I looked thru everything again.

Regarding d) I will have to test rekall on the memory dump to see what it outputs.

So, it seems that log2timeline finds and creates an event for "svchost.exe " on at least one other image that also contained it. I'll have to dig more to see why the first image fails.

joachimmetz commented 9 years ago

Based on your description I assume you are getting the file name from the MFT. plaso gets the file name from sleuthkit, so you could check with e.g. fls what sleuthkit returns. The '?' is a typical placeholder character for an unsupported encoding. The Â looks like a codepage error.
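To illustrate how both renderings can come from one underlying name, here is a minimal Python sketch. It assumes the odd character is a NO-BREAK SPACE (U+00A0); the thread does not show the actual bytes, so treat the input as hypothetical. It demonstrates general encoding behavior only and does not reproduce what plaso or sleuthkit do internally.

```python
# Assumption: the trailing "odd character" is U+00A0 (NO-BREAK SPACE).
# The real bytes from the image were never shown in this thread.
name = "svchost.exe\u00a0"

# UTF-8 encodes U+00A0 as the two bytes 0xC2 0xA0.
utf8_bytes = name.encode("utf-8")

# Codepage error: reading those UTF-8 bytes as a single-byte code page
# turns 0xC2 into "Â" and leaves 0xA0 as an invisible no-break space.
mojibake = utf8_bytes.decode("latin-1")

# Unsupported encoding: converting to a charset that lacks the character
# substitutes the '?' placeholder.
placeholder = name.encode("ascii", errors="replace").decode("ascii")

print(repr(mojibake))     # 'svchost.exeÂ\xa0'
print(repr(placeholder))  # 'svchost.exe?'
```

If the name on disk really does end in U+00A0, the "Â" and "?" variants would be display artifacts of the tools involved rather than different file names.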

The NTFS changes are not going to address this, since if the filename is valid with a space it should be represented as such.

So, it seems that log2timeline finds and creates an event for "svchost.exe " on at least one other image that also contained it. cmd, explorer et al. show the odd character as an extra space character (not as a Â), but that's hard to detect in a listing etc.

What we could do here in plaso is add an analysis plugin to detect file name anomalies like trailing spaces for known file names and have plaso tag those events.

Unicode spaces: https://www.cs.tut.fi/~jkorpela/chars/spaces.html
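The detection idea could be sketched roughly as follows. This is a standalone illustration, not plaso's analysis plugin API: the names KNOWN_NAMES, _is_padding and find_lookalike are hypothetical, and the watch list is an assumed example.

```python
import unicodedata

# Assumed watch list of commonly impersonated names (hypothetical).
KNOWN_NAMES = {"svchost.exe", "lsass.exe", "explorer.exe"}

# Unicode categories for space/line/paragraph separators and
# invisible format characters such as ZERO WIDTH SPACE.
_PADDING_CATEGORIES = {"Zs", "Zl", "Zp", "Cf"}


def _is_padding(char):
    """Return True for whitespace or invisible padding characters."""
    return char.isspace() or unicodedata.category(char) in _PADDING_CATEGORIES


def find_lookalike(filename):
    """Return the known name that filename imitates, or None.

    A match requires at least one trailing padding character, so an
    exact, legitimate "svchost.exe" is not flagged.
    """
    end = len(filename)
    while end > 0 and _is_padding(filename[end - 1]):
        end -= 1
    if end == len(filename):
        return None  # nothing was stripped, so nothing is disguised
    candidate = filename[:end].lower()
    return candidate if candidate in KNOWN_NAMES else None


for sample in ("svchost.exe", "svchost.exe ", "svchost.exe\u00a0"):
    print(repr(sample), "->", find_lookalike(sample))
```

An event whose file name yields a non-None result could then be given a tag; the tag label and how it is attached to the event are left open here.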

pettai commented 9 years ago

Argh, seems the first old dump I tried on plaso was broken :( It works for others that I've tested. Now I regard this as more of a search/presentation issue in timesketch. https://github.com/google/timesketch/issues/107

You may close this issue

joachimmetz commented 9 years ago

You may close this issue

Thanks for confirming, though I'll sync with @berggren as well on whether we should implement an analysis plugin for this and tag such events appropriately. That can help timesketch show similar events as such. Since Elasticsearch is based on Lucene, it will likely do a word-based search, for which literal searches are not optimal. A tag that clearly indicates this file name is made to look like svchost.exe is IMO a better long-term strategy, since we could use it in additional processing logic as well.

joachimmetz commented 9 years ago

Moving feature request to: https://github.com/log2timeline/plaso/issues/325