Closed GoogleCodeExporter closed 9 years ago
Hmm good point. A file length should be limited to MAX_PATH characters (which
is actually 260 and not 255 anyway, so that should be fixed). The other point
you bring up is that many other plugins inherit from FileScan and use
parse_string, and those other objects may not be limited to MAX_PATH.
Original comment by michael.hale@gmail.com
on 9 Oct 2011 at 2:48
Not in front of my computer right now. The parse string function is removed
in my branch in favour of Unicode string. This is possible by extending v()
to support a different as.
This might remove the need for special helper methods.
Original comment by scude...@gmail.com
on 9 Oct 2011 at 8:10
Ah yeah, that is an option. We should tie this in with Issue #87 also (where
scudette originally described his replacement for parse_string and
OBJECT_HEADER stuff).
Only thing we'd have to check is make sure everything that uses parse_string
through inheritance is really a UNICODE_STRING and not just an LPWSTR.
Original comment by michael.hale@gmail.com
on 10 Oct 2011 at 3:08
v() seems like a convenient solution, rather than a good one. It's our problem
that we instantiated an object that contains a pointer using a physical address
location, and the object API is not the right place to compensate for that.
For most objects, asking for their value and handing in a different AS makes no
sense (particularly without changing the offset). And for most users having a
parameter which is never used makes no sense. Given that, and how it's nearly
impossible to document well, it's not a good API to support. This function is
*only* useful for specific scenarios where the value of an object depends on
dereferencing another.
I would much rather a way for objects to know which AS they'd been instantiated
in, and whether a subobject dereference would work. Since that would be quite
complex, I think it would be better in the short term to have the helper
function (formerly known as parse_string) corrected, and added to the
UnicodeString object as an additional function, which will allow getting the
value from the object when it's been defined in a physical address space. It's
currently called UnicodeString.virtual_v(virtual=vm), but I'm looking for a
better name so please suggest some.
This would then make it clear to the user/developer what was going on,
centralize and properly organize/locate the helper code and keep the rest of
the API unharmed.
I've worked up an initial patch to remove parse_string support and would really
like a code review of it. It ditches FileScan.kernel_address_space, since
carrying it around in the plugin seemed worse than passing the translated
objects through to calculate. It does still make use of _OBJECT_HEADER.kas,
and it doesn't do anything concerning _OBJECT_TYPE. It'll take me a little
longer to port that from scudette's work. It does change the calculate output
from some plugins, but all of the inherited plugins overrode the calculate
function.
Lastly, I'd also appreciate knowing how best to fix up the code taking into
account MAX_PATH. If someone knows which of the uses of parse_string may be
LWPWSTRs instead of UNICODE_STRINGs please let me know...
Original comment by mike.auty@gmail.com
on 16 Oct 2011 at 8:42
Attachments:
I disagree with this. We do need a wider discussion if we are to get a
better more flexible API.
Indeed the whole point of the API is about convenience. You can always
re-instantiate Object() over the top of different offsets as it is
currently done in the helper functions - but that makes it tedious and
error prone (as is evident in the different instances of the same
helper functions peppered around the place). The point of the API is
to make common things seemless and I dare say that switching address
spaces is a very common operation in memory forensics.
It is often not our choice which address space we start in - e.g. file
scanners may start with physical and want to switch to virtual. There
is no "fault" as such - it just depends on the analysis technique.
I disagree. The v() function mean, evaluate this object in some
context - an AS is critical to the context. In many objects v()
returns a BaseObject instance, for pointers it mean dereference etc.
It makes sense to provide the address space in which it should be
evaluated right there. May I also remind you that way back in 1.2 and
1.3 days we instantiated every object with an AS parameter - so every
time we dereferenced a struct member or a pointer we provided an AS
and did switch from one to the other all the time.
That is the point of a default parameter. It is sometimes used, but often not.
I disagree. The meaning of v() is object dependent anyway for all
objects. For example for pointers it means to dereference the pointer
and returns an instance of the thing it was pointing to - for integers
it actually returns an integer (i.e. not a BaseObject instance at
all). The current API seems inconsistent enough - its still reasonably
documented and pretty obvious what we should get.
Indeed, the address space simply specifies the environment which
should be used to dereference it. Since an integer has the same value
regardless of address space it is not affected - however a
UNICODE_STRING does depend on the address space etc.
Objects do know which AS they are instantiated in. I am not sure what
you mean by subobject.
Would you also add additional methods to every other Object ? For example:
- A pointer would need to be derefered in a different AS - pointers
should get virtual_v().
- A linked list would most definitely need a virtual_v().
- A UNICODE_STRING would need a virtual_v().
The only thing which does not need a virtual_v() are the native types
(an integer has the same value in all AS's).
Would a call to virtual_v() with no args be exactly equivalent to a
call to the current v()? and if so can we just alias one to the other
so we dont need to re-implement the same thing multiple times
increasing the maintenance costs?
I am pretty sure that if a developer wants to switch AS's they have a
reason to and understand why. If they dont understand what its for,
they are unlikely to provide this optional value. Lets give our
developers a bit more credit.
I am against adding new ad hoc methods to extend the API in
inconsistent ways. I would rather the API be extended in a logical
coherent manner - yet with backwards compatibility as much as
possible. I see the addition of a new method which is mostly the same
as another method but with a small tweak as an awkward API. This would
be ok in a language like C which does not provide optional parameters,
but in python this is just ugly.
Original comment by scude...@gmail.com
on 16 Oct 2011 at 9:43
I'm sensing that you disagree with this... 5:)
How did you get "evaluate this object in some context" from the single letter
v()? Even the docstring doesn't mention context. The point of proxying the
objects was so that people didn't even need to call v(), and as it currently
stands v() is most commonly used to get the actual python object rather than a
proxied one (and it's a shame that the v() function is needed at all). There
are currently no circumstances where v is context dependent, the object itself
is supposed to maintain its context, and that's what I was trying to get across
but seem to have done a bad job of communicating.
If we want to allow objects to be instantiated in ASes that they were not
originally created in, then an object should either hold all the state about
which ASes it can/should be interpreted in, or none of them. Hanging onto one,
that may be the wrong one, and then not even knowing which one it is, is not a
good idea.
Currently we've got effectively two different types of object floating around,
those that are instantiated in VAS, and those in PAS. The PAS ones are broken,
and should not allow any member to be dereferenced, but currently do. It would
be best if the object were aware about all the possible interpretations it
could have (offset/vm for PAS, offset/vm for VAS and possibly even offset/vm
for process space). That way when it's asked for a member, it can figure out
whether it has enough information to provide it or not. There would be a
specific way of telling an object which spaces it can exist in, and where it
lives in each space.
If we don't think the objects should be carrying their own state around, but
merely be templates to apply to an AS, then we should go back to explicitly
providing the AS every time we ask for a value. What we've got at the moment
is a middle ground where we assumed we'd only instantiate valid objects in the
AS they were created in, and that's worked so far, but is now causing this
discussion.
The current API shouldn't be inconsistent, and if it were, that's not a good
reason to continue on in the same way, it's a reason to fix it. If you check
the code, Pointer.v() returns the integer value of the offset it's pointing at,
and I believe it has done since we first wrote that code.
Pointer.dereference() returns the dereferenced object (and makes use of .v() to
get the target offset). As such this seems consistent to me, and indicates
that we provide special functions to certain objects when they're required,
such as pointers.
You're suggesting that there are three cases where we need support for
switching ASes:
- Pointers I'm relatively happy to add a parameter to dereference_as (and in
fact, this is in the patch I provided), because it makes sense that a pointer
in one space may point to an object in another, and it makes sense to add it to
that function (although you might question why there's a dereference and a
dereference_as, given your argument about default parameters).
- UnicodeString's v() is just another convenience shortcut so that people don't
have to dereference the buffer themselves. If they're working across two
different ASes, then they should have to dereference it themselves, or use a
second function to do so. Overloading v() is just lazy, and in my opinion
changes the meaning of v(), which seemed quite clear to me before this.
- If linked lists need the ability as well, then it should be evaluated and
added as necessary. I think three types across all the types that are normally
accessed (every different variety of NativeType and every different variety of
CType save for UNICODE_STRING) is a tiny amount, and not a good enough reason
to add a dummy parameter to every type in existence. It's skirting around the
problem, rather than designing a good way of handling it.
At the moment a call to virtual_v with no arguments would be equivalent to
.v(), however we can require that an AS always be provided to virtual_v() if
you'd prefer to distinguish this specialist helper function from the standard
v() function. If you want to alias them for that class then you will find
people running into "differing numbers of argument" errors, and I've already
explained why I'm against adding it to every single type in existence for the
sake of three specialist cases.
If a developer wants to switch ASes, they currently have to overlay the object.
With either of our solutions they call an additional function and hand in the
AS, so on that front there's no difference between us.
Having the ability to add ad-hoc methods to objects when they required it is
exactly why object classes exist, and precisely why every object doesn't have a
"get_object_type" function, or a "get_process_address_space" function. These
are specialist functions not required by all, and therefore best defined in the
objects that need them.
However, since it's clear we differ on this matter, it's probably best to get
some external input. Anyone else, comments, suggestions, etc please?
Original comment by mike.auty@gmail.com
on 16 Oct 2011 at 10:50
The parse_string function has now been removed in favour of making use of the
UNICODE_STRING implementation. As such, the length issue on this bug is fixed,
marking as such.
Original comment by mike.auty@gmail.com
on 22 Jan 2012 at 6:35
Original issue reported on code.google.com by
carl.pulley
on 9 Oct 2011 at 12:39