Closed GoogleCodeExporter closed 9 years ago
Hiya Joe,
Thanks for submitting this patch. It seems as though it's effectively
deprecating the memory_model setup for an arch one. That may well be the best
way to go, but if we're going to do that we should do it across the whole
codebase, and we should do it before 2.2, since the memory_model is (I believe)
new in 2.1.
So there are two options, and I'm going to ask for opinions on them:
a) Have the ARM architecture dig out the the appropriate value from the linux
profile (bad, because it's making the AS have profile-specific code in it).
b) Convert all memory_model checks to arch checks. The question is, will there
ever be a single arch that can be both 32-bit and 64-bit at the same time?
So this needs a bit of thinking about before applying I'm afraid. However,
it's thinking we'll need to do soon if we don't want it to delay the 2.1
release, so hopefully people will jump on this...
I'll try to find some time to mock-up a complete memory_model replacement patch.
Original comment by mike.auty@gmail.com
on 1 Jul 2012 at 10:49
Original comment by mike.auty@gmail.com
on 1 Jul 2012 at 11:18
I'm not sure if the best route is to depreciate memory_model checks all
together. There may certainly be architectures that support multiple memory
models. They did announce last year that there would be 64-bit ARM support in
v8 (although that might be considered a different architecture all together).
http://www.eetimes.com/electronics-news/4230160/ARM-unveils-64-bit-architecture
But what about ARM systems that are big-endian and that are little-endian?
We'll eventually run into that problem, once people want to use the code for
more than just on Android devices.
Maintaining the memory model checks also would help abstract plugins that just
need to know a specific memory model (32-bit or 64-bit, big-endian,
little-endian) but don't really care about the architecture.
Something to think about.
Original comment by Joe.Sylve@gmail.com
on 1 Jul 2012 at 1:37
Another thing to think about with this is arch-specific overlays. Right now we
are overlaying the x86 cpuinfo struct, but obviously that does not exist on ARM
so you get a warning about the struct missing from the vtype each time you run
a plugin against Android.
Original comment by atc...@gmail.com
on 1 Jul 2012 at 6:58
Hmm, I'm aware of several mixed archs scenarios:
1) The one Joe mentioned about ARM
2) Windows WOW64 (32 bit process AS on x64 kernel)
3) The OS X issue attrc mentioned on the [[MacMemoryForensics]] wiki about x64
processes on an x86 kernel
Original comment by michael.hale@gmail.com
on 2 Jul 2012 at 3:30
So I guess I'm a bit confused about the problem with this patch. I don't think
it's wise to depreciate memmodel with arch. It seems that in some places it
would be best to check against arch and in others memmodel (for instance x86
native types are the same as 32 bit ARM native types, etc). What do we need to
do to get this done properly?
Original comment by Joe.Sylve@gmail.com
on 3 Jul 2012 at 2:53
Sorry, we're all quite busy, but this needs thinking through before it's done,
because it appears to be mixing some elements together that shouldn't be mixed.
The Profile takes care of how the data's laid out in memory. The Address Space
takes care of how to read memory to start pulling data out of it. Why does the
profile (how you read data) need to be aware of which architecture/address
space it works on?
So I guess I'm asking how would the Debian-for-Arm-32bit be different to the
Debian-for-Intel-32bit? Why should a profile care whether it's Arm or Intel?
Why can't the address space figure out whether it works or not, rather than
checking to see which profile it's running as to decide if it might be valid?
The 32bit/64bit is for plugins to differentiate which one they're working on,
but at the moment, the sole use for the arch metadata appears to be to hide the
ARM AS from normal profiles because the code to check whether we're dealing
with an ARM space or not just doesn't work. If so, that's the bit that needs
fixing, if not then I think I could do with a better explanation of why.
As to the cpuinfo plugin, I'd hope that it's possible to rewrite that to use
sysmap, rather than relying on the profile to tell it whether it exists or not.
That information must surely already be in there? It should certainly be
possible to write a ProfileModification that doesn't even attempt to overlay
the cpuinfo if it isn't already present.
I don't want to rush into adding stuff to the metadata set because it's an easy
hack, I want it to be added after we've thought it over because it's necessary
and it's the right thing to do.
Original comment by mike.auty@gmail.com
on 3 Jul 2012 at 6:22
Mike,
It looks like this patch modifies the is_valid_profile() test in the address space to check what the profile is like. This is perfectly ok - an address space does need to check whether its compatible with the current profile (some address spaces need to use the profile to parse their internal data structures - e.g. crash, hiber etc - these do very similar tests to the profile albeit in their constructor instead of the is_valid_profile method).
Indeed I agree with you that the profile does not need to know about the
address space, but the opposite is happening here.
IMHO we need to have both pieces of metadata, since they are both important
(arch and memory model). The page translation mechanism needs to know which
hardware it is running on, information it can only get from the profile.
This only affects the address space voting mechanism to ensure we dont consider
an address space which is not compatible with the current profile.
Original comment by scude...@gmail.com
on 3 Jul 2012 at 11:55
So what happens with multi-arch profiles? It seems as though the actual data
structures should be identical between Arm and Intel for the linux kernel on
the same memory_model. Having two profiles hanging around just so one can say
Arm and one can say Intel seems like a waste.
Previously the page translation mechanism didn't need to know which hardware it
was running on, it ran its own tests (based on the memory_model) to determine
whether its page translation mechanism could be in use (for example,
Pae/NonPae). Now we seem to just not even be bothering doing proper AS
checking, but just deciding whether the user asked for it or not. I think the
right way of solving this is for the ArmValidAS to be improved to properly
differentiate between possible ASes, and they should then be ordered by
likelihood. Arm should not be the first AS we try, since it's very, very
unlikely to be the right one most of the time. This also means we need to make
sure our intel checks fail for most Arm systems if possible.
However, if this patch wants to go in, which it seems like everybody else is
suggesting it should, then at a minimum it needs to think about more than just
Arm/linux. Every address space should have an appropriate arch added, and
every profile that doesn't work with Arm needs to say so. This is not just a
minor change in isolation, metadata changes affect every profile and given
their use, every AS as well, hence the care that should be taken in planning
them.
@Joe, could you please mock up a new patch that takes into consideration the
whole codebase as opposed to just this one part, and then we can review it
again? This would all be much easier if there were a proper check in
ArmValidAS, but I'll commit it without given so many people think it's
necessary.
@attc, the ARM space has had far less testing, and I'm concerned it will push
the 2.2 release past OMFW if we try to merge it into trunk in the next
iteration. I'd like to get your opinion on branching from r1902, just before
the Arm commit?
Original comment by mike.auty@gmail.com
on 4 Jul 2012 at 7:33
I am not sure what you mean with multi-arch profile. A profile contains all you
need to parse data _on a specific arch_. Indeed in this patch there is a
difference between the architectures which the profile encapsulates (through
the system symbols).
The page translation mechanism must know what hardware it is running on (its
emulating hardware after all). Address spaces already perform tests which _are_
based on the profile. For example:
http://code.google.com/p/volatility/source/browse/branches/linux-trunk/volatilit
y/plugins/addrspaces/hibernate.py#81
The address space voting mechanism is only a convenience - whatever checking is
done can easily be bypassed, so we need to allow the user to specify it (that
is another problem though :-)
The whole point of the metadata concept is that it does not have to be present.
If other profiles do not export the arch method it can be defaulted or ignored
or whatever. If this specific address space is looking for a specific
architecture, this does not affect anyone else. I dont think this particular
patch should be modifying other unrelated address spaces, instead it should
have its order specified before all the other ones and specifically select an
arch metadata (which may not exist on other profiles). By doing this it should
win the voting round before the others. In that I disagree with your statement
that every profile which does not want to work with ARM should do something -
this patch should only affect the subsystems concerned with it without
modifying the entire code base.
Regarding you comment on testing. I agree whole heatedly. Are there any ARM
images we can test with that can be published?
Original comment by scude...@gmail.com
on 4 Jul 2012 at 8:32
What I meant by a multi-arch profile is a profile that is relevant to multiple
architectures. There is no inherent reason why a structure made up of four
integers should be any different on Arm than on Intel x86 (assuming the
endian-ness is not an issue). I disagree with you on what a profile holds, to
me it is a collection of structures necessary to parse data, and is
architecture independent. I would not expect to have to choose a different
profile if I was using PAE rather than NonPAE. I had rather hoped that 64-bit
systems would be able to share a profile with only a change of native_types,
but unfortunately the vtype system encodes specific relative offsets for each
item in a structure, making that impossible. I would still expect an amd64 AS
to be checked if the profile were 32bit, and succeed if it were indeed an amd64
space.
The test you pointed out for hibernation files is to stop the rest of the
plugin blowing up because a required structure isn't present. That structure
*could* be added to every profile if necessary, it would not prohibit the
address space running as the arch metadata would, because each profile could
only ever specify a single architecture! The example wasn't a metadata check,
it was a profile contents check. If the ARM space relied on certain structures
being present, I would expect it to check the profile for them.
I agree, the point of the metadata check is that it does not have to be
present. Unfortunately, I've seen branches with sloppy coding practices where
people attempt to directly address the variables, rather than using the
dictionary interface, so I'd prefer to provide defaults to save those coders
from themselves. Either way, it will fail open, so anywhere that a check
*doesn't* exist will succeed. When adding in such a fundamental property as
"arch", I'm not sure failing open is the behaviour we want.
Also, with the patch as it stands there is very little stopping an ARM profile
instantiating against an intel space save for the fact that ARM is ranked
higher than it. What happens when we add another space on top of that? We
can't keep adding metadata variables to help us choose which AS is used, the
ASes should take care of that as much as possible on their own.
If a profile were only good for one type of address space, then profiles should
have been built that way, but it's not the case (pae), so why put that limit on
ARM profiles specifically?
Original comment by mike.auty@gmail.com
on 4 Jul 2012 at 12:41
Mike, I fully agree with you that a profile should not generally care about the
address space it is instantiated on. This is not the actual problem right here.
The problem is that the address space requires the profile to have certain
symbols before it can be instantiated. i.e. the address space depends on the
profile.
This patch achieves this dependency indirectly via setting an arch metadata on
the profile (in order to abstract the actual check for the ARM arch), but
ultimately the ARM AS is declaring that it will only work if the profile is
suitable (i.e. contains the right symbols). In this way its the same as the
hibernation AS.
The rank problem you pointed to is inherent in every guessing system.
Essentially the problem of selecting address space is not related to the
problem of selecting the profile. Currently the user selects the profile, and
we guess the AS. We might get that guessing wrong. If the user could specify
the AS as well there will be no problem. Commonly the choice of the AS (in this
case ARM) is really synonymous with the choice of the profile which means
making the AS depend on the profile in this case is reasonable.
If the system erroneously instantiates an Intel AS for an ARM profile it means
the voting mechanism needs to be tuned - this will happen from time to time as
with every system that attempts to guess something.
I don't think the metadata system fails open - if the ARM AS checks like:
if profile.metadata.get("arch") == "arm":
this will not pass if arch is not defined, and does not require every profile
to define this metadata.
Original comment by scude...@gmail.com
on 4 Jul 2012 at 1:05
Great! Since we're in agreement, why don't we change the code to check the
profile for the symbols it actually requires which we both agree needs to
happen, rather than adding a new arch metadata variable which we don't agree
on, and that way everyone will be happy? 5:)
Original comment by mike.auty@gmail.com
on 4 Jul 2012 at 1:24
One comment...
"It seems as though the actual data structures should be identical between Arm
and Intel for the linux kernel on the same memory_model. Having two profiles
hanging around just so one can say Arm and one can say Intel seems like a
waste."
This isn't true. There are many arch-specific structures that are used. One of
them is the cpuinfo struct, which is x86_cpuinfo on Intel and something
completely different on ARM. It gets even worse when you try to recover
architecture specific features.
Original comment by atc...@gmail.com
on 6 Jul 2012 at 6:30
Original comment by mike.auty@gmail.com
on 29 Jul 2012 at 9:52
Okay, let's try this again.
This patch does add arch metadata to AbstractLinuxProfile, but it also contains
examples of why this is a good thing and is useful.
This allows arch specific overlays. For example, cpuinfo_x86 is overlayed, but
this structure obviously only exists on intel processors, so you get an error
every time you run a plugin on ARM. This fixes that.
This also fixes the naming of profiles. Android ARM profiles will now be
correctly named LinuxWhateverARM instead of LinuxWhateverx86.
I removed the is_valid_profile stuff for now, because that seemed a bit too
controversial, but I would like to revisit that discussion, because it does fix
the problems with voting between x86 and ARM, but I understand that voting is
going to be changing, so I'll wait and see.
Original comment by Joe.Sylve@gmail.com
on 25 Aug 2012 at 7:41
Attachments:
This issue was closed by revision r3359.
Original comment by labaru...@gmail.com
on 9 Apr 2013 at 5:49
Original issue reported on code.google.com by
Joe.Sylve@gmail.com
on 29 Jun 2012 at 2:13Attachments: