Open msuhanov opened 8 years ago
Good catch.
I see that there are two modifiers that we should be taking into account:
#define DEVPROP_TYPEMOD_ARRAY 0x00001000
#define DEVPROP_TYPEMOD_LIST 0x00002000
I'm not quite sure how these affect parsing yet, but I'm sure we can work that out to support lists and arrays of the other basic types (int, string, timestamp, etc).
To do this, we'd also need to check the "type modifier" by using DEVPROP_MASK_TYPEMOD
:
#define DEVPROP_MASK_TYPEMOD 0x0000F000
With this approach, we'd be inspecting the lower two bytes of the DWORD. Do you know what is placed in the higher two bytes, and if we can parse it? Are 0xFFFF0012 and 0x00000012 both DEVPROP_TYPE_FILETIME
? If so, this is an easy fix. If not, how do we resolve the type?
In any case, I suggest that we update VKRecord.data_type()
to at least use the mask 0xFFFF (which is DEVPROP_MASK_TYPEMOD | DEVPROP_MASK_TYPE
), and potentially returning the unmasked value (if there's no other option).
Do you know what is placed in the higher two bytes, and if we can parse it?
On Windows 10, when storing device properties, these bytes are always set to 0xFF. But, as I said before, the higher two bytes in other keys can be used by something else.
Ok, then I propose updating VKRecord.data()
, VKRecord.data_type()
and VKRecord.data_type_str()
to handle the common case by masking the type down to the lower two bytes (using DEVPROP_MASK_TYPEMOD | DEVPROP_MASK_TYPE
), and adding a method VKRecord.raw_data_type()
that returns the unmasked data type.
The benefit is that this allows an advanced user to parse unusual structures manually using VKRecord.raw_data_type()
and VKRecord.raw_data()
, while keeping the common case very simple (if vk.data_type == RegMultiSZ: ...
).
An issue is that most developers won't realize this could happen, and will assume that VKRecord.data_type()
returns perfectly correct data. Clear documentation on these methods will help, but won't totally remove this issue.
I'm trying to balance ease-of-use of the module with providing the most correct data. As a developer, what would you most like to see?
Sounds good.
As a developer, what would you most like to see?
I think that the best solution is to introduce the data_type_devprop()
method and make the data_type()
method return a raw value type. Any other custom value types will have the corresponding data_type_*
methods. But this breaks backward compatibility.
+1 on the issue - I just ran into 0x00002012 today (along with 0x00000011 and 0x00000012, which python-registry does not handle).
@geoffblack having read the above discussion, do you have any opinions on how the API could be enhanced?
An application (or a driver) can specify any value type possible. You may also find this post interesting.
I think introducing separate methods per data type set is the wrong thing to do, and not just for backwards compatibility. In the short term, I like fixing the mask and adding raw_data_type() - it gets what I need as the user to do my own decoding with raw_data(), and it's a quick fix.
In the long term, you could build profiles of data type definitions, where data() takes a parameter for the data type set. The default would be standard registry data types already included in python-registry, but you could add other data type lists like devprop_type* from devpropdef.h over time, and the necessary code to support decoding those data types. The user would still have to know which data type set to use in certain areas of the registry, but it would make it easier for the user.
@williballenthin I've just read through raw_data() and realized your implementation there is problematic for this, as well. Take, for example, the case of RegMultiSZ (0x07). In the DevPropType list 0x00000007 is a uint32. I have not tested this, but there's a case in there where you're not actually returning the raw data (!!!), but instead returning and empty byte string because, "it must be 4, and be composed of completely \x00, so the strings are empty." So, I'm not sure raw_data_type() is a quick fix if raw_data() can't be relied upon with non-default data types.
Just to follow up - I do indeed have a registry value with 4 bytes of data where raw_data()
returns an empty byte string due to using the wrong data type as described above.
acknowledged. sounds like i need to sit down one weekend and implement some of these approaches to figure out what works best.
is there any chance you can share the hive? i understand if you cannot.
You can find samples of these values in the SANS Donald Blake Windows 8.1 image in various keys under Control\DeviceContainers
. If you don't have that lying around, shoot me an email.
thanks!
Is it possible to get the image? Not just registry files.
Hello.
At present, the
data_type()
method of theVKRecord
class applies theDEVPROP_MASK_TYPE
(0x00000FFF) bit mask to a registry value type using the AND operation, and thus clears bits#12
-#31
in this value type. I know that the purpose of this operation is to extract theDEVPROP_TYPE_FILETIME
(0x00000010) type (you call itRegFileTime
), but there is a major issue here:DEVPROP_TYPE_FILETIME
),DEVPROP_MASK_TYPE
bit mask when parsing a value type.Based on this, the library can't silently clear the bits in a value type. For example, the PnP subsystem is using the
DEVPROP_TYPE_STRING_LIST
(0x00002012) type to store device location paths in the registry (this type will be converted to following value: 0xFFFF2012). Reading such a type withpython-registry
will return 18 (0x12), orDEVPROP_TYPE_STRING
, if we resolve this constant in the context of device properties, but this value is wrong, because it doesn't match the value stored by the subsystem.