Matmaus / LnkParse3

Windows Shortcut file (LNK) parser
MIT License

support decoding of Darwin blocks #13

Closed japm48 closed 3 years ago

japm48 commented 3 years ago

Some MSI-generated shortcuts are shown in the explorer properties window with the location greyed out.

Example: example.zip


Apparently, these are called "advertised shortcuts", and the actual link location is derived from the DarwinDataUnicode field in the DARWIN block. This field has the format GUID (product code), string (component id), GUID (feature name), where the GUIDs are Base85-encoded.
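To illustrate the Base85-encoded GUIDs mentioned above, here is a minimal sketch of how a 20-character compressed GUID could be decoded and re-encoded. It is not the LnkParse3 implementation. The 85-character alphabet and the layout (four groups of five characters, least significant digit first, each group one little-endian 32-bit word of the GUID) are assumptions taken from open-source MSI reimplementations, and the function names are hypothetical.

```python
import struct
import uuid

# ASSUMPTION: alphabet as used by open-source MSI reimplementations.
# '<' and '>' are not part of it, so they stay free for use as separators.
ALPHABET = (
    "!$%&'()*+,-.0123456789=?@ABCDEFGHIJKLMNO"
    "PQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwx"
    "yz{}~"
)
DECODE = {ch: index for index, ch in enumerate(ALPHABET)}


def decode_darwin_guid(text: str) -> uuid.UUID:
    """Decode a 20-character compressed GUID (hypothetical helper)."""
    if len(text) != 20:
        raise ValueError("a compressed GUID must be exactly 20 characters")
    words = []
    for start in range(0, 20, 5):
        value = 0
        # Each group of five characters is one 32-bit word,
        # least significant base-85 digit first.
        for power, ch in enumerate(text[start:start + 5]):
            if ch not in DECODE:
                raise ValueError(f"invalid character {ch!r}")
            value += DECODE[ch] * 85 ** power
        if value > 0xFFFFFFFF:
            raise ValueError("group does not fit into 32 bits")
        words.append(value)
    # The four words form the GUID in its little-endian memory layout.
    return uuid.UUID(bytes_le=struct.pack("<4I", *words))


def encode_darwin_guid(guid: uuid.UUID) -> str:
    """Inverse of decode_darwin_guid (hypothetical helper)."""
    chars = []
    for word in struct.unpack("<4I", guid.bytes_le):
        for _ in range(5):
            word, digit = divmod(word, 85)
            chars.append(ALPHABET[digit])
    return "".join(chars)
```

A quick sanity check is a round trip: encode_darwin_guid(decode_darwin_guid(s)) should return s for any valid 20-character input.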

I would like to request automatic decoding of this field, if possible.

More info:
http://www.laurierhodes.info/?q=node/34
http://metadataconsulting.blogspot.com/2019/12/CSharp-Convert-a-GUID-to-a-Darwin-Descriptor-and-back.html
https://web.archive.org/web/20080323160816/http://support.microsoft.com/kb/243630

Matmaus commented 3 years ago

Hi, thank you for the good description and all the resources you provided :+1:. I tried to implement the requested feature, and it seems to work. At least product_code_id matches the id in data.icon_location. I am not sure about component_id, but it is probably correct as well. Can you verify it somehow? You can install the package from the improvement-darwin-block branch using pip install git+https://github.com/Matmaus/LnkParse3.git@improvement-darwin-block if you want to test it.

Also, can I add the sample you provided into the test dataset?

japm48 commented 3 years ago

Thanks for the quick response and implementation! And sorry for my late response.

This is a mostly unused and undocumented feature, so it took me quite some time to find info and (more or less) understand it.

The sample file is from a proprietary commercial program, so I think it is better not to include it in the tests and to use something else instead. I have been playing with WiX and MSI installers to create a simple example: you can download it here and use it directly (no need for attribution, etc.).

Also, I didn't find a way to generate a non-empty component_id and couldn't find anything that uses it (apparently some old MS Office version, which I'm not willing to test), so I suppose it is correct as it is now.

Matmaus commented 3 years ago

Thanks for the new test sample. I will use it in the test suite, together with a modified version in which I replaced the Darwin value with the one from https://metadataconsulting.blogspot.com/2019/12/CSharp-Convert-a-GUID-to-a-Darwin-Descriptor-and-back.html (see commit). All new fields are parsed as shown on that page.

I have modified the implementation to be stricter. The new test sample (the one you provided) does not even have a name, and the terminator it uses seems to be < instead of >. Is that valid? The current implementation will consider both feature_name and component_id empty and will raise a UserWarning about it.

Since the current implementation can parse the Darwin block as described in the provided materials, I think I can merge the PR, release a new version, and close the issue. I will deal with future issues (e.g. a list of possible separators/terminators) as they come up (if they do). Do you agree?

japm48 commented 3 years ago

Oh, I didn't realize it was empty! (I thought the product code and component were the same).

I just found this: https://community.broadcom.com/symantecenterprise/viewdocument/working-with-darwin-descriptors

In summary, there is one more detail: if there is no last field it ends in < instead of using > as separator.

If I find the time, I'll add more varied examples.

Edit: also, I realized that in the last site I found, product codes and component ID are swapped. So one source has to be right and the other wrong.
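The terminator rule described above could be sketched like this. It is only an illustration under stated assumptions, not the code in LnkParse3: the function and field names are hypothetical, and the fields are named by position only, because the sources disagree on which GUID is the product code and which is the component ID. A field that is absent is returned as None, and an unrecognized layout triggers a UserWarning.

```python
import warnings
from typing import NamedTuple, Optional

GUID_LEN = 20  # length of one Base85-encoded GUID


class DarwinParts(NamedTuple):
    leading_guid: str             # first 20 characters, still encoded
    middle: Optional[str]         # text between the GUID and the terminator
    trailing_guid: Optional[str]  # 20 characters after '>', if present


def split_darwin_descriptor(descriptor: str) -> DarwinParts:
    """Split a descriptor into positional fields (hypothetical helper)."""
    leading, rest = descriptor[:GUID_LEN], descriptor[GUID_LEN:]
    if rest.endswith("<"):
        # '<' terminator: there is no last field.
        return DarwinParts(leading, rest[:-1] or None, None)
    middle, separator, trailing = rest.partition(">")
    if separator and len(trailing) == GUID_LEN:
        # '>' separator: a second encoded GUID follows.
        return DarwinParts(leading, middle or None, trailing)
    # ASSUMPTION: anything else is treated as unrecognized.
    warnings.warn("unrecognized Darwin descriptor layout", UserWarning)
    return DarwinParts(leading, None, None)
```

For a descriptor that consists of a 20-character GUID followed only by <, both middle and trailing_guid come back as None without a warning.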

Matmaus commented 3 years ago

Oh nice, this post looks pretty useful :+1: .

I have changed the implementation, and now it warns the user only if the value does not follow any of the four described formats. If the format in use does not carry a value for a field, that field will have no value (None).

> Edit: also, I realized that in the last site I found, product codes and component ID are swapped. So one source has to be right and the other wrong.

Since all the pages you provided are consistent on this, I consider it the right way (but it is good to know for the future).

> If I find the time, I'll add more varied examples.

I would be grateful if you could provide some test examples in the future.

Thank you, do you agree with finishing the issue now?

japm48 commented 3 years ago

> Thank you, do you agree with finishing the issue now?

Sure! And thanks a lot! I was about to try to implement a link parser myself until I found this on PyPI.