ycphs / openxlsx

openxlsx - a fast way to read and write complex xslx files
https://ycphs.github.io/openxlsx/
Other
226 stars 75 forks source link

Assign Sensitivity Labels #461

Open AdamGaffney96 opened 10 months ago

AdamGaffney96 commented 10 months ago

Is your feature request related to a problem? Please describe. In my work we have a lot of systems set up to print workbooks using openxlsx in R, since R is where we do most of our data analysis. Recently our work introduced Microsoft sensitivity labels. The issue we have is that many of these documents are used by many different people and systems, and are produced sometimes multiple times a day. Plus the datasets can be massive, resulting in large Excel sheets. Unfortunately it seems that, perhaps due to how the saveWorkbook function works, persistent workbooks are essentially deleted and then created again, meaning that the sensitivity label does not persist through this save, and needs set every time.

Describe the solution you'd like The ideal solution for me would be either a function added to the package, or an option added to the saveWorkbook function which allows the setting of a sensitivity label. I'm not too bothered if this is using siteID and labelID like the above VBA code, or if you can copy it from another workbook or something like that but it would be much appreciated if something could be added to allow these to be set at the R level. Hopefully this is possible!

Describe alternatives you've considered I have set up some VBA macros to automatically set sensitivity labels for certain persistent workbooks that are created regularly (see this SO post). However this involves opening the workbook, copying the site and sensitivity label ID to that workbook then saving and closing again. Since our documents are large, and are hosted on a central server, it can take upwards of 10-15 minutes to set these labels, and this needs to be done multiple times per day. This is causing a lot of time wasted during the day. Unfortunately the team that rolled this out have no desire to remove the sensitivity labels, and it also seems like they can't come up with a better solution than the one we have implemented so far. I think this would be the best for us if possible since the actual adding of a sensitivity label takes a couple seconds max - the time wasted is from having to open and load, then close and resave the workbooks every time.

Additional context I think sensitivity labels are still quite new so might not be something you're aware of or have used, but hopefully there can be a solution. I'm happy to provide more details if required.

Thanks, Adam

JanMarvin commented 10 months ago

Hi @AdamGaffney96 , I've had a quick look. Apparently there is some kind of custom xml label attached with sensitivity labels. Though I couldn't actually verify this, because it looks like sensitivity labels are some kind of corporate snake oil and I have no access to a corporate MS365 account. You could have a look if it works with openxlsx2, we should already support this kind of custom file properties.

AdamGaffney96 commented 10 months ago

Hi @AdamGaffney96 , I've had a quick look. Apparently there is some kind of custom xml label attached with sensitivity labels. Though I couldn't actually verify this, because it looks like sensitivity labels are some kind of corporate snake oil and I have no access to a corporate MS365 account. You could have a look if it works with openxlsx2, we should already support this kind of custom file properties.

Yeah unfortunately it was something suddenly sprung on us, with no testing that we now have to deal with, pretty standard stuff for this kind of feature. I'll give openxlsx2 a look on Monday and see if I can maybe make it work! It might be something I can share here for anyone else that has this issue. Which functions within openxlsx2 would be used for setting any custom xml? I can't seem to find it myself, and the wb_set_properties() function seems to only apply fixed ones such as title, author etc.

JanMarvin commented 10 months ago

I'd start with the following. Save a blank file in Excel. Try to modify it with openxlsx2 and see if the sensitivity label remains intact.

wb_load("file.xlsx")$add_data(x = mtcars)$open()

If it works, see if you can do the following:

wb <- wb_workbook()$add_worksheet()
# move the custom.xml part over
wb$custom <- wb_load("some.xlsx")$custom
# add it to the `[Content_Types].xml`
wb$append(
  "Content_Types",
  "<Override PartName=\"/docProps/custom.xml\" ContentType=\"application/vnd.openxmlformats-officedocument.custom-properties+xml\"/>"
)

if(interactive()) wb$open()

If it works, let me know and we could further investigate and create a nicer API for this in openxlsx2.

PS: I have a file with custom properties that look like this (various property nodes), but like I've said, to me it opens just like a normal xlsx file.

<property fmtid="{...}" pid="..." name="MSIP_Label_...">
<vt:lpwstr>...</vt:lpwstr>
</property>
retodomax commented 7 months ago

It works! If I run your code above, it opens a temp xlsx file with the correct Label.

Would be great to have a function in openxlsx2 which makes this easier.

Do I understand correctly that "some.xlsx" in the code above has to be a template excel file?

JanMarvin commented 7 months ago

Hi @retodomax , thanks for the confirmation! Yes, exactly, test.xlsx is a template that contains the required custom.xml content. But it would be possible to extract it once and save it as a string in a file, so you dont have to load a file every time you want to create an xlsx file. I can add such a function toopenxlsx2`, but as I said, I have no way to test if it works. My company does not use this feature and I cannot enable it for my personal MS365 account. Therefore third party testing is required.

retodomax commented 7 months ago

Hi @JanMarvin, thanks for the explanation. I'm happy to test it and provide you the output, if it helps.