pbowden-msft / MAUCacheAdmin

Microsoft AutoUpdate Cache Admin

Inconsistent Last-Modified and ETag headers #39

Closed: nixtar closed this issue 1 year ago

nixtar commented 1 year ago

Hello,

Sorry if this is not the right repo to log this. Happy to log a Premier Support Case with MS if that's more appropriate.

We are seeing an issue where a file's response headers change very often (every few seconds).

For example, to reproduce using PowerShell:

$uri = "https://officecdnmac.microsoft.com/pr/C1297A47-86C4-4C1F-97FA-950631F94777/MacAutoupdate/Microsoft_Excel_16.30.19101301_Updater.pkg"
$webRequest = Invoke-WebRequest -Uri $uri -Method Head -UseBasicParsing
$webRequest.Headers

#Key                       Value                                                                                                                       
#---                       -----                                                                                                                       
#ApiVersion                Distribute 1.1                                                                                                              
#Content-Disposition       attachment; filename=Microsoft_Excel_16.30.19101301_Updater.pkg; filename*=UTF-8''Microsoft_Excel_16.30.19101301_Updater.pkg
#X-Azure-Ref               01NMWYwAAAACYwdL1QsBoSr6JkQGl467sQ0hHRURHRTE1MDgAY2VmYzI1ODMtYTliMi00NGE3LTk3NTUtYjc2ZDE3ZTA1Zjdm                           
#X-Cache                   TCP_HIT from a203-9-184-5.deploy.akamaitechnologies.com (AkamaiGHost/10.9.4-44125806) (-)                                   
#Connection                keep-alive                                                                                                                  
#Akamai-GRN                0.05b809cb.1663903361.efc4a3                                                                                                
#Akamai-Cache-Status       Hit from child                                                                                                              
#Strict-Transport-Security max-age=15768000 ; includeSubDomains                                                                                        
#Accept-Ranges             bytes                                                                                                                       
#Content-Length            816040840                                                                                                                   
#Cache-Control             public, max-age=259200                                                                                                      
#Content-Type              application/octet-stream                                                                                                    
#Date                      Fri, 23 Sep 2022 03:22:41 GMT                                                                                               
#ETag                      "0x108CBCE87986A86D7E1FF5388091F8D93A750ECE1881D076087F7AA0032506CC"                                                        
#Last-Modified             Tue, 15 Oct 2019 17:36:50 GMT

# A few seconds later:
$uri = "https://officecdnmac.microsoft.com/pr/C1297A47-86C4-4C1F-97FA-950631F94777/MacAutoupdate/Microsoft_Excel_16.30.19101301_Updater.pkg"
$webRequest = Invoke-WebRequest -Uri $uri -Method Head -UseBasicParsing
$webRequest.Headers

#Key                       Value
#---                       -----
#Content-Disposition       attachment; filename=Microsoft_Excel_16.30.19101301_Updater.pkg
#Akamai-GRN                ,,,0.0db809cb.1663903368.5d6e0ff
#X-Cache                   TCP_HIT from a203-9-184-13.deploy.akamaitechnologies.com (AkamaiGHost/10.9.4-44125806) (-)
#Connection                keep-alive
#Akamai-Cache-Status       Hit from child
#Strict-Transport-Security max-age=15768000 ; includeSubDomains
#Accept-Ranges             bytes
#Content-Length            816040840
#Cache-Control             public, max-age=259200
#Content-Type              application/octet-stream
#Date                      Fri, 23 Sep 2022 03:22:48 GMT
#ETag                      "ad649b9fa882d51:0"
#Last-Modified             Mon, 14 Oct 2019 16:01:22 GMT
#Server                    Microsoft-IIS/10.0
#X-Powered-By              ASP.NET 

When I saw MAUCacheAdmin redownloading files on every run, I had the idea that storing the ETag between runs might be a better approach, but as you can see, the ETag also changes.
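
To make it easier to see which validators differ between two such responses, here is a minimal sketch. `Compare-ResponseHeader` is a hypothetical helper name, not part of MAUCacheAdmin, and the default list of headers to compare is just the ones discussed in this issue.

```powershell
# Hypothetical helper: compare selected headers from two responses.
# Accepts any dictionary, e.g. the .Headers property of an Invoke-WebRequest result.
function Compare-ResponseHeader {
    param(
        [Parameter(Mandatory)] [System.Collections.IDictionary] $First,
        [Parameter(Mandatory)] [System.Collections.IDictionary] $Second,
        [string[]] $Name = @('ETag', 'Last-Modified', 'Content-Length')
    )
    foreach ($header in $Name) {
        # Header values may be a single string or a collection of strings,
        # so normalise both cases to one joined string (or $null if absent).
        $a = if ($First.Contains($header))  { @($First[$header])  -join ', ' } else { $null }
        $b = if ($Second.Contains($header)) { @($Second[$header]) -join ', ' } else { $null }
        [pscustomobject]@{
            Header  = $header
            First   = $a
            Second  = $b
            Matches = ($a -ceq $b)   # case-sensitive, since ETags are opaque strings
        }
    }
}

# Usage (two HEAD requests against the same URI, a few seconds apart):
# $r1 = Invoke-WebRequest -Uri $uri -Method Head -UseBasicParsing
# $r2 = Invoke-WebRequest -Uri $uri -Method Head -UseBasicParsing
# Compare-ResponseHeader -First $r1.Headers -Second $r2.Headers | Format-Table
```

With the two responses above, this would flag ETag and Last-Modified as mismatched while Content-Length matches.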

Cheers, Nick

pbowden-msft commented 1 year ago

Hi @nixtar, interesting find. It looks like the nodes at Akamai are slightly different. You'll notice from the X-Cache header that these two requests are being served by different Akamai nodes. However, there's no logical reason why Last-Modified should be different.

Although the headers are irregular, I'm surprised that you're seeing issues in the MAUCacheAdmin script itself, as it primarily leverages the Content-Length header to figure out whether the locally cached pkg is up to date. Can you confirm that you're actually seeing an issue with the latest published MAUCacheAdmin script? Note that there was a recent update to the header parsing because the Teams package was always being reported as corrupt. Make sure you have the latest version installed.
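
As a rough illustration of a Content-Length based freshness check, here is a minimal PowerShell sketch. It is not MAUCacheAdmin's actual code; `Test-CachedPackageSize` is a hypothetical name, and the sketch assumes the cached copy counts as current only when its size on disk equals the remote Content-Length.

```powershell
# Hypothetical helper: is the cached file the same size as the CDN reports?
function Test-CachedPackageSize {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [long]   $ExpectedLength
    )
    # A missing file is never up to date.
    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) { return $false }
    return ((Get-Item -LiteralPath $Path).Length -eq $ExpectedLength)
}

# Usage (assumes $uri and $localPkg are already set):
# $head   = Invoke-WebRequest -Uri $uri -Method Head -UseBasicParsing
# $length = [long]($head.Headers['Content-Length'] | Select-Object -First 1)
# if (-not (Test-CachedPackageSize -Path $localPkg -ExpectedLength $length)) {
#     Invoke-WebRequest -Uri $uri -OutFile $localPkg -UseBasicParsing
# }
```

A size comparison like this is unaffected by the ETag and Last-Modified differences shown above, because Content-Length is identical in both responses.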

nixtar commented 1 year ago

Ahh, my bad. I had an issue in the past where a file on the CDN changed without its length or name changing, and MAU refused to use the cached copy (I assume it hashes the file?). I added a date-modified check to my own local fork and have kept merging it with the latest changes since. It had slipped my mind when I logged this.

It's interesting that it's only now started being an issue. I wonder if whichever Akamai peers we are hitting are having some issues. I've disabled the date-modified checks for now. It would be cool if the file hash could be included in the collateral XML so that a kind of self-healing cache function could be written. Is that what the CAT files are for?
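
A self-healing check along those lines might look like the sketch below. This is purely hypothetical: the collateral XML does not currently carry a hash, so `$ExpectedSha256` stands in for a value that would have to come from somewhere, SHA-256 is an assumed choice of algorithm, and `Test-CachedPackageHash` is a made-up name.

```powershell
# Hypothetical helper: does the cached file match a known-good SHA-256 hash?
function Test-CachedPackageHash {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [Parameter(Mandatory)] [string] $ExpectedSha256
    )
    # A missing file cannot be valid.
    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) { return $false }
    $actual = (Get-FileHash -LiteralPath $Path -Algorithm SHA256).Hash
    # -eq is case-insensitive for strings, so upper/lower-case hex both work.
    return ($actual -eq $ExpectedSha256.Trim())
}

# Usage (assumes $uri, $localPkg and $publishedHash are already set):
# if (-not (Test-CachedPackageHash -Path $localPkg -ExpectedSha256 $publishedHash)) {
#     Remove-Item -LiteralPath $localPkg -ErrorAction SilentlyContinue
#     Invoke-WebRequest -Uri $uri -OutFile $localPkg -UseBasicParsing
# }
```

Unlike a size or date check, this would catch a file whose content changed on the CDN without its length or name changing.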

For reference we have over 1300 remote sites each with their own MAU Cache.

pbowden-msft commented 1 year ago

@nixtar Thanks for letting me know - makes sense. There are some back-end optimizations going on with the CDN and Origin right now where we're moving to a different configuration, so while the problem you saw is unexpected, it doesn't totally surprise me. Yes, the CAT files are standard Windows Security Catalog files - they contain the filenames and hashes of each PKG and of the XML itself. MAU uses these to reduce the possibility of MitM attacks. While they're very difficult to parse on a Mac, you can open the CAT file on a Windows machine and see its contents. Good point about putting the SHA256 hash of the PKG in the XML itself. I'll add that one to the wish list! I'm blown away by your setup of 1300 remote sites - that's awesome! If you're willing, I'd love to chat with you via conference call to discuss why and what you've got there - I think it would generate some new ideas for how we do things here. You can contact me via email - pbowden at microsoft dot com