webhintio / hint

💡 A hinting engine for the web
https://webhint.io/
Apache License 2.0
3.62k stars 670 forks source link

Add support for ASP.NET Core's URL fingerprinting syntax #3794

Open madskristensen opened 4 years ago

madskristensen commented 4 years ago

🚀 Feature request

ASP.NET Core uses a URL parameter to store the file hash, like this:

/file.js?v=JW_qcvnsPVjH-EA6ctfy9uy8HGXBjPIwokOhCirSWAE

It would be great to have native support for ASP.NET Core. Right now, ASP.NET Core apps get errors due to having a different fingerprinting/file revving pattern.

I've made a regex that should work. It is (\?|&)v=[\w-]+

It will identify the pattern for the following URL strings that I've tested it with:

/extensions/fcc-f339fe606-9d51-4fca-895c-d50375137b62/icon.png?v=JW_qcvnsPVjH-EA6ctfy9uy8HGXBjPIwokOhCirSWAE
/test/icon.png?v=JW_qcvnsPVjH-EA6ctfy9uy8HGXBjPIwokOhCirSWAE&hat=4
/test/icon.png?hat=4&v=JW_qcvnsPVjH-EA6ctfy9uy8HGXBjPIwokOhCirSWAE
/test/icon.png?hat=4&v=JW_qcvnsPVjH-EA6ctfy9uy8HGXBjPIwokOhCirSWAE&ost=4
/manifest.json?v=1

Let me know if I can help in any way.

/cc @danroth27 @davidfowl

molant commented 4 years ago

Hi @madskristensen,

From the hint documentation we link to this article.

Even though it's from 2008, I think it's still valid. That said, it doesn't take into account that https traffic will (should?) not have this problem. At the same time, I don't think we should be checking for https or not in this rule as that could provide conflicting results when testing locally or in production. And even if this only affects a small percentage of the web, it could still be millions of users...

@Malvoz what do you think? @antross is in paternity leave so don't think he will look at this until his back. Is there any data (http-archive?) that we could look at to help us make a decission?

Malvoz commented 4 years ago

The advice in the article is fairly outdated at this point in time. Squid updated their default behavior a little over 10 years ago (source: http://www.squid-cache.org/Versions/v2/2.7/RELEASENOTES.html#s1):

The default rules to not cache dynamic content from cgi-bin and query URLs have been altered. Previously, the "cache" ACL was used to mark requests as non-cachable - this is enforced even on dynamic content which returns cachability information. This has changed in Squid-2.7 to use the default refresh pattern. Dynamic content is now cached if it is marked as cachable. You should remove the default configuration lines with QUERY (acl, and cache) and replace them with the correct refresh_pattern entries.

If other proxies had the same behavior but didn't change even due to the commotion 10 years ago, then I don't think they're very reliable proxies either way.

Malvoz commented 4 years ago

I wasn't able to find any existing queries relating to URL query strings in the HTTP archive :/

If we look at it like this: Google Fonts is probably one of the most used web services that utilizes query strings in their URLs (https://fonts.googleapis.com/css?family={..}), if query string URLs weren't cachable in enough proxies then wouldn't we have heard from developers for years how fonts weren't being cached?

molant commented 4 years ago

I believe Google Fonts don't work correctly in China (or at least I had issues with this in the past). My main concern is giving an advice that could make the experience worse in countries where the Internet infrastructure might be different to what we are used to.

madskristensen commented 4 years ago

@molant I don't believe that had anything to do with URL parameters. Just to be clear, my suggestion here is not that webhint should advice people to put fingerprints in URL parameters, but just recognize that it is a valid and functioning way to fingerprint - such that it won't call it out as missing because it is an unrecognized pattern.

hxlnt commented 4 years ago

We chatted about this this morning, and we're thinking we can downgrade this warning to a hint to mitigate the concern while still surfacing the message for those who would like to see it. Thoughts?

madskristensen commented 4 years ago

@hxlnt Would that deprive users from getting the satisfaction of getting to green/clean?

molant commented 4 years ago

It will depend on what the user has configured as the minimum threshold. IIRC the default one for the browser extension is warning so it will not show up unless they change it.

madskristensen commented 4 years ago

@molant in that case it sounds great IMHO.

molant commented 4 years ago

This might take a bit because of other priorities but if you want to do a PR we will be happy to help you along the way!

madskristensen commented 4 years ago

Sure, can you point me to the right file(s)?

molant commented 4 years ago

Sure. The file to modify would be this one.

The idea will be to detect the finger printing syntax after cacheRevvingPatterns fails. If it is detected then change the severity from Severity.warning to Severity.hint a couple lines later from where I pointed.

Ideally there should be a new test here as well 😊

Hopefully the Contributor Guide is clear enough. If it isn't please let us know so we can address that (you can file a new issue or post a comment here and we will open it).

Thank you!

molant commented 4 years ago

Hi @madskristensen

did you have the time to look into this?

madskristensen commented 4 years ago

@molant sorry, life happened, and I never got around to doing this. I won't be able to do this in the foreseeable future.

madskristensen commented 4 years ago

@molant sorry, life happened, and I never got around to doing this. I won't be able to do this in the foreseeable future.