chocolatey / package-validator

Windows service to validate packages conform to package standards
Apache License 2.0
31 stars 29 forks source link

URL Validation is failing for some valid URLs #200

Closed gep13 closed 4 years ago

gep13 commented 4 years ago

Examples of packages/URLs that are failing are...

chtof commented 4 years ago

Package: https://chocolatey.org/packages/jtalert/2.15.6 Nuspec: https://github.com/chtof/chocolatey-packages/blob/master/automatic/jtalert/jtalert.nuspec

"In the ReleaseNotes element in the nuspec file an invalid Url is found. Please correct this More...The LicenseUrl element in the nuspec file should be a valid Url. Please correct this More...The ProjectUrl element in the nuspec file should be a valid Url. Please correct this More..."

URLs: ReleaseNotes: https://hamapps.com/php/notes.php?file=jtalert LicenseUrl: https://hamapps.com/php/license.php ProjectUrl: http://hamapps.com/

AdmiringWorm commented 4 years ago

The validator believes that the bugTrackerUrl isn't a valid url in the package activepresenter

The url specified is valid though (both from web browser, and from powershell): https://talk.atomisystems.com/

AdmiringWorm commented 4 years ago

Another one that got affected.

The docsUrl on the mpc-hc-clsid2 package is also being detected as invalid. I assume this is happening because of the cloudflare check that occurs before actually being navigated to the page (fails in powershell, but succeed in web browser).

gep13 commented 4 years ago

@AdmiringWorm I am wondering if the mpc-hc-clsid2 is a separate issue. For that one, the response from CloudFlare is a 503 initially, and then a meta refresh/redirect of the page happens if found to be valid. That would be tricky to validate in the simple check that we are doing. However, I am wondering if a 503 response in the context of these validations is actually a valid response. What are your thoughts?

mkevenaar commented 4 years ago

@gep13 503 = Service Unavailable. This could be temporary or more permanent. There is no way to tell if it is correct.

gep13 commented 4 years ago

@mkevenaar said... 503 = Service Unavailable. This could be temporary or more permanent. There is no way to tell if it is correct.

The same could be said about a valid response 😄

The point I was trying to make was that it wasn't a 404 not found, so "something" is there, it just isn't working correctly just now.

gep13 commented 4 years ago

@chtof @AdmiringWorm I have found at least "part" of the problem. Going to raise a PR to address this just now...

mkevenaar commented 4 years ago

@gep13 said... The point I was trying to make was that it wasn't a 404 not found, so "something" is there, it just isn't working correctly just now.

What if we would allow all but 4xx status codes?

gep13 commented 4 years ago

@mkevenaar said... What if we would allow all but 4xx status codes?

Yip, that is what I am thinking, but again, I think this might be separate issue. Can I ask you to raise one for discussion?

gep13 commented 4 years ago

@chtof @AdmiringWorm @mkevenaar the PR in question is here: https://github.com/chocolatey/package-validator/pull/201

This "fixes" the issue for the URL that @AdmiringWorm provided, i.e. https://talk.atomisystems.com/, however, the URL's that @chtof provided are still causing an error to be returned...

image

I am wondering if something like the Agent needs to be set in order for the request to go through. Going to continue to investigate this.

gep13 commented 4 years ago

@chtof @AdmiringWorm @mkevenaar I have added another commit to the PR that I created. This adds the UserAgent header to the requests, which makes the examples that @chtof provided work during testing.

AdmiringWorm commented 4 years ago

@AdmiringWorm I am wondering if the mpc-hc-clsid2 is a separate issue. For that one, the response from CloudFlare is a 503 initially, and then a meta refresh/redirect of the page happens if found to be valid. That would be tricky to validate in the simple check that we are doing. However, I am wondering if a 503 response in the context of these validations is actually a valid response. What are your thoughts?

Yeah, maybe it should be considered a separate issue, but not completely sure though. In general 503 errors should not be considered a valid response, but in cases like these it probably would make sense to allow it but make it a Note for maintainers so they can verify that the url is actually working or not.

gep13 commented 4 years ago

@AdmiringWorm the way that the validator rules engine works doesn't allow for "something" to be either a requirement or a note. It either meets the requirement or it doesn't. Would need to put some thought into how this would work, if we needed to go that route.

mkevenaar commented 4 years ago

Package: https://chocolatey.org/packages/opera-developer/67.0.3554.0 Nuspec: https://github.com/mkevenaar/chocolatey-packages/blob/master/automatic/opera-developer/opera-developer.nuspec

URLs: ReleaseNotes: https://blogs.opera.com/desktop/changelog-for-67/#b3541.0

gep13 commented 4 years ago

@chtof @AdmiringWorm @mkevenaar I am going to go ahead and close this out just now, as I believe the two main problems in this area have been addressed. I have some other refactoring to do for URL validation in general, and once that is done, I will deploy the new version of package validator, and re-run validation on the packages that I know have been affected.

chtof commented 4 years ago

@gep13 I have read carefully your previous post. So this post shouldn't help and it's just FYI: https://chocolatey.org/packages/chirp/2020.01.03 https://chocolatey.org/packages/chirp.install/2020.01.03 https://chocolatey.org/packages/chirp.portable/2020.01.03 In the ReleaseNotes element in the nuspec file an invalid Url is found. Please correct this More... The DocsUrl element in the nuspec file should be a valid Url. Please correct this More... The BugTrackerUrl element in the nuspec file should be a valid Url. Please correct this More... The LicenseUrl element in the nuspec file should be a valid Url. Please correct this More... The ProjectUrl element in the nuspec file should be a valid Url. Please correct this More... The ProjectSourceUrl element in the nuspec file should be a valid Url. Please correct this More...

gep13 commented 4 years ago

@chtof @AdmiringWorm @mkevenaar with the exception of the activepresenter package, all other packages that have been mentioned in this issue have been re-validated, and are now passing correctly.

AdmiringWorm commented 4 years ago

@gep13 odd that the activepresenter package is still failing the bugtracker url, as this was one of the urls you added to the test library to ensure it succeed.

Is it possible that somehow the old original url is cached somehow and don't use the latest url?

chtof commented 4 years ago

Could you launch a rerun (checbox not displayed, ony reject) on chirp/chirp.install/chirp.portable (I can't and are still in failed status due to invalid URLs).

gep13 commented 4 years ago

@AdmiringWorm said... Is it possible that somehow the old original url is cached somehow and don't use the latest url?

I have included this URL in one of the new unit tests that I have, and it works fine on my machine. However, when running on the machine where the package-validator is running, it still fails to make a valid TLS/SSL connection to the site, and therefore fails. It is "something" missing on that machine I think, but I haven't yet figured out what it is. It is on my list to track down...

gep13 commented 4 years ago

@chtof said... Could you launch a rerun (checbox not displayed, ony reject) on chirp/chirp.install/chirp.portable (I can't and are still in failed status due to invalid URLs).

This should all be taken care of now, right?

AdmiringWorm commented 4 years ago

It is "something" missing on that machine I think, but I haven't yet figured out what it is. It is on my list to track down...

.NET 4.5 perhaps not installed on that machine? That version is minimum required to be installed to allow TLS 1.2 connections (.NET 4.8 for TLS 1.3).

gep13 commented 4 years ago

@AdmiringWorm nope, it isn't that 😢 That was the first thing that I checked. There is something else causing it to fail.