john0isaac / markdown-checker

Markdown Links Validation Reporting Tool
https://pypi.org/project/markdown-checker
MIT License

Add techcommunity to unchecked URLs list #78

Closed pamelafox closed 1 week ago

pamelafox commented 1 week ago

Unfortunately, the new platform returns a 400 for programmatic requests:

response = requests.head("https://techcommunity.microsoft.com/blog/azurearchitectureblog/azure-openai-landing-zone-reference-architecture/3882102")
response.status_code
400
response
<Response [400]>

So I think it has to be added to the uncheckable list :(

pamelafox commented 1 week ago

ACTUALLY: techcommunity does return a 200 if you do a GET, just not with a HEAD. So what if the link checker first does a HEAD, then tries a GET if it gets a 400?
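The HEAD-then-GET idea could look roughly like the sketch below. It is not the markdown-checker code: `status_with_get_fallback`, `RETRY_WITH_GET`, and the `client` parameter are hypothetical names, and `client` is assumed to be anything with requests-style `head` and `get` methods, such as `requests.Session()` or the `requests` module itself.

```python
# Sketch only: the names here are illustrative, not markdown-checker's API.

# HEAD statuses that trigger a retry with GET. 400 is the techcommunity
# case from this issue; extend the set as needed.
RETRY_WITH_GET = frozenset({400})


def status_with_get_fallback(url, client, timeout=10):
    """Return the HEAD status, or the GET status if HEAD hit a retryable code.

    `client` is assumed to expose requests-style head()/get() methods,
    for example requests.Session() or the requests module.
    """
    response = client.head(url, timeout=timeout, allow_redirects=True)
    if response.status_code in RETRY_WITH_GET:
        # Some servers reject HEAD but serve the same URL over GET.
        response = client.get(url, timeout=timeout, allow_redirects=True)
    return response.status_code
```

With the real library this would be called as `status_with_get_fallback(url, requests.Session())`. Going by the observation above, the techcommunity link should then report the GET status instead of the 400 from HEAD, though server behaviour can change.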

pamelafox commented 1 week ago

See revelation.

john0isaac commented 1 week ago

Sorry for the delay, Pamela. It's hit and miss when it comes to relying on the status code 200.

https://github.com/john0isaac/markdown-checker/blob/75e863d4c41c483eb8cfe9ae25bff8d6e37daf89/src/markdown_checker/urls.py#L30

We are doing a HEAD request, then falling back to GET on 405s.

But I'm thinking of expanding this: fall back to a GET request on more than just 405, and accept any 2XX code as a working URL instead of just 200, as a solution to this issue.
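A minimal sketch of that idea, under stated assumptions: the comment does not say which HEAD results should trigger the GET, so this version retries on any non-2XX status. `is_success`, `url_works`, and `client` are hypothetical names, and `client` is assumed to offer requests-style `head` and `get` methods (for example `requests.Session()`).

```python
# Sketch only: not the code in src/markdown_checker/urls.py.

def is_success(status_code):
    """True for any 2XX status, not only 200."""
    return 200 <= status_code < 300


def url_works(url, client, timeout=10):
    """Report whether `url` answers with a 2XX status to HEAD or, failing that, GET."""
    response = client.head(url, timeout=timeout, allow_redirects=True)
    if not is_success(response.status_code):
        # Assumption: every non-2XX HEAD result is retried with GET.
        response = client.get(url, timeout=timeout, allow_redirects=True)
    return is_success(response.status_code)
```

Treating the whole 2XX range as success means responses such as 204 count as working links. The cost is one extra GET for every link whose HEAD fails.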

I looked at an implementation for this in JavaScript and that's what they were doing, so I went with it initially, but it seems not to work here in Python, so I will try my new idea. Let me know if you have any opinions about it.

pamelafox commented 1 week ago

Wouldn't it work in this case to execute the if for any 400-level status code instead of just 405?
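As a rough illustration of that condition, the check could be a range test on the HEAD status. `should_retry_with_get` is a hypothetical helper name, not something in markdown-checker.

```python
# Sketch only: a possible replacement for an "== 405" comparison.

def should_retry_with_get(head_status):
    """True when the HEAD status is in the 4XX (client error) range."""
    return 400 <= head_status < 500


# 400 (the techcommunity case) and 405 both trigger the GET retry;
# 2XX and 5XX results do not.
for status in (200, 400, 404, 405, 500):
    print(status, should_retry_with_get(status))
```

One trade-off: a page that is really missing (404) would also be retried, costing one extra request before it is reported as broken.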

(Replying by email to John Aziz's comment of Thu, Nov 7, 2024 at 7:19 PM.)