Open johnmhoran opened 7 months ago
Remove /blob from the URLs. So use instead:
f'https://raw.githubusercontent.com/CocoaPods/Specs/master/Specs/{hashed_path}/{name}/{version}/{name}.podspec.json'
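A minimal sketch of the pattern above wrapped in a helper, so the result can be checked for a stray /blob segment. The function name `raw_podspec_url` and the sample values (`a/b/c`, `ExamplePod`, `1.0.0`) are hypothetical placeholders, not taken from cocoapods.py. How `hashed_path` is derived is out of scope here, so the caller is assumed to supply it.

```python
def raw_podspec_url(hashed_path, name, version):
    """Build a raw.githubusercontent.com URL for a podspec JSON file.

    hashed_path is assumed to be computed elsewhere by the caller.
    """
    return (
        "https://raw.githubusercontent.com/CocoaPods/Specs/master/Specs/"
        f"{hashed_path}/{name}/{version}/{name}.podspec.json"
    )


# Hypothetical sample values, for illustration only.
url = raw_podspec_url("a/b/c", "ExamplePod", "1.0.0")
assert "/blob" not in url
```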
BTW, some of the URL data in these four .json files is not (or no longer) valid,
This would be another issue entirely ... We should have separate ways to crawl and tag invalid or dead URLs, and this would likely be implemented in the PurlDB, as an improver.
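As a rough illustration of what tagging dead URLs could look like, here is a standalone sketch using only the Python standard library. It is not PurlDB code: `fetch_status`, `tag_url`, and the tag names are hypothetical. Note that `urlopen` follows redirects, so a homepage that now redirects to an unrelated site would still be tagged "ok"; detecting that case would need an extra comparison of the final URL's host.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def tag_url(url, fetch=fetch_status):
    """Map a URL to a coarse tag; fetch is injectable for offline testing."""
    status = fetch(url)
    if status is None:
        return "unreachable"
    if status in (404, 410):
        return "dead"
    if 200 <= status < 400:
        return "ok"
    return "error"
```

Passing a stub as `fetch` (for example `lambda url: 404`) exercises the tagging logic without touching the network.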
In connection with a purl2url issue in packageurl-python, I've been exploring the URL-related code in packagedcode's cocoapods.py. With the four PURL spec examples for cocoapods, I got the following results looking for potentially useful JSON files.

Using the pattern from the get_urls() api_data_url variable, we get URLs that each lead to a 404: Not Found page.

A few lines above that pattern in get_urls() is a pattern for the specs_json_cdn_url variable. For the same four cocoapods PURLs, this pattern generates valid URLs to cdn.cocoapods.org JSON files.
BTW, some of the URL data in these four .json files is not (or no longer) valid, e.g., the "homepage" URL for ShareKit (http://getsharekit.com/) leads to what might be a Turkish-language page -- https://smartem.org/. (The "source"/"git" URL (https://github.com/ShareKit/ShareKit.git) is valid and reflects that the last commit was made in December 2017.)