commercetest / nlnet

Analysis of the open-source codebases of NLnet-sponsored projects.
MIT License

Canonical format of github.com urls #17

Closed: julianharty closed this issue 5 months ago

julianharty commented 5 months ago

Context

At least one URL in the dataset has a www. prefix on the repo's host name (it also uses http: rather than the now commonplace https: protocol).

Before using a URL dynamically, let's process the github.com repo URLs to remove any www. prefix and replace http: with https:.

Example

http://www.github.com/asicsforthemasses

The canonical form is https://github.com/asicsforthemasses
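
A minimal sketch of that clean-up in Python, using only the standard library. The function name `canonical_github_url` is hypothetical and is not taken from this repo's code. It assumes the input already carries a scheme, and it leaves non-GitHub URLs untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_github_url(url: str) -> str:
    """Return url with https: and without a leading www. if it points at github.com.

    Hypothetical helper for illustration; URLs for other hosts, or URLs
    without a scheme, are returned unchanged.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()

    # Drop a leading "www." so www.github.com and github.com compare equal.
    if host.startswith("www."):
        host = host[len("www."):]

    if host != "github.com":
        return url

    # Force https and keep the path, query and fragment as they were.
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))


print(canonical_github_url("http://www.github.com/asicsforthemasses"))
```

This only rewrites the scheme and host. Whether to also strip trailing slashes or a `.git` suffix is a separate choice that the issue does not cover.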

julianharty commented 5 months ago

PS: the project's repo is probably https://github.com/asicsforthemasses/LunaPnR

julianharty commented 5 months ago

Here are links to the SO answers that provided the basis for the approach