repology / repology-webapp

Repology web application
https://repology.org
GNU General Public License v3.0

search.cpan.org now redirects to metacpan.org #217

Closed oalders closed 1 year ago

oalders commented 1 year ago

search.cpan.org links still work, but they are redirected to metacpan. This commit removes the need for redirection.

Leont commented 1 year ago

I had just opened a ticket about that, but apparently in the wrong repository.

oalders commented 1 year ago

Thanks, @AMDmi3. MetaCPAN now links back to Repology as well.

AMDmi3 commented 1 year ago

@oalders Thank you! What do you think of Repology dropping CPAN support completely? It should be completely superseded by MetaCPAN, but I'm hesitant since there are discrepancies in packages and versions. For instance,

These may suggest problems in either Repology or MetaCPAN.

For instance, I'm particularly puzzled with entries like these ones:

  {   
    "_id": "7E8HHzhCscGW60u0I_QAcdhWkjU",
    "_score": 1,
    "_type": "release",
    "_index": "cpan_v1_01",
    "fields": {
      "abstract": [
        "Perl extension mainpulating XML based Visio files"
      ],
      "status": [
        "latest"
      ],
      "maturity": [
        "released"
      ],
      "distribution": [
        "Visio"
      ],
      "name": [
        "Visio-1.010"
      ],
      "download_url": [
        "https://cpan.metacpan.org/authors/id/A/AA/AAKHTER/Visio-1.010.tar.gz"
      ],
      "author": [
        "AAKHTER" 
      ],
      "license": [
        "unknown"
      ],
      "version": [
        "1.008"
      ] 
    } 
  },  
  {   
    "_type": "release",
    "fields": {
      "status": [
        "latest"
      ],
      "license": [
        "perl_5" 
      ],
      "maturity": [
        "released"
      ],
      "download_url": [
        "https://cpan.metacpan.org/authors/id/Z/ZI/ZIGOROU/Data-ClearSilver-HDF-0.04.tar.gz"
      ],
      "abstract": [
        "Convert from Perl Data Structure to ClearSilver HDF"
      ],
      "distribution": [
        "Data-ClearSilver-HDF"
      ],
      "name": [
        "Data-ClearSilver-HDF-0.04"
      ],
      "version": [
        "0.03"
      ],
      "author": [
        "ZIGOROU"
      ]
    },  
    "_index": "cpan_v1_01",
    "_score": 1,
    "_id": "PIbpBI4Q1pax3FmMI4A_dKdAQr0"
  },  

Why is the version field different from the version in the other fields (name and download_url)? Where should Repology take the version from?

haarg commented 1 year ago

Regarding mismatched versions:

CPAN dists can have versions listed in multiple places.

The first is the version listed in the individual modules. Since CPAN dists can contain multiple modules, these versions are not always the same for a given dist. This is what gets used when PAUSE (the CPAN upload server) indexes the releases.

The second is in META.json or META.yml files shipped with the dist. This is what is reflected in MetaCPAN's API for the version field. This is not used at all by PAUSE, or by the normal CPAN installation tooling. Ideally, this would always be accurate, but authors sometimes don't properly update their metadata. This is what has happened with the Visio and Data-ClearSilver-HDF examples shown.

The third is in the release tarball name. While the value of this is not used by PAUSE or CPAN installation tools, PAUSE does require that uploaded files have a permanently unique name. So a failure to update the version in the tarball name results in a failed upload. Due to this, it is often more trustworthy than the version taken from the metadata. We should probably add this value to MetaCPAN's API.

Obviously this is all rather ugly.
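
As a rough illustration of preferring the tarball-derived version, here is a minimal Python sketch. It assumes each record has the shape of the `fields` objects quoted earlier in this thread (every value wrapped in a list) and that the release `name` is the `distribution` plus `-` plus the version. The helper names `version_from_release_name` and `pick_version` are hypothetical; they are not part of Repology or the MetaCPAN API.

```python
from typing import Optional


def version_from_release_name(name: str, distribution: str) -> Optional[str]:
    """Return the version suffix of a release name such as 'Visio-1.010'.

    Returns None when the name does not start with '<distribution>-'.
    """
    prefix = distribution + "-"
    if name.startswith(prefix) and len(name) > len(prefix):
        return name[len(prefix):]
    return None


def pick_version(fields: dict) -> str:
    """Prefer the version embedded in the release name over the metadata one."""
    name = fields["name"][0]
    distribution = fields["distribution"][0]
    meta_version = fields["version"][0]
    from_name = version_from_release_name(name, distribution)
    # Fall back to the META.json/META.yml version if the name cannot be parsed.
    return from_name if from_name is not None else meta_version


# Trimmed copies of the two records quoted above.
visio = {"distribution": ["Visio"], "name": ["Visio-1.010"], "version": ["1.008"]}
hdf = {
    "distribution": ["Data-ClearSilver-HDF"],
    "name": ["Data-ClearSilver-HDF-0.04"],
    "version": ["0.03"],
}

print(pick_version(visio))  # 1.010
print(pick_version(hdf))    # 0.04
```

Matching on the `distribution` prefix rather than splitting on the last `-` avoids guessing where the version starts; whether that holds for every release on CPAN is an assumption that would need checking against real data.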

oalders commented 1 year ago

> What do you think of Repology dropping CPAN support completely?

I think that's a worthy goal to work towards. I think it would be worthwhile to take a close look at the packages involved first. There might be some interesting edge cases in there. HTML::QuickCheck looks to be one of those: https://explorer.metacpan.org/?url=%2Frelease%2FYLU%2FHTML-QuickCheck-1.0b1

MetaCPAN has "main_module" : "HTML'QuickCheck". The ' as a package separator is legal, but we should be normalizing the name to HTML::QuickCheck.
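
On the consumer side, the normalization could be sketched in Python like this. It assumes a plain substitution is enough, since `'` has no other role inside a Perl package name; `normalize_module_name` is a hypothetical helper, not an existing Repology or MetaCPAN function.

```python
def normalize_module_name(name: str) -> str:
    """Replace the legacy ' package separator with the usual :: form."""
    return name.replace("'", "::")


print(normalize_module_name("HTML'QuickCheck"))  # HTML::QuickCheck
```

Names that already use `::` pass through unchanged, so the function is safe to apply to every `main_module` value.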