repology / repology-updater

Repology backend service to update repository and package data
https://repology.org
GNU General Public License v3.0

[New repository request] pacstall #1212

Closed AndersonTorres closed 2 years ago

AndersonTorres commented 2 years ago

Pacstall is The AUR Ubuntu never had.

Package manager's GitHub page: https://github.com/pacstall/pacstall
Package repository's GitHub page: https://github.com/pacstall/pacstall-programs

Cross issue: https://github.com/pacstall/pacstall-programs/issues/257


I am also opening this issue to ask for help.

On the requirements page you say:

> We expect package data to be in a machine readable format, which does not require complex parsing code, not mentioning execution of third party code.

How exactly is this package data retrieved?

Is it acceptable to, say, create a packages.json file at the root of the pacstall-programs GitHub repository (the file would be updated by a bot maintained by the pacstall dev team)?

Are there other alternatives?

AMDmi3 commented 2 years ago

Repology fetches files over HTTP, so it does not matter where the file is located: you may place it in the same repository as the package recipes, in a separate repository, on GitHub Pages, or on a website; any of these will work. You may not want to place it in the same repo as the recipes, though, so as not to pollute its history with bot commits.
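To illustrate the point above, a consumer only needs an HTTP GET and a JSON parser. The sketch below assumes a hypothetical dump format (a JSON list of objects with `name` and `version` keys) and a placeholder URL; neither is taken from Repology or pacstall.

```python
import json
from urllib.request import urlopen

# Hypothetical location of the dump; any HTTP-reachable URL would do.
DUMP_URL = "https://example.org/packages.json"


def parse_dump(text):
    """Turn the JSON text of the (assumed) dump into (name, version) pairs."""
    return [(entry["name"], entry["version"]) for entry in json.loads(text)]


def fetch_dump(url=DUMP_URL):
    """Download the dump over HTTP and parse it."""
    with urlopen(url, timeout=30) as response:
        return parse_dump(response.read().decode("utf-8"))
```

Keeping the parsing step separate from the download makes it easy to check the format offline before the file is published anywhere.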

wizard-28 commented 2 years ago

Okay here's the script: https://sourceb.in/HcFeVkRDWi

Here's the syntax:

  1. ./pacparse.py

    For parsing all the packages in the repository.

  2. ./pacparse.py -p <path to packages>

    This parses only the specified packages, plus the -git packages (those need to be parsed often for accurate data).

Currently the resulting JSON is written to disk. We haven't decided where to upload it yet.
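The two invocations above could be modelled with `argparse` roughly as follows. This is only a guess at the shape of the interface, not the actual pacparse.py: the `build_parser` helper, the `packages` destination name, and the choice to accept several paths after `-p` are all assumptions.

```python
import argparse


def build_parser():
    """Build a command-line parser mirroring the two usages described above."""
    parser = argparse.ArgumentParser(prog="pacparse.py")
    # -p is optional; when it is absent, every package in the repository is parsed.
    parser.add_argument(
        "-p",
        dest="packages",
        nargs="+",
        default=None,
        metavar="PATH",
        help="parse only these packages (assumed to accept one or more paths)",
    )
    return parser


def selected_packages(argv):
    """Return None for 'parse everything', or the list of requested paths."""
    return build_parser().parse_args(argv).packages
```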

AMDmi3 commented 2 years ago

Timeout

wizard-28 commented 2 years ago

We have the API ready, it's on our website.

https://pacstall.dev/

The API endpoints are:

  1. /api/packages: Paginated JSON response of all parsed packages.
  2. /api/packages/<package_name>: JSON response for that package.
  3. /api/packages/<package_name>/dependencies: Dependencies for that package.
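For reference, the three endpoints listed above can be built from a package name as in the sketch below. The helper names are made up for illustration, the response formats are not shown in this thread, and the endpoints may have changed since.

```python
from urllib.parse import quote

BASE_URL = "https://pacstall.dev"


def packages_url():
    """Endpoint 1: paginated list of all parsed packages."""
    return f"{BASE_URL}/api/packages"


def package_url(name):
    """Endpoint 2: data for a single package (name is percent-encoded)."""
    return f"{BASE_URL}/api/packages/{quote(name, safe='')}"


def dependencies_url(name):
    """Endpoint 3: dependencies of a single package."""
    return f"{package_url(name)}/dependencies"
```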

The API was generously written by @saenai255.

AMDmi3 commented 2 years ago

Will check it out tomorrow, thank you!

AMDmi3 commented 2 years ago

saenai255 commented 2 years ago

Hi @AMDmi3

I just read the requirements you linked before and it seems that our api doesn't fit well. I'll add a new endpoint, hopefully this week, that'll return all the packages in a format that follows your requirements.

Thanks!

AMDmi3 commented 2 years ago

Good. Note that pagination is not the only issue.

saenai255 commented 2 years ago

@AMDmi3 we released a new endpoint for repology. You can find the spec here

Waiting for your feedback :)

saenai255 commented 2 years ago

Ping @AMDmi3

AMDmi3 commented 2 years ago

It looks mostly good, but the hashes are still a major issue.

Mar0xy commented 2 years ago

The length of the -git hashes/versions has been reduced to 8 characters.
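A minimal sketch of that shortening, assuming the version is simply the leading characters of the full commit hash (the `shorten_hash` helper is hypothetical, not pacstall's code):

```python
def shorten_hash(commit_hash, length=8):
    """Return the first `length` characters of a commit hash."""
    return commit_hash[:length]
```

For example, a 40-character hash such as `756339fdaef0e47091406011810939188cb5ef15` becomes `756339fd`.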

AMDmi3 commented 1 year ago

FYI, pacstall parsing is currently broken

obs-backgroundremoval-git: ERROR: Attempt to spawn Package with unset version

@Zahrun

Elsie19 commented 1 year ago

Should be fixed now.

AMDmi3 commented 1 year ago

Now

notion-git: ERROR: Attempt to spawn Package with unset version

If these problems persist I'll have to remove pacstall from repology.

Elsie19 commented 1 year ago

Ok I found the issue: It seems that notion changed their main branch from master to main, so I’ll fix that soon.

Elsie19 commented 1 year ago

Fixed. Are there any more that come up?

Elsie19 commented 1 year ago

Some other devs and I are discussing removing git packages from the API, as they are flaky and dynamic. The other option we are considering is to simply remove the version from the API if it's empty. Does either of these seem better to you?

AMDmi3 commented 1 year ago

Hiding git packages from the API is not acceptable, as it would lead to incomplete repository information, and we don't tolerate that. An empty/missing version may be acceptable as long as the repository treats it as expected. In that case I can handle it on the Repology side by using a git placeholder (or another one you suggest), as Repology just can't work with empty versions.

Elsie19 commented 1 year ago

Ok we are going to add a filter that removes packages with no version. This will only apply to git packages (because they're dynamic), and we will probably set up something on our end to notify us of zeroed git package versions.

AMDmi3 commented 1 year ago

> Ok we are going to add a filter that removes packages with no version

As I've just mentioned, this is not acceptable.

saenai255 commented 1 year ago

Currently we don't have a way to get semver-formatted versions from git packages. Is it fine if we give the commit hash, or maybe supply an additional flag that marks the package as Git-based?

Edit: Each package already returns a type attribute. You could check the type and mark the version as git. All git packages use type Source Code

Elsie19 commented 1 year ago

Ok would using the literal text git for missing versions on our end work?

AMDmi3 commented 1 year ago

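Let me elaborate.

  • semver versions for all packages are not required
  • as long as you explicitly state that there are packages with no version, and [all of] these are always-latest-fetch-from-git packages, this is also OK
  • there are no requirements on specific protocol to convey this, but I prefer to interpret data from you instead of generating any data, so, given that repology requires at least something to show as a version, I prefer it to be set to e.g. git (or commit hash or whatever is convenient for you, as long as it's not too long)
  • repology also needs to set a rolling flag on these packages, which can be done based on version or a separate field
  • the only thing not acceptable is hiding some packages from repology

Summarizing, if you set version to git for these versionless packages, and their rolling status can be detected based on type, it would be perfect.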
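On the producing side, the suggestion above amounts to something like the following sketch: every package is kept, and only a missing version is replaced by the literal `git`. The dictionary layout and the helper names are assumptions for illustration, not the pacstall API's actual code.

```python
PLACEHOLDER_VERSION = "git"


def normalize_version(package):
    """Return a copy of `package` whose version is never empty."""
    normalized = dict(package)
    if not normalized.get("version"):
        normalized["version"] = PLACEHOLDER_VERSION
    return normalized


def export_packages(packages):
    """Keep every package; only fill in missing versions."""
    return [normalize_version(package) for package in packages]
```

The important property is that the output has exactly as many entries as the input, so no package is hidden from the consumer.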

AMDmi3 commented 1 year ago

@Henryws there seem to be more problems: https://repology.org/log/13924555 But let's better fix this in general as discussed instead of plugging individual ones.

Elsie19 commented 1 year ago

See, we do show a hash. The issue was that when you run git ls-remote $url master on a URL that has no master branch, it returns nothing, which is why this error pops up for you. Normally it displays a proper hash.
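One way to guard against a renamed default branch is to try several candidate branch names and treat empty `git ls-remote` output as "not found" rather than as a version. The sketch below is an assumption about how that could look, not pacstall's actual script; the function names and the list of candidate branches are hypothetical.

```python
import subprocess


def parse_ls_remote(output):
    """Return the first commit hash in `git ls-remote` output, or None if empty."""
    for line in output.splitlines():
        if line.strip():
            # Each line has the form "<hash>\t<ref>".
            return line.split()[0]
    return None


def latest_commit(url, branches=("main", "master")):
    """Try each candidate branch in turn; return (branch, hash) or None."""
    for branch in branches:
        result = subprocess.run(
            ["git", "ls-remote", url, branch],
            capture_output=True,
            text=True,
            check=False,
        )
        commit = parse_ls_remote(result.stdout)
        if commit is not None:
            return branch, commit
    return None
```

A `None` result can then be reported or logged explicitly instead of silently producing an empty version.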

Elsie19 commented 1 year ago

> @Henryws there seem to be more problems: https://repology.org/log/13924555 But let's better fix this in general as discussed instead of plugging individual ones.

henry@twilight ~ % ››› wget -q -O - https://gitlab.gnome.org/p3732/os-installer/-/commits/main | grep 'data-clipboard-text="' | head -1 | cut -f14 -d'"'
756339fdaef0e47091406011810939188cb5ef15

Huh, I'm getting a version

Elsie19 commented 1 year ago

> Let me elaborate.
>
>   • semver versions for all packages are not required
>   • as long as you explicitly state that there are packages with no version, and [all of] these are always-latest-fetch-from-git packages, this is also OK
>   • there are no requirements on specific protocol to convey this, but I prefer to interpret data from you instead of generating any data, so, given that repology requires at least something to show as a version, I prefer it to be set to e.g. git (or commit hash or whatever is convenient for you, as long as it's not too long)
>   • repology also needs to set a rolling flag on these packages, which can be done based on version or a separate field
>   • the only thing not acceptable is hiding some packages from repology
>
> Summarizing, if you set version to git for these versionless packages, and their rolling status can be detected based on type, it would be perfect.

Would showing the hash for git packages that work, and showing the text git for missing versioned ones work? Obviously we'd fix those, but would that work?

AMDmi3 commented 1 year ago

> Each package already returns a type attribute. You could check the type and mark the version as git. All git packages use type Source Code

type is set to Source Code for official releases as well, so we cannot use that.

> Would showing the hash for git packages that work, and showing the text git for missing versioned ones work? Obviously we'd fix those, but would that work?

That would be an acceptable value for version, but I also need to reliably distinguish these packages to handle them specially (otherwise most of them will be marked as outdated). I don't see anything usable for that purpose. It would work if you set version to git for all dynamic packages, or if you convey it some other way.
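To make the last option concrete, here is a rough sketch of how a consumer could derive the rolling flag if version were set to `git` for all dynamic packages. It is not Repology's actual parser; the `ParsedPackage` class, the field names, and the error message are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ParsedPackage:
    name: str
    version: str
    rolling: bool


def parse_entry(entry):
    """Build a package record; rolling status is derived from the version only."""
    version = entry.get("version")
    if not version:
        # The consumer interprets data rather than inventing it,
        # so an empty version is reported as an error.
        raise ValueError(f"{entry['name']}: unset version")
    # The type field is ignored on purpose: "Source Code" is also
    # used for official releases, so it cannot identify git packages.
    return ParsedPackage(entry["name"], version, rolling=(version == "git"))
```

With this rule, a package that reports a short commit hash would not be recognized as rolling, which is why a single agreed marker for all dynamic packages matters.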