srevinsaju / zap

:zap: Delightful AppImage package manager
https://zap.srev.in
MIT License
513 stars 18 forks source link

Self Hosted Repositories #81

Open trymeouteh opened 2 years ago

trymeouteh commented 2 years ago

Please add support for self-hosted Zap repositories, where users can add or remove repositories within Zap and download AppImages from them.

Zap will come with the official Zap repository.

Other package managers do this such as Pacstall, Flatpak and F-Droid.

The self-hosted repository server should also be fully open source.

srevinsaju commented 2 years ago

Zap will only support AppImages for now. Support for other package managers is not something this repository would like to pursue. We do not have any trustable AppImage index other than appimage.github.io.

misog commented 2 years ago

@srevinsaju

Github.com could be used as the index because it offers structured releases. AppImage trust/reputation could then be implied from Github stars. The model could be similar to how people install .exe files found through Google search, where each software page has its ranking (like trust/reputation). The difference is that Zap would be the command-line browser.

To increase trust even more, "verified" Github.com accounts could be maintained by Zap or external list maintainer, for ex. when VSCodium project has official Github.com repo, it can be "verified" by Zap or external list maintainer. No need to maintain own lists and builds, just one flag.

More here https://github.com/srevinsaju/zap/issues/77

srevinsaju commented 2 years ago

appimage.github.io contains only references to GitHub.com and other source code repositories' release pages. appimage.github.io does not host any AppImages. Basically, appimage.github.io works as a verified and tested list of AppImage repositories.

misog commented 2 years ago

In other words, AppImage.github.io is a very expensive list of "verified" flags. Unverified projects are not available; they are not discoverable in Zap. For example, VSCodium is not available in AppImage.github.io / Zap even though it has releases on Github.com.

So the problem is discoverability. Github.com solves this by providing search API. Separate problem is ranking - Github.com solves this by Stars. Another problem is "approval/verification" flag/badge, this is solved by AppImage.github.io however this limits other aspects.

So, a good package manager could:

  1. Discover all discoverable packages

  2. Rank them by criteria

  3. Sort them by the rank

  4. Manage installations

  5. Possibly, cache packages and other assets (images, descriptions, ...) locally or on servers.

Github.com and other distribution platforms could help with point 1. Github.com in combination with external lists which maintain verified packages (appimage.github.io) could help with point 2.

srevinsaju commented 2 years ago

As of right now, zap supports GitHub releases although it's not really implicit. zap install --github --from vscodium/vscodium vscodium installs vscodium from the official organization. We are also working on adding a simpler shortcut to install it, i.e., zap install vscodium/vscodium will automatically install vscodium from the organization vscodium without the extra flags. I am not sure if GitHub Stargazers are a "way" to differentiate between an official repo, and what not. If the suggested "stargazer+search" API from GitHub is used, it adds a lot of entropy to the package installed; someone could attack the supply chain and insert vulnerabilities. Another reason is that zap does not claim to be a package manager built on GitHub. Sometimes, the above does not make the user responsible in the case where the source is compromised. But I am interested to see if someone could come up with a fair solution which does not favor GitHub alone, but also integrates an API which helps zap use other popular open source and proprietary VCS's like GitLab, Gitea, etc.

misog commented 2 years ago

The --from flag and its short alternative is very nice, however it does not solve discoverability. Discoverability is very important and it is the main aspect that package managers solve by robust lists and infrastructure. However, robust lists are like web search engines before Google and other modern search engines where records were managed manually, pages sorted alphabetically or by popularity which were manually maintained. Discoverability can not be solved with finite lists effectively. Google then implemented crawling, which automated the process - which is similar to API access.

As a user, I do not care where the package is placed; I just want to type one command with approximate package name and have the package installed. I may not know the exact name, so package manager should discover/search it for me. This is true while searching AppImages and .exe with modern search services such as Google. Somehow, the discoverability service will present me sorted results, Google uses its index which is based on multiple criteria.

I am not sure if GitHub Stargazers are a "way" to differentiate between an official repo, and what not.

Github Stars are a very good basic indicator that I may want that package, because most people will search for popular packages. It is similar to how Google ranks webpages by popularity or term searches. Yes, some malicious site (not hacked, but malicious from the start) could get a high rank in search results, however it is not probable at all. Similarly, popular packages on Github.com are popular because many people use them. Maybe there is even a backdoor in the codebase, however people trust that there are no backdoors.

Incidents can happen on popular webpages, as well as in popular Github.com releases. Attackers could hack websites and Github.com accounts as well. It is the responsibility of owners to secure their websites and Github.com accounts. No verification from Zap or appimage.github.io can ever prevent such attacks. Even now, Zap could happily install hacked/malwared packages and we would know in days, weeks, or never!

Moreover, I would argue that using appimage.github.io is a big security risk. Because attackers could just hack appimage.github.io one time and hack ALL the packages by changing source URLs. But Attackers will not hack ALL Github.com accounts at once.

Of course, Stars are just one indicator. Maybe, the fair rank method is package_stars + date_of_last_commit. Github.com provides many indicators.

This could be generalized for GitLab and others. At least they provide date_of_last_commit for sure.
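The stars-plus-recency idea could be sketched roughly like this. It is only an illustration, not how Zap works: the `Candidate` type, the `rank_score` function, the log scaling of stars and the one-year half-life are all assumptions picked for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
import math


@dataclass
class Candidate:
    """Hypothetical search result; the field names are assumptions."""
    full_name: str         # e.g. "vscodium/vscodium"
    stars: int             # star count reported by the platform
    last_commit: datetime  # timezone-aware date of the last commit


def rank_score(c: Candidate, now: datetime, half_life_days: float = 365.0) -> float:
    """Combine popularity and freshness into a single number.

    Stars are log-scaled so very large projects do not drown out the rest,
    and the score is halved for every `half_life_days` since the last commit.
    """
    age_days = max((now - c.last_commit).total_seconds() / 86400.0, 0.0)
    freshness = 0.5 ** (age_days / half_life_days)
    return math.log1p(c.stars) * freshness


def rank(candidates, now=None):
    """Return the candidates sorted best-first."""
    now = now or datetime.now(timezone.utc)
    return sorted(candidates, key=lambda c: rank_score(c, now), reverse=True)
```

With these weights an abandoned project with many stars can fall below a small but active one, which may or may not be desirable; a platform that only exposes the date of the last commit could pass `stars=0` for every candidate and still get an ordering by freshness alone only if the log term is replaced, so the formula would need adjusting there.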

Then, one JSON file placed on Zap server or appimage.github.io can provide "verification" of Github.com packages. This could give higher rank to discovered packages or Zap could offer flag --only-verified:

//  github.json
{
    "verified": [
        "vscodium/vscodium",
        "something/somesoftware"
    ],
    "malicious": [
        "firefoxx/firefox"
    ]
}

Third party verification services could be used by Zap. They just provide list of verified or malicious packages. Maybe Zap user could install own list for personalized protection. For example, software developer would install curated list which contains verified packages and graphic designer would install another list of curated packages.
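A list in the shape of github.json could be consumed roughly as follows. This is a sketch under assumptions: the helper names (`load_flags`, `classify`, `filter_results`) are made up for the example, the embedded data is written as strict JSON (double-quoted keys and strings, no trailing comma), and `only_verified` stands in for the proposed --only-verified flag, which does not exist in Zap today.

```python
import json

# Embedded copy of the example list, written as strict JSON.
EXAMPLE = """
{
    "verified": ["vscodium/vscodium", "something/somesoftware"],
    "malicious": ["firefoxx/firefox"]
}
"""


def load_flags(text: str) -> dict:
    """Parse the list and normalise repository names to lower case."""
    data = json.loads(text)
    return {
        "verified": {name.lower() for name in data.get("verified", [])},
        "malicious": {name.lower() for name in data.get("malicious", [])},
    }


def classify(repo: str, flags: dict) -> str:
    """Return "malicious", "verified" or "unknown" for an owner/repo name."""
    name = repo.lower()
    if name in flags["malicious"]:  # a block entry always wins
        return "malicious"
    if name in flags["verified"]:
        return "verified"
    return "unknown"


def filter_results(repos, flags, only_verified=False):
    """Drop malicious entries; optionally keep only verified ones."""
    kept = []
    for repo in repos:
        status = classify(repo, flags)
        if status == "malicious":
            continue
        if only_verified and status != "verified":
            continue
        kept.append(repo)
    return kept
```

Names are compared in lower case because Github.com treats owner and repository names case-insensitively; other platforms may need a different rule.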

But the point is, Zap users can decide for themselves to install any package on the internet (Github + GitLab + Bitbucket + ...). Just like they can search it with Google and install it or just like they can search it on Github.com, sort it by stars and install it :grin:

misog commented 2 years ago

@trymeouteh Sorry for spoiling the thread but if Zap supported Github.com integration then this would make all AppImages hosted on Github.com available in Zap and maybe it would be sufficient boost of new AppImages for your usecase. Other platforms such as GitLab could be integrated later.

Why do you want to have custom repositories? Would you like to have private repositories or would you like to just increase the number of available AppImages in Zap?

trymeouteh commented 2 years ago

The reason for having repositories is for users to be able to use an official zap repo and other 3rd party repos the user trusts to install AppImages that are verified by the repo maintainers. As long as the repo maintainer is trusted, this is a more secure way of installing AppImages than being able to install an AppImage from any github repo, since the user would have to choose the right repo and not the wrong repo that was forked and could be malicious.

misog commented 2 years ago

@trymeouteh Are there other AppImage repositories in addition to appImage.github.io ? It should be defined what repository means here. In general (as understood in Linux distros), repository includes maintainers that:

  1. Decide what software is offered
  2. Maintain packages and handle releases
  3. Decide which package versions are offered
  4. Decide when packages are released
  5. React to security incidents (patch vulnerabilities, block package version, ...)

AppImage.github.io is not such a repository because it does just one thing, or possibly two things:

  1. Decide what software is offered
  2. (possibly) React to security incidents (remove package from the list)

appImage.github.io just links to Github.com accounts which offer AppImages, and project owners do the rest. So it is not a repository, just a list of recommended Github software which supports the AppImage format.

AppImage repositories are neither feasible nor needed (points 2, 3, 4), because the point of the AppImage format is that you do not need maintainers as with Linux distros. AppImage is similar to .exe or .app or .dmg in the sense that you just need one file and it can be executed. AppImages are compatible with most Linux distros.

So let's call it "recommended list of AppImages" and not "repository".

The philosophy behind AppImages

A user can search for AppImages on the internet, download one and execute it. For example, the user searches "vscodium appimage" in Google, then verifies that the domain is the official domain of the VSCodium project, then verifies the SSL certificate for that domain by looking at the green lock near the URL, and then downloads the AppImage file.

Alternatively, user will search "vscodium" on Github.com, then click on the project with the most Github Stars, then verify that this is the official Github.com account of the project and then downloads the AppImage file.

This is the benefit of AppImages - users just go directly to the project authors and download AppImage.

Currently, Zap is not philosophically compatible with such approach because it does not allow to search for as many AppImages as possible. It limits search results by recommended list of approved Github.com accounts by appImage.github.io maintainers and that is the reason that AppImage projects are missing in Zap.

But yes, if Zap supported custom Github.com recommended lists then other people could easily create custom approved Github.com account lists with lower cost than appImage.github.io and personalized for their company/organization or category (graphic designer, programmer, sysadmin, home media center, ...)

I wrote JSON format for such list above, however now I think JSON is not really needed, it could be just simple file with lines:

# list of approved Github.com accounts with AppImage releases curated for artists
VSCodium/vscodium
Skrifa/Skrifa
feugy/melodie

trymeouteh commented 2 years ago

Does Zap do a checksum for every single appimage once downloaded to ensure it is not tampered with?

And if Zap supported recommended lists, would it support other git sites such as gitlab and gitea and not be centralized on github only?

misog commented 2 years ago

I do not think so. Even if it did, there would need to be an official or verified hashsum placed somewhere, and Zap would need to discover it and understand the format. In February, this was discussed: https://github.com/AppImage/appimage.github.io/issues/2830 However this is not feasible, and nowadays it is easy for software producers to distribute packages themselves with build platforms. There is no need to split package discovery and package distribution.
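For reference, the comparison step itself is small once a published hash is available. The sketch below assumes SHA-256 and a hex digest obtained from somewhere the user trusts; the function names are made up for the example and nothing here describes what Zap actually does.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large AppImages do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare the downloaded file against a hex digest published by the author."""
    actual = sha256_of_file(path)
    expected = published_hex.strip().lower()
    return hmac.compare_digest(actual, expected)
```

The hard part stays where the comment puts it: discovering an authentic hash and agreeing on its format, not computing it.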

I think that it is good idea to support recommended lists with arbitrary URL. For example, Zap would have flag such as --use-list=https://example.com/appimagelist.txt and then this file could be used to display badges in search results or modify search result order. However this depends on the decision of Zap author(s) to skip lists such as appimage.github.io and use software distribution platforms directly.
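A recommended list in the simple one-entry-per-line format could be parsed and applied to search results roughly like this. It is a sketch under assumptions: `parse_list` and `apply_list` are hypothetical names, lines starting with `#` are treated as comments, and downloading the file from the --use-list URL is left out.

```python
def parse_list(text: str) -> list[str]:
    """Parse a newline-separated list, skipping blank lines and '#' comments."""
    entries = []
    seen = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key = line.lower()
        if key not in seen:  # drop duplicates, keep the first spelling
            seen.add(key)
            entries.append(line)
    return entries


def apply_list(results: list[str], recommended: list[str]) -> list[tuple[str, bool]]:
    """Move recommended results to the front and tag each with a badge flag."""
    wanted = {name.lower() for name in recommended}
    tagged = [(name, name.lower() in wanted) for name in results]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(tagged, key=lambda item: not item[1])
```

Each returned pair is `(name, has_badge)`, so a front end could print a badge next to recommended results or hide the rest.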

srevinsaju commented 2 years ago

One issue with verified lists is that sometimes repositories get forked, and sometimes they are moved to a different organization. It becomes hard to keep track and keep the list up to date over time. Some repositories we add as verified might lose AppImage releases over time, and some might not follow the official AppImage naming conventions / AppImageSpec standards. I am interested in doing something similar to what Flathub managed to do, see https://github.com/flathub. By doing so, we are asking the maintainers to be responsible for the AppImages they create and want to put on the Appstore. They can then write custom scripts to build the AppImage from the CI, and then publish them to the releases within a GitHub organization that the zap / appimage community manages, similar to how Flathub manages it.

All we would want the application maintainers to add is, link to the AppImage, or a script to build it, simplifying their release process by a lot. Then, we can create checksums automatically, gpg verification if necessary, etc. Doing this opens a door of many possibilities!

misog commented 2 years ago

That issue is real but it is general issue present also in repositories managed by maintainers and index like Github.com or appimage.github.io. Someone needs to be responsible for keeping the list up to date.

Regarding Flathub. I think that a Flathub clone will not be successful. Because, why not just use Flathub which already offers so many packages if Flathub clone with AppImages offers fewer packages?

AppImages design replaces the need for solutions such as Flathub and offers advantages which can not be offered by Flathub.

Flathub model:

  1. Software authors publish binary or source (Firefox, VSCodium, Krita, ...).
  2. Anyone (maintainers) can package binaries or sources of any software (Firefox, VSCodium, Krita, ...) in Flatpak format.

This needs to have huge administration resources and creates problems which require more administration resources:

If an application that belongs to you is being distributed without your involvement, please get in touch with the Flathub admins, so that we can discuss transfering ownership. Source: https://github.com/flathub/flathub/wiki/App-Submission

This is problematic because final Flatpak binaries could be modified by anyone (maintainers) in a malicious way. They can also be modified to be more platform specific, which is a good thing but comes at the cost of possible malware, huge administration cost and a centralized online distribution system. Flathub is similar to Linux distro repositories; it just works for more than one Linux distribution.

On the other hand, AppImage format allows software authors to create one file and publish it where they want. No middleman. No maintainers. No disputes who owns the project. That is the advantage and disadvantage of AppImages.

With AppImages, discoverability of genuine AppImages must be solved by third party tools, for example, search engines or package managers.

[Image: appimage]

Source: https://appimage.org/

srevinsaju commented 2 years ago

Hmm, you got a point. Do you have any ideas on what we can do on apps like krita? https://krita.org/en/download/krita-desktop/

misog commented 2 years ago

It may look like AppImage is suited for regular OS users, however that is not true. Regular users just use Ubuntu with Snap/Snapcraft/Snap Store or Flatpak/Flathub and they do not care.

AppImage is more decentralized with all the benefits and disadvantages, mostly around trust and platform specificity. There is no big player in the AppImage ecosystem, however it could find its place on the market.

There are five cases in the AppImage ecosystem (popular platforms are for ex.: Github.com, Gitlab.com, SourceForge.net, ... and they are always structured):

  1. Software authors build and publish AppImages in a structured format on a popular platform.
  2. Software authors build and publish AppImages in a structured format on a not popular platform.
  3. Software authors build and publish AppImages in a non-structured format on a not popular platform.
  4. Software authors build AppImages.
  5. Software authors do not build AppImages.

Case 5. is the worst because it needs huge resources to convince them to publish AppImages or to package AppImages for them. However in the future, more companies will join the AppImage ecosystem. Many will leave because their product does not run with the same experience on every platform. And many will produce multiple AppImage builds for multiple platforms.

Case 4. is not very good, because software authors need to submit them somewhere (where? how? is it worth it?, ...).

Case 3. is good because third party recommended lists could emerge in the ecosystem - they could just link to genuine builds directly at HTTPS non-popular domains such as official websites of software projects (krita.org).

Case 2. is better because it is the same as 3. but with more automation. It would lower the cost of maintaining recommended lists. However this requires a standard. There was a discussion that maybe it is sufficient that software authors publish just hashes in a structured way and ecosystem would handle AppImage distribution while verifying authenticity in package managers: https://github.com/AppImage/appimage.github.io/issues/2830. However it is hard to convince software authors to do that at scale.

Case 1. is the best because it can be automatized and one integration could make many genuine AppImages discoverable for users.

So, Krita is case 3 because it offers URL on krita.org:

  1. Software authors build and publish AppImages in a non-structured format on a not popular platform.

It could be moved to case 2 or case 1 by convincing them or just implement something in AppImage ecosystem to handle case 3.

Case 3 can be handled by recommended lists from community. Maybe JSON format. Community can create such lists and host them online at HTTPS websites. Standards could help here. It is similar to appimage.github.io however appimage.github.io is a closed list and just links to projects and not AppImage files directly, which is problematic because Krita offers just URL of AppImage and not Github.com account: https://download.kde.org/stable/krita/5.0.6/krita-5.0.6-x86_64.appimage

(Krita is actually Case 1 with that kde.org popular platform but let's ignore this for now)

The format for a case 3 recommended list could be JSON and contain:

However with Case 1, it could be much simpler! Just Github.com URL or KDE.org URL is needed and everything can be automatically downloaded/cached/recached.

Take a look: https://download.kde.org/stable/ and https://download.kde.org/stable/krita/

So, Case 3 is all about building community around some JSON or other standard and Case 1 is all about building integrations of popular distribution platforms.

PS. I checked whether the Krita AppImage hosted on kde.org is actually built by the authors and it looks like it is, so it is Case 1. They also offer a community-maintained Flatpak hosted on Flathub. However, the first choice they offer is the AppImage, so they want to offer their own build with priority.

AppImage tools should offer ways to verify package integrity. For example, display the krita.org domain so the user can verify somehow that the authors published the file on kde.org. It would be nice if krita.org carried a cryptographic checksum (hashsum) that is automatically compared to the file downloaded from kde.org, however that needs some rethinking of standards and convincing of authors.

[Screenshots: 2022-07-24 15-14-37 and 2022-07-24 15-20-04]

srevinsaju commented 2 years ago

Hmm, good point on kde.org as a standard publishing format.

It is similar to appimage.github.io however appimage.github.io is a closed list and just links to projects and not AppImage files directly,

https://github.com/AppImage/appimage.github.io/blob/master/data/Krita

misog commented 2 years ago

Quote:

https://download.kde.org/stable/krita/4.3.0/krita-4.3.0-x86_64.appimage This is clearly not an ideal URL since it is specific to one version. Instead it should be specific to one channel, e.g., "release", "continuous", "alpha", "beta", etc.

That is not ideal URL for appimage.github.io, possibly the most expensive (per package) list ever maintained. They did a good job with the presentation data and so, but it can not scale. They maintain project graphics, images but they do not maintain latest AppImage file of a project? ...

URL pointing to one latest AppImage file is very ideal and sufficient for HTML websites, web search engines or package managers because the most important usecase is "to install latest AppImage release". install krita needs just URL of latest AppImage + name, search krita just needs URL of latest AppImage + name + one extra field for identification (example official project domain).

So, any AppImage discovery tool needs at minimum latest AppImage + project name for each project it can discover from recommended lists (case 3). Description, image, release hashes, official project domain, release history, etc are very useful information however they are not important for the goal of huge AppImage discoveries. Such information do not increase the number of discovered projects, often they can discourage people to update such information, they create inconsistencies (ex. one platform contains image gallery and another does not), ...

But I think that format of recommended lists should contain many additional fields such as Optional list of releases (structured, with dates etc), however the absence should not prevent it from inclusion. It would be the responsibility of maintainer to maintain such lists and not the responsibility of software authors to provide all information.
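The "two required fields, everything else optional" rule could look roughly like this in code. The `ListEntry` type, its field names and the dict input format are assumptions made for the example; the thread does not define a concrete format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ListEntry:
    """Hypothetical recommended-list entry; only the first two fields are required."""
    name: str                          # project name
    latest_url: str                    # direct URL of the latest AppImage
    domain: Optional[str] = None       # official project domain, for identification
    description: Optional[str] = None
    releases: list = field(default_factory=list)  # optional structured release history


def entry_from_dict(raw: dict) -> ListEntry:
    """Accept an entry as long as name and latest_url are present."""
    missing = [key for key in ("name", "latest_url") if not raw.get(key)]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")
    return ListEntry(
        name=raw["name"],
        latest_url=raw["latest_url"],
        domain=raw.get("domain"),
        description=raw.get("description"),
        releases=list(raw.get("releases", [])),
    )


def search(entries, term: str):
    """Case-insensitive substring search on the project name."""
    term = term.lower()
    return [e for e in entries if term in e.name.lower()]
```

An entry with only a name and a URL is accepted, so missing descriptions or images never keep a project out of the list.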

But this could be non-issue if AppImages are discovered on distribution platforms such as Github.com, KDE.org, etc. Then the latest AppImage can be discovered automatically and also other metadata such as description, image, etc can be scraped automatically. Also even there could be used recommended lists, for example list of Github.com repos or list of KDE.org project folders, however all the hard work can be automatized by scrapers and nothing except Github repo names / KDE.org folder names must be maintained by maintainers.

Trusted. AppImage format is ideal for upstream packaging, which means that you get the software directly from the original author(s) without any intermediaries, exactly in the way the author(s) intended. And quickly. Source: https://appimage.org/

Well, authors could send their latest AppImage on CD/DVD and ship it quick. Or just provide URL. Direct URL to AppImage is better and totally universal, but structured HTML/API is also parsable (such as Github or download.kde.org) even without their explicit permission or knowledge or any recommended lists. Public repos with proper license can be scraped automatically, KDE.org for sure.

misog commented 2 years ago

Hmm, good point on kde.org as a standard publishing format.

To clear possible misunderstanding, I meant that kde.org could be also integrated with AppImage discovery tools such as Zap in AppImage ecosystem. Not that kde.org should be used as a standard.

In addition to Github.com with their search API, kde.org is freely scrapable/parsable and it could contain many AppImages in archives. It is the question of integration/scraper/parser maintenance (case 1) vs community list maintenance (case 2, case 3).

srevinsaju commented 2 years ago

Right, gotcha. Re: 1, GitHub search API for example, we do know that the search API has a rate limit. Would we want to ask the users to insert a GitHub API token?

misog commented 2 years ago

Here I researched the options:

  1. Github.com has API rate limits however they are sufficient. Because it takes just one (or three) API query/queries to search all Github.com for appimage string in description, readme.md or tags. And it looks like project description, stargazers_count, watchers_count, releases_url is a part of the response data: https://docs.github.com/en/rest/search#search-repositories Then one server could be set up which would scrape Github.com periodically (max every 6 seconds because rate limit for unauth. is 10req/min), cache or store content (in mysql, redis, json file, ...) somewhere (the server, AWS, CDN, ...). Then Zap users could just download data from there. I like the approach of apt-get update to download everything and then search local database instantly and it would also lower server load. To make it more usable, xxx search <package> could also fetch server data once when local copy is old and --no-update flag is not provided. However, Github.com repositories scraped this way are not yet confirmed to actually contain any AppImage files (ex. AppImage/appimage.github.io does not contain any AppImages but has the word appimage in description). So, just AppImage candidates are free to scrape and more scraping must be done to confirm if there are actually AppImage files in Github.com release section.
    i. REST API supports releases: https://docs.github.com/en/rest/releases (it needs some calculation of how many candidates can be checked, and how often, as an unauthenticated user)
    ii. But maybe it is not necessary to scrape this server-side; Zap could just fetch the Github releases itself (cURL/API) and parse them locally. If no AppImage files are found, then such a repository can be reported and investigated, and eventually included in a false-positive list or ignored list. I guess releases parsing is already implemented because Zap can install from a Github.com URL? But this could be a privacy issue: if Zap reports to a server which software was found to be a false positive, it reveals what that IP wanted to install. I would prefer to try to install a package and not report if it fails. Maybe Zap could ask the user to report a failed install; this would be fair.
    iii. Another approach would be to use a confirmed list of Github.com projects with AppImage releases. This could be provided by the Zap community or AppImage community. Zap could contain one official list. But it is important that packages which are not on the list should also be installable if they really contain AppImage files. This confirmed list is different from the recommended list mentioned before; it has a different purpose. A confirmed list is just a list of Github.com projects which are confirmed to have an AppImage release. A recommended list would contain the subset of Github which its author recommends to be aware of (for example to move it higher in search results, or to have a nice shiny badge; these should also be confirmed). Both list formats should be super simple, like a .txt containing author/repository separated with newlines. With a simple format, many community members can maintain their lists cheaply. This way even abandoned lists could be easily moved to other maintainers, just by copy/paste and with no special knowledge of JSON format or parsing tools. Ex. a graphic designer will super easily create a confirmed list of painting software found on Github. Eventually it will merge into a more popular list such as the Zap confirmed list or other large confirmed lists. Another example: a graphic designer will easily create a recommended list of vector editors found on Github. It can be shared on forums and other graphic designers could install the recommended list and discover new vector editors. In addition to the cheap maintenance/creation cost for community members, the maintenance cost for maintainers is also very low because such lists can be sorted and deduplicated, and the git diff is very readable. So the most expensive task is to actually check whether there are AppImage releases; this can be automated by a personal scraper using the maintainer's logged-in Github.com user.
  2. If Zap used Github.com user token then no problem with API rates. But this could be worse user experience because it could be slower to use it in comparison to pre-cached data. Also not everybody has Github.com account so that would not be very accessible.
  3. Third option is to use Github.com Search API by Zap with no Github.com credentials. Could work however I am not sure how Github counts usage, so organizations (ex. schools) with a single IP address could hit API rate limit quickly.
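The two Github.com steps discussed in these options (find candidates, then confirm that a release really contains an AppImage) could be sketched as below. No request is sent here; the code only builds the search URL and inspects an already-decoded release object. The function names and the choice of search qualifiers are assumptions, and the exact qualifier syntax should be checked against the search documentation. The `assets`, `name` and `browser_download_url` fields follow the releases REST API response.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def build_search_url(term: str = "appimage",
                     fields: str = "description,readme",
                     per_page: int = 100,
                     page: int = 1) -> str:
    """Build a repository-search URL for AppImage candidates, most-starred first."""
    query = {
        "q": f"{term} in:{fields}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
        "page": page,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(query)}"


def appimage_assets(release: dict) -> list[str]:
    """Return download URLs of release assets whose file name ends in .AppImage.

    `release` is one decoded release object. An empty result marks the
    repository as a false positive (a candidate without AppImage files).
    """
    urls = []
    for asset in release.get("assets", []):
        if asset.get("name", "").lower().endswith(".appimage"):
            urls.append(asset["browser_download_url"])
    return urls
```

A candidate such as AppImage/appimage.github.io would match the search but yield an empty list from `appimage_assets`, which is exactly the false-positive case described above.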

I like 1.ii because the community can help with false-positives directly from Zap and no confirmed lists must be maintained and this could be automatized very well by Zap. But also I like recommended lists from community, they could be generalized to contain URL of Github.com repository, Gitlab repository, KDE.org package folder, ... But format of the list must be very simple to be successful (no JSON, no XML, ...). Thankfully we have a standard for location of AppImage parsable resources: https://datatracker.ietf.org/doc/html/rfc1738 and thankfully Github.com and others have structured responses. Also it would solve the issue of this thread - Self Hosted Repositories by simplifying repositories to recommended lists and using structured platforms for distribution.

Later, the problem of duplicate AppImages can emerge. For example, AppImage is distributed on Github.com as well as on KDE.org. This could be solved by just ignoring the problem and user could install two instances of the same AppImage release (multiple versions of a Linux app in 2022, anyone?). Or it could be solved again by community with duplicate lists or comparing release files on server or in Zap.

misog commented 2 years ago

Hi, I encountered the OpenShot video editor. It has no mention of appimage but it contains appimage releases: https://github.com/OpenShot/openshot-qt So maybe custom lists of known appimage projects in a simple format are also a good idea.

Also Kdenlive has appimage releases https://files.kde.org/kdenlive/release/