pulsar-edit / package-backend

Pulsar Server Backend for Packages
https://api.pulsar-edit.dev
MIT License

[BUG] Publishing failing because it's trying to read package data from the default branch #205

Closed · mauricioszabo closed this issue 1 week ago

mauricioszabo commented 8 months ago

Is this Bug Present in the upstream API Server?

What is the Bug

When publishing a package, we try to read package.json from the default branch, which means that if I publish a package from a specific tag, the backend will not read package.json from that tag.

This breaks setups like monorepos (for example, https://github.com/mauricioszabo/star-ring), where there is no root package.json on the default branch, only on specific tags.
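
For illustration, a minimal sketch of the failure mode (a hypothetical standalone check, not the backend's actual code path; assumes Node 18+ with a global fetch):

```javascript
// Without a `ref`, the GitHub contents endpoint reads from the default
// branch, where star-ring has no root package.json.
const url =
  "https://api.github.com/repos/mauricioszabo/star-ring/contents/package.json";

const onDefaultBranch = await fetch(url);
console.log(onDefaultBranch.status); // 404: no package.json on the default branch

// With `ref` set to a tag, the same endpoint finds the file.
const tag = encodeURIComponent("generic-lsp@2024.02.10-04");
const onTag = await fetch(`${url}?ref=${tag}`);
console.log(onTag.status); // 200: package.json exists at the tag
```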

How to Replicate the Bug

  1. Create a package whose default branch does not include a package.json
  2. Branch from the default branch, and commit a package.json there
  3. Create a tag on that branch
  4. Try to publish the tag
  5. The publish will fail, even if all required fields are present
confused-Techie commented 5 months ago

I know this issue is older, but it is something I've been looking into.

After playing with a few different ideas on how we could do this, I think the approach below is probably the best one:

We use the contents REST API endpoint, passing a ref parameter taken from the tags endpoint's return.

Meaning, for the star-ring example, we would first need to get the tag data:

https://api.github.com/repos/mauricioszabo/star-ring/tags

```json
[
    {
        "name": "generic-lsp@2024.02.10-04",
        "zipball_url": "https://api.github.com/repos/mauricioszabo/star-ring/zipball/refs/tags/generic-lsp@2024.02.10-04",
        "tarball_url": "https://api.github.com/repos/mauricioszabo/star-ring/tarball/refs/tags/generic-lsp@2024.02.10-04",
        "commit": {
            "sha": "ef9aed5d82df0825453e9a3754d9b85023be6bdb",
            "url": "https://api.github.com/repos/mauricioszabo/star-ring/commits/ef9aed5d82df0825453e9a3754d9b85023be6bdb"
        },
        "node_id": "REF_kwDOKQyg_toAI3JlZnMvdGFncy9nZW5lcmljLWxzcEAyMDI0LjAyLjEwLTA0"
    },
    {
        "name": "generic-lsp@2023.09.08-00",
        "zipball_url": "https://api.github.com/repos/mauricioszabo/star-ring/zipball/refs/tags/generic-lsp@2023.09.08-00",
        "tarball_url": "https://api.github.com/repos/mauricioszabo/star-ring/tarball/refs/tags/generic-lsp@2023.09.08-00",
        "commit": {
            "sha": "183dbc7f52dae7357d25da8019ab890cea2130ee",
            "url": "https://api.github.com/repos/mauricioszabo/star-ring/commits/183dbc7f52dae7357d25da8019ab890cea2130ee"
        },
        "node_id": "REF_kwDOKQyg_toAI3JlZnMvdGFncy9nZW5lcmljLWxzcEAyMDIzLjA5LjA4LTAw"
    },
    {
        "name": "generic-lsp@2023.06.09-16",
        "zipball_url": "https://api.github.com/repos/mauricioszabo/star-ring/zipball/refs/tags/generic-lsp@2023.06.09-16",
        "tarball_url": "https://api.github.com/repos/mauricioszabo/star-ring/tarball/refs/tags/generic-lsp@2023.06.09-16",
        "commit": {
            "sha": "093420dc23e3d2b4e6d8d9e157acd983330072cf",
            "url": "https://api.github.com/repos/mauricioszabo/star-ring/commits/093420dc23e3d2b4e6d8d9e157acd983330072cf"
        },
        "node_id": "REF_kwDOKQyg_toAI3JlZnMvdGFncy9nZW5lcmljLWxzcEAyMDIzLjA2LjA5LTE2"
    }
]
```

From here we would then use the name value within each tag object to collect the contents:

https://api.github.com/repos/mauricioszabo/star-ring/contents/package.json?ref=generic-lsp@2024.02.10-04
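
Putting both steps together, a rough sketch of the flow (hypothetical helper, assuming Node 18+ fetch; the real implementation would go through the backend's existing GitHub wrapper with auth, pagination, and error handling):

```javascript
// Proposed two-step flow: list tags, then read package.json at each tag.
async function getPackageJsonForTags(owner, repo) {
  const api = "https://api.github.com";

  // Step 1: list the repository's tags.
  const tagsRes = await fetch(`${api}/repos/${owner}/${repo}/tags`);
  const tags = await tagsRes.json();

  // Step 2: read package.json at each tag via the contents endpoint's
  // `ref` parameter.
  const results = [];
  for (const tag of tags) {
    const ref = encodeURIComponent(tag.name);
    const res = await fetch(
      `${api}/repos/${owner}/${repo}/contents/package.json?ref=${ref}`
    );
    if (!res.ok) continue; // the tag may predate package.json entirely

    // The contents endpoint returns the file base64-encoded.
    const file = await res.json();
    const pack = JSON.parse(
      Buffer.from(file.content, "base64").toString("utf8")
    );
    results.push({ tag: tag.name, pack });
  }
  return results;
}
```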

This gets us the same return as our previous usage of the contents endpoint, but it has pretty large implications for the existing system here.

Currently, when the user attempts to publish a package, we get all the data in discrete steps:

But to make this work, so that we collect all data from the tag instead of the branch, the flow would instead need to look like:

This would actually solve the issue of us having to "fake" version data for previous versions during first-time publication, since on first publication we publish all previous versions of the package (as determined by the tags). So that would be a net benefit.

Although it does introduce a new issue: we would need to store even more tag data on each version to allow the feature detection checks to work after the fact, since feature detection also works by reading the contents of the repository. These checks happen after publication has already returned to the user, so that they don't have to wait on extra steps like this. So we would need to store the name of each tag so that we can collect its information later during feature detection.

Luckily there was some foresight when creating the DB schema: each version entry has a freeform meta column that accepts JSON. The tag name could be added there, avoiding the longer process of updating the live database schema.
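
For illustration, the stored row might look something like this (a hypothetical shape; the actual contents of the meta column would be up to the implementation):

```javascript
// Hypothetical shape of a version row, using the freeform `meta` JSON
// column to record the tag so feature detection can re-resolve it later.
const versionRow = {
  semver: "2024.2.10",
  meta: {
    tag: "generic-lsp@2024.02.10-04", // ref to pass to the contents endpoint
    sha: "ef9aed5d82df0825453e9a3754d9b85023be6bdb"
  }
};
```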


Anyone reading this may think the obvious, simpler alternative would be to download the tarball of the data and read it locally, and I am still partially considering that. Except that we would need to add dependencies to be able to read or extract the tar data, and we would then have to be very careful that all read and write behavior stays within /tmp, since the GCP App Engine containers we use are otherwise read-only.
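
A minimal sketch of that alternative, assuming the node-tar package (not currently a dependency here) and confining all filesystem access to /tmp:

```javascript
// Tarball alternative: download to /tmp, read package.json from the
// archive without extracting the whole tree to disk.
const fs = require("fs/promises");
const path = require("path");
const tar = require("tar");

async function readPackageJsonFromTarball(tarballUrl) {
  const res = await fetch(tarballUrl);
  const tmpFile = path.join("/tmp", `pkg-${Date.now()}.tar.gz`);
  await fs.writeFile(tmpFile, Buffer.from(await res.arrayBuffer()));

  let contents = null;
  // List entries and capture the top-level package.json; GitHub tarballs
  // prefix every entry with an `owner-repo-sha/` directory.
  await tar.t({
    file: tmpFile,
    onentry: (entry) => {
      if (/^[^/]+\/package\.json$/.test(entry.path)) {
        const chunks = [];
        entry.on("data", (c) => chunks.push(c));
        entry.on("end", () => {
          contents = JSON.parse(Buffer.concat(chunks).toString("utf8"));
        });
      }
    },
  });

  await fs.unlink(tmpFile); // clean up /tmp
  return contents;
}
```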


Otherwise, this is mostly a note to myself about what should be done, so it's not forgotten after some research. But if anyone has ideas, feel free to contribute; there's no need to, though.