readthedocs / readthedocs.org

The source code that powers readthedocs.org
https://readthedocs.org/
MIT License

Merge to the master triggers stable builds (two!) #8992

Closed skirpichev closed 2 years ago

skirpichev commented 2 years ago

See the recent builds at https://readthedocs.org/projects/diofant/builds/, e.g. this build. Note that the merge commit was tagged as v0.14.0a1. I wouldn't expect any stable build (stable is the v0.13.0 tag) at all in this situation...

humitos commented 2 years ago

I think our code may be mistaking an alpha version, v0.14.0a1, for a stable version. However, the package we use follows PEP 440 (https://www.python.org/dev/peps/pep-0440/#pre-release-spelling), so this tag should not be interpreted as a stable version.

We will need to reproduce this issue and debug a little more to find out where this is happening.
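To make the expectation concrete, here is a minimal sketch of how a "stable" tag could be chosen while skipping pre-releases. It is not the Read the Docs implementation: the regular expression covers only a small subset of PEP 440 (an optional "v" prefix, numeric release segments, and an a/b/rc pre-release marker), and the names `parse_tag` and `pick_stable` are hypothetical.

```python
import re

# Simplified pattern: optional "v", release digits, optional pre-release (a/b/rc + number).
# This is NOT a complete PEP 440 parser (no epochs, post/dev releases, or alternate spellings).
_TAG = re.compile(r"^v?(?P<release>\d+(?:\.\d+)*)(?:(?P<pre>a|b|rc)(?P<num>\d+))?$")


def parse_tag(tag):
    """Return (release_tuple, is_prerelease), or None if the tag does not match."""
    m = _TAG.match(tag)
    if m is None:
        return None
    release = tuple(int(part) for part in m.group("release").split("."))
    return release, m.group("pre") is not None


def pick_stable(tags):
    """Return the tag with the highest final (non pre-release) version, or None."""
    finals = []
    for tag in tags:
        parsed = parse_tag(tag)
        if parsed is not None and not parsed[1]:
            finals.append((parsed[0], tag))
    return max(finals)[1] if finals else None


tags = ["v0.13.0a4", "v0.13.0", "v0.14.0a1"]
print(pick_stable(tags))  # prints v0.13.0
```

Under these assumptions, pushing v0.14.0a1 leaves v0.13.0 as the stable tag, so nothing about stable has changed.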

skirpichev commented 2 years ago

this should not be interpreted as a stable version

It seems it wasn't before. The last pre-release tag, I think, was v0.13.0a4. There was no stable build for that tag, unless I'm missing something.

stsewd commented 2 years ago

Note, that the merge commit was tagged as v0.14.0a1. I wouldn't expect any stable build (which is v0.13.0 tag) at all in this situation...

I don't see any v0.14 tags on your repo, and the build you linked (https://readthedocs.org/projects/diofant/builds/16251615/) says it built from https://github.com/diofant/diofant/commit/92766cf922e83185079b4411dfa633d7ed338dff, which is the v0.13.0 tag.

skirpichev commented 2 years ago

I don't see any v0.14 tags on your repo

Yes, sorry: the v0.14.0a1 tag was removed (the automated release creation was somehow broken).

says it built from https://github.com/diofant/diofant/commit/92766cf922e83185079b4411dfa633d7ed338dff, which is the v0.13.0 tag.

That's correct. But that build was apparently triggered by pushing the a1 tag. (The merge commit has the same timestamp as the pair of stable builds.)

skirpichev commented 2 years ago

@stsewd, just another example: pushing tag v0.14.0a2 produces two failing builds (1, 2).

humitos commented 2 years ago

@skirpichev I'm not sure what's the problem yet, but definitely something weird is happening here. Note that the second build you linked, https://readthedocs.org/projects/diofant/builds/16279777/, shows two different commits on the same page:

[Screenshot: build page showing two different commits]

Those commits should match. That is probably where the confusion between the two builds comes from.

skirpichev commented 2 years ago

92766cf922e83185079b4411dfa633d7ed338dff - the merge
92bba0c66e73c13903c28538b234155b6caaffdf - tag v0.13.0 (for the above merge commit)

I don't know why the rtd shows the tag sha in the build log, but the main problem (in my opinion) is that the stable build is triggered. It shouldn't.

krassowski commented 2 years ago

I am not sure if this is related, but for some time now every single push to any pull request (and possibly to master too?) generates dozens of builds for me on https://readthedocs.org/projects/jupyterlab-lsp/ across versions. This leads to dozens of spam mails about failures and to hitting the concurrent job limit, which leads to more failures. I don't think that a push should trigger a rebuild of all versions. Is this connected, or should I open a new issue?

stsewd commented 2 years ago

I'll try to debug this a little more and see if they are related.

stsewd commented 2 years ago

I took a look at this today.

For https://readthedocs.org/projects/diofant/, it looks like you created the GitHub webhook "manually" (without connecting your account), and it is subscribed to both push and create events. Since RTD didn't create the webhook, it doesn't know about the double subscription. Both events fire when a branch or tag is created, so each new tag triggers two "syncs", and for some reason each sync operation triggers a build of stable. You should be able to avoid one of the two stable builds by removing the "create" event from your webhook on GitHub.

Why does a sync trigger a build of stable? It looks like the tags are being re-created, which makes it appear as if a new stable version was created. You can see this at https://readthedocs.org/projects/diofant/versions/, where a lot of tags are suffixed with a letter (this happens when more than one version has the same name).
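As an illustration of the double trigger, here is a minimal sketch of a webhook handler that counts how many syncs one new tag would cause. The `ref_type` field of the create payload and the `created` flag of the push payload are real GitHub payload fields; the rule in `needs_sync` and both function names are assumptions made for this example, not the actual Read the Docs handler.

```python
def needs_sync(event, payload):
    """Decide whether a webhook delivery would trigger a version sync (hypothetical rule)."""
    if event == "create":
        # A create event is sent when a branch or tag is created.
        return payload.get("ref_type") in ("branch", "tag")
    if event == "push":
        # A push payload carries a boolean `created` flag when the push creates a new ref.
        return bool(payload.get("created"))
    return False


def count_syncs(deliveries):
    """Count the deliveries that would each start a sync."""
    return sum(1 for event, payload in deliveries if needs_sync(event, payload))


# Deliveries for one new tag when the hook is subscribed to both events.
both = [
    ("create", {"ref": "v0.14.0a1", "ref_type": "tag"}),
    ("push", {"ref": "refs/tags/v0.14.0a1", "created": True}),
]
# The same tag when the hook is subscribed to push only.
push_only = [d for d in both if d[0] == "push"]

print(count_syncs(both), count_syncs(push_only))  # prints 2 1
```

With only the push event subscribed, the same tag produces a single sync in this sketch, which matches the suggestion to drop the "create" event.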

For https://readthedocs.org/projects/jupyterlab-lsp/, it looks like you have an automation rule that activates all newly created tags. For some reason the tags are being re-created, so the rule activates each tag again. You can see this at https://readthedocs.org/projects/jupyterlab-lsp/versions/, where a lot of versions are activated and suffixed with a letter (this happens when more than one version has the same name).

Both problems point to something wrong in our "sync versions" code, which is re-creating the tags (it doesn't look like it duplicates the branches).

skirpichev commented 2 years ago

looks like you created the github webhook "manually" (without connecting your account)

(I doubt it. But this hook was created a long time ago, so I can't refute your guess...)

Well, here is what I did just now:
1) removed the webhook on the GH side;
2) removed the GH integration on the RTD side;
3) set up the GH incoming webhook again.

For some reason I now see "The project diofant doesn't have a valid webhook set up, commits won't trigger new builds for this project. See the project integrations for more information." Not sure what this means: I did everything in the suggested docs page section (Integration Creation), and it seems the webhook was created on the GH side (with the correct URL, etc). UPD: after several minutes this message disappeared.

You should be able to reduce the trigger of one stable by removing the "create" events from your webhook on github.

@stsewd, well, I did an alpha release again. There are two stable builds again: 1st and 2nd.

Looks like the tags are being re-created, that makes it appear like a new stable version was created, you can see this at https://readthedocs.org/projects/diofant/versions/ where there are a lot of tags suffixed with a letter (this happens when there are more than 1 version with the same name).

Hmm, could you point to a specific wrong example on that page? All the mentioned tags (like v0.1.2 and v0.1.2a3) are correct and describe unique versions (conforming to PEP 440; "v" is a common prefix for GH tags). Also, most tags (i.e. the latest ones, v0.14.0a3, etc.) are signed with my GPG key. Do you mean that there is some RTD-internal notion of "tag" (not the same as the git tag), which was re-created?

stsewd commented 2 years ago

Do you mean that there is some internal to the RTD notion of "tag" (not same as the git tag), which was re-created?

Yes, RTD versions, not your repo's versions (which are mapped to RTD versions). But it looks like I got confused: you do have tags that have letters as suffix.

skirpichev commented 2 years ago

you do have tags that have letters as suffix

Yep, it's PEP 440.

stsewd commented 2 years ago

Found the problem. This is because git-python returns the hash of the "bare" tag, but when using git ls-remote we are taking the hash of the annotated tag. That mismatch on the hash makes it look like the version was updated, so we are re-building it.

We are using git-python when doing a full build, and ls-remote when doing a sync.
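The mismatch can be seen in the output of `git ls-remote --tags` itself: for an annotated tag, git lists the hash of the tag object under `refs/tags/<name>` and the hash of the commit it points to under the "peeled" entry `refs/tags/<name>^{}`. The sketch below is not the Read the Docs code; `parse_ls_remote_tags` is a hypothetical helper that prefers the peeled entry so that annotated and lightweight tags both resolve to a commit hash. The sample text reuses the two hashes quoted earlier in this thread for v0.13.0.

```python
def parse_ls_remote_tags(output):
    """Map tag name -> commit hash, preferring the peeled ("^{}") entry when present."""
    tag_objects = {}
    peeled = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        if name.endswith("^{}"):
            # Peeled entry: the commit that an annotated tag points to.
            peeled[name[:-3]] = sha
        else:
            # Plain entry: the tag object (annotated) or the commit (lightweight).
            tag_objects[name] = sha
    return {name: peeled.get(name, sha) for name, sha in tag_objects.items()}


# Sample output for one annotated tag, in the shape `git ls-remote --tags` prints.
sample = (
    "92bba0c66e73c13903c28538b234155b6caaffdf\trefs/tags/v0.13.0\n"
    "92766cf922e83185079b4411dfa633d7ed338dff\trefs/tags/v0.13.0^{}\n"
)

print(parse_ls_remote_tags(sample))
```

Taking the first line alone would yield 92bba0c..., the tag object, while a full clone resolves the tag to the commit 92766cf...; comparing those two values makes an unchanged tag look updated.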

stsewd commented 2 years ago

https://readthedocs.org/projects/jupyterlab-lsp/ doesn't have annotated tags, so I'm still checking what's happening there.

krassowski commented 2 years ago

Some context on jupyterlab-lsp repo:

stsewd commented 2 years ago

Found the problem for that one; it's related to the same problem: https://github.com/readthedocs/readthedocs.org/pull/9019. We may just disable the "lsremote" feature while we fix the issues.

stsewd commented 2 years ago

I have disabled the lsremote feature until we fix it, so you should no longer experience these problems (it could take effect after one resync/build, depending on the state the versions were in).

@krassowski let me know if you want me to deactivate all those versions created with a letter suffix for you.

stsewd commented 2 years ago

We have re-enabled the lsremote feature. Everything should be fixed now, but let us know if the problems come back!