Closed agjohnson closed 3 months ago
For now, to get resolve this issue with pagerduty, I've repointed the project to my fork and disabled conda for the version. This is effectively not testing build-large for latency.
To revert my change, change the project repo back to this repo.
However, there might still be build errors with conda giving a 403 response on conda env create
.
@humitos noted a temporary fix here for the Conda URL. That would at least get us back to testing build:large latency at least. Multiple projects or some other hack to enforce build:large queue could be a secondary goal -- at least we don't have room for non-deterministic behavior that way.
I re-pointed the Read the Docs project to this repository and created #11. That PR fixes the Conda problem and start using build-large
again to build. This should fix the problem that we had.
Note that this fix does not create one Read the Docs project per queue as we talked. There is pros and cons in both approaches. The current one, also "test" that the task router works 😄
I think we are OK with the current solution. I don't think there is too much value at this point to split it in multiple projects. I'm de-prioritizing this work and removing it from my sprint. We can come back if we think it's strictly required, tho.
@humitos is this what you fixed this week with it being busted on .com?
@ericholscher Well, I solve the problem we had with the current implementation.
However, in this issue Anthony is suggesting to change the implementation to have multiple time-test
projects on the same platform. So, we can say they are different things.
Oh yea, I see, so a latency-check-build-default
and latency-check-build-large
project for example, and mark them explicitly 👍
Aye, the bug I hit originally was Conda failed randomly (I think a URL changed) and our build-large latency check broke as a side effect. It took some scrambling to figure out what the configuration was meant to be.
I re-pointed the Read the Docs project to this repository and created https://github.com/readthedocs/time-test/pull/11.
I agree this is close enough for now though. It should avoid some of the failures we've seen at Conda, and if this does pop back up, perhaps that's the point we can think about splitting up our projects.
We don't build-large
anymore, so this issue is a non-issue at this point. I'm closing it, but feel free to re-open.
Instead of using conda to trigger building on build-large, it might be better to have two separate time-test projects on RTD. Both would have the queue set on the project level, one pinned to the build-default queue, one pinned to the build-large queue.
This would avoid paging us on unrelated errors with conda.
Note: the background here is that an unrelated failure with Conda causes our latency checks to fail and I was paged for this. If testing Conda isn't important here, using separate projects seems like a great approach.
Changes: