percy / python-percy-client

[Deprecated] Python client library for visual regression testing with Percy.
https://percy.io/docs/clients/python/selenium
MIT License
21 stars 24 forks source link

don't use CIRCLE_WORKFLOW_WORKSPACE_ID for parallel nonce anymore #64

Open caffodian opened 5 years ago

caffodian commented 5 years ago

First off - thanks for the integration. Percy has been super useful to us and has helped us be more proactive in finding visual differences.

I've been dealing with a maddening issue for a few weeks and I think I have a solution...

When running tests in parallel on CircleCI (such as using the nose-parallel plugin,) retrying a job will result in failures such as follows:

HTTPError: 409 Client Error: Conflict for url: https://percy.io/api/v1/builds/0000000/snapshots/

or requests.exceptions.HTTPError: 409 Client Error: Conflict for url: https://percy.io/api/v1/builds/0000000/finalize

It appears that in retries, CIRCLE_WORKFLOW_WORKSPACE_ID does not actually change. I don't think we should be using it as a parallel nonce.

Without this fix, when we retry a selenium job on circleci, all our percy snapshots will fail. We have to do a strange workaround of making an extra commit just to get the snapshots to be submitted properly.

thanks!

caffodian commented 5 years ago

I've done some manual testing and it seems like CIRCLE_WORKFLOW_WORKSPACE_ID is always preserved on retried jobs, while the BUILD_NUM is new. This seems to resolve the issues, at least within our project... I haven't been able to find any useful supporting documentation, however.

caffodian commented 5 years ago

Any chance anyone could take a look at this? It's pretty disruptive to have to create dummy commits whenever we want to just retry a circleCI job. Thanks!

fotinakis commented 5 years ago

hey @caffodian thanks for the report -- I can give you a couple workarounds here, but we'll need to debate a bit more internally about actually switching this across all our SDKs (see some history in https://github.com/percy/python-percy-client/pull/45#issuecomment-431463650)

Workaround 1: set the PERCY_PARALLEL_NONCE environment variable dynamically in your script runner to whatever you want. For example, PERCY_PARALLEL_NONCE=$CIRCLE_BUILD_NUM. This will override the use of CIRCLE_WORKFLOW_WORKSPACE_ID.

Workaround 2: use our CircleCI orb and "finalize all" mode: https://docs.percy.io/docs/circleci#section-parallelized-tests-configuring-your-circleci-orb

caffodian commented 5 years ago

Thanks for the response. I saw some of the debate, but it seemed that perhaps the wrong conclusions were drawn... or maybe the rebuild case wasn't considered? I'm also quite frustrated at the lack of documentation on what the correct env variable would be.

Workaround 2: use our CircleCI orb and "finalize all" mode: https://docs.percy.io/docs/circleci#section-parallelized-tests-configuring-your-circleci-orb

This second workaround doesn't actually fully work - if the build was finalized, and then you retry it, all snapshotting tests will still get a 409, because a new build number is still not generated.

The first workaround might work, but in our case would need some fudging because we'd ideally want to keep our script runner the same, but run it with different settings through the circleci config. Then because you can't really create new env variables that reference the circleci built in ones, it gets really messy. I could possibly manually set this env variable for all jobs that actually do things with Percy, but it's a bit gross of a workaround. I'll think more on this.

thanks again! Hope there's a better solution in the future

caffodian commented 5 years ago

Bump - do you have any advice on why the orb didn't actually fix my issue above?

Robdel12 commented 4 years ago

👋 Hey @caffodian just noticing this PR! If this is still an issue (I hope not!) the circle env var you'll want to use is CIRCLE_WORKFLOW_ID. That will change on rebuilds also

Robdel12 commented 4 years ago

If you wanted to rebase this PR, put the code back for CIRCLE_WORKFLOW_WORKSPACE_ID and change it to CIRCLE_WORKFLOW_ID I'll merge & release 👍

caffodian commented 4 years ago

If you wanted to rebase this PR, put the code back for CIRCLE_WORKFLOW_WORKSPACE_ID and change it to CIRCLE_WORKFLOW_ID I'll merge & release 👍

Thanks! I'll give it a try with our setup first - we use a parallel nose setup on circleci so it'll fail pretty loudly if this doesn't behave the same.

caffodian commented 4 years ago

I just gave it a check, and this doesn't actually work. In fact, with the parallel setup, this is actually worse. It seems that:

CIRCLE_WORKFLOW_ID is shared across parallel builds. (the parallelism setting in .circleci/config.yml), and also in different jobs within the same workflow.

Therefore, using it as the parallel nonce has the effect where (if you have that setup,) only the first of a job within the workflow can actually complete.

In our case, we have two Selenium jobs within the same workflow, each with a paralleism of 4 - once the first parallel job finishes, none of the others can write snapshots, getting conflict errors etc.

Robdel12 commented 4 years ago

once the first parallel job finishes, none of the others can write snapshots,

Hmm, this suggests the parallel total is incorrect. What is PERCY_PARALLEL_TOTAL set to? The easiest way to configure the total is to set it to -1 and then add a step after all tests have completed to finalize the Percy build: npx percy finalize --all (or use the CircleCI orb, which does exactly that: https://circleci.com/orbs/registry/orb/percy/agent)