Open diox opened 2 hours ago
There have been attempts to use a sanitized copy of prod on stage before, but that is a much larger scope. What stopped us was that we'd want to guarantee the data doesn't contain PII, as more people have access to stage than to prod.
Thoughts about creating more add-ons/versions:
Addon, Version and File instances
addon_factory()
The main differences are going to be on details like activity log (could be useful to have more of these on stage as well!).
We don't necessarily need to have XPIs attached to those versions, but it would be more realistic.
'file': ContentFile(raw_file_contents, name='addon.xpi') to the File creation
addon_factory() can support passing a file_kw={'filename': 'addon.xpi'}
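If we do want real files attached, the raw contents could be generated instead of copied from an existing add-on. An XPI is a ZIP archive with a manifest.json at its root, so a minimal sketch could look like the following. The helper name build_minimal_xpi, the manifest values and the example GUID are made up for illustration, and the result is unsigned (signing would still have to go through autograph):

```python
import io
import json
import zipfile


def build_minimal_xpi(guid, version="1.0"):
    """Return the bytes of a tiny, unsigned WebExtension XPI (a ZIP archive).

    Hypothetical helper: the manifest only carries the keys a basic
    extension needs, which may or may not be enough for stage.
    """
    manifest = {
        "manifest_version": 2,
        "name": f"Stage test add-on {guid}",
        "version": version,
        "browser_specific_settings": {"gecko": {"id": guid}},
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # manifest.json sits at the root of the archive
        archive.writestr("manifest.json", json.dumps(manifest))
    return buffer.getvalue()


# Example GUID is an assumption, not a real add-on
raw_file_contents = build_minimal_xpi("stage-test-1@example.invalid")
```

The resulting bytes are what would be handed over as raw_file_contents; whether the rest of the pipeline accepts such a bare manifest is something to verify on a dev instance first.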
At the very least the blocklist requires the File instances to have is_signed=True. The question of how realistic we want the data to be comes up again: should we actually sign those files with autograph? Perhaps having them go through auto_approve as normal?
Should there be a single author? Multiple ones?
Splitting this into the current short term need and the long term desire to fix this issue:
short term: I would bias for whatever is quickest. The bloom filter does not care about activity logs or actual XPIs, only that the data matches the query of signed versions with block_version records, so maybe a script creating some arbitrary data is good enough.
long term: I think we should have a strong bias for creating the data as closely as possible to how production creates it (via the API). In general we don't know what the requirements of a given form of testing are, so each step we deviate from production is another degree of freedom for our "stage tests pass" to be a false positive.
In terms of how to protect PII, this is a problem, but not a problem unique to us. I have seen in the past (for example)
I'm sure there are other alternatives to this as well. These are just ideas at this point but I think this is something we should prioritize as a tech improvement especially since we invest so much trust into testing things on stage. If stage doesn't catch bugs then what's the point?
OTOH: There are strong arguments against using staging environments at all, and this exact problem is one of them. If staging is "equivalent" to production, then bugs that would occur in production will also occur in staging. That only holds as long as the first part is true, and keeping it true is an increasingly difficult task, as we see here with this problem.
short term: I would bias for whatever is quickest. The bloom filter does not care about activity logs or actual XPIs, only that the data matches the query of signed versions with block_version records, so maybe a script creating some arbitrary data is good enough.
I'm worried about creating a mess that we would later have to clean up though...
If the files don't exist, then for instance you won't be able to download the XPIs, so code manager/assay links will no longer work, which could be confusing if QA (or us, 6 months from now) stumbles upon one of these add-ons. It could also cause a future data migration or task to fail in some unexpected way, etc.
In order to verify #15170, could we block a large set of themes, and later undo this?
This might make things more annoying, because we need to then unblock them, unreject their versions, but themes are manually approved and unrejecting doesn't set versions back to public, so this would mean we'd also need to re-approve them...
I'm leaning towards calling addon = addon_factory(version_kw={'channel': amo.CHANNEL_UNLISTED}, file_kw={'is_signed': True}) and block_factory(guid=addon.guid) a bunch of times.
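A rough sketch of what "a bunch of times" could look like as a loop. The function name create_blocked_addons is hypothetical, and the factories and the channel constant are passed in as arguments so the sketch makes no assumption about import paths; the keyword arguments are the ones quoted above:

```python
def create_blocked_addons(count, addon_factory, block_factory, unlisted_channel):
    """Create `count` signed, unlisted add-ons and one block per add-on guid.

    Hypothetical helper: `addon_factory` and `block_factory` are expected to
    be the existing test factories, `unlisted_channel` the value of
    amo.CHANNEL_UNLISTED.
    """
    guids = []
    for _ in range(count):
        addon = addon_factory(
            version_kw={"channel": unlisted_channel},
            file_kw={"is_signed": True},
        )
        # Block the add-on we just created, by guid
        block_factory(guid=addon.guid)
        guids.append(addon.guid)
    return guids
```

From a shell on stage this would be called with the real factories, e.g. create_blocked_addons(1000, addon_factory, block_factory, amo.CHANNEL_UNLISTED). It is worth checking on a small count first that the resulting rows actually match what the bloom filter query selects, and running it in batches rather than as a single million-iteration call.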
Description
Stage doesn't have enough blocked add-on versions to reproduce potential issues that prod could be facing. We should create more fake data on stage to address this: a million blocked extension versions on stage would be a good (arbitrary) target.
Acceptance Criteria
Checks