Open camerojo opened 5 months ago
Fix 1 involved deleting visa job check records which were causing the SQL to return multiple values when just one was expected. The deleted records were stored in this TSV file - changed to txt because GitHub does not support uploads of tsv files. It has the candidate_visa_job_checks that we deleted to get prod to start.
tctalent_public_candidate_visa_job_check_6.txt
That helped - but the Flyway was still failing.
Fix 2. There was other bad data which caused the update to fail: multiple visa job checks for the same job. We executed queries to identify the bad data, then manually deleted the bad entries using the TC.
I reimported the deleted visa job checks from the above dump file to see if I could fix the data.
There was just one multiple value being returned from update select.
It turns out that this last problem was not due to bad data but a problem in the update SQL.
The problem came from the fact that a candidate can have more than one visa check (NOT visa JOB check) carried out based on country. For example a candidate may have been checked for basic eligibility to both Australia and Canada. For example see https://tctalent.org/admin-portal/candidate/76864
In that the case the Update SQL from Flyway will fail because multiple values will be returned in the select - one for the visa check for Australia and one for the visa check for Canada. The correct update SQL needs to add an extra restriction to the select's where clause specifying that the visa check country should match the job country.
update candidate_opportunity co set relocating_dependant_ids = (select relocating_dependant_ids from candidate_visa_job_check cvjc join salesforce_job_opp jo on cvjc.job_opp_id = jo.id join candidate_visa_check cvc on cvjc.candidate_visa_check_id = cvc.id join candidate c on cvc.candidate_id = c.id where co.candidate_id = c.id and co.job_opp_id = cvjc.job_opp_id --> and jo.country_object_id = cvc.country_id); <--
I manually ran the above update against the production database (with the previously deleted data reimported) and it ran without error.
@cazcam34 @sadatmalik
There were actually two reasons that the Flyway failed:
Forgot to add password for new email noreply@talentcatalog.net to prod Task deployment. Should have been in these release planning notes but I forgot to add it.
Updated now in prod task definition (40) - and restarted prod.
Two showstopper bugs surfaced today which have been fixed and pushed into production.
Candidate synch started:
For 119469 candidates:
Process 1 - 598 pages:
This is the last log entry - for page 597:
We seem to be missing the last page and the summary log message at the end of the sync.
These are the 10,000 we tested with - they completed fully with the corresponding log completion:
This is the scheduled sync which seems to have stopped at page 597 of 598:
Found the culprit - Out of Heap space:
As the tech team we need to capture somewhere special manual tasks that have to be carried out for a release so that we don't forget
Typically we will carry out these tasks progressively over time on staging, but all at once in production.
We should add the tasks as a task list as we carry them out on staging, then when we release into production, we just copy the task list, and mark them off as we complete them on production.
This type of issue starts in Review/QA so that we can always easily find it. It is only moved to Done when the production release is completed.
Staging task list
Post-release:
New Translations
Cloud Cache
Build Elasticsearch Index with Infra scripts
Run SystemAdminAPI fix_null_case_sfids
Salesforce tidy up
See #1335
Instructions: Production task list
To be copied from staging task list above, then worked through during the production release. Translation text can be copied from Staging.
Production task list
Post-release:
sf-update-candidates/200-0-50
Cloud Cache