Closed ArogeG closed 1 year ago
Debugging screenshot in Prod RDS database:
https://app.zenhub.com/files/486378283/3f084673-3e02-49ea-bebb-653a49a2dcfd/download
Added prod log from Basil. https://app.zenhub.com/files/486378283/81d5ecb1-790e-4473-8fbc-144343c120d2/download
From the log and prod data: before creating a user, the backend checks whether the user already exists, based on 'user_type_code' (B or I) and 'user_name' (ilike, case insensitive). For this case, the production database query below shows 2 fam_user records ("DVALK" and "DValk"), so the backend returns an error because the check found more than 1 user with "the same" name and type.
famdb=> select * from fam_user where UPPER(user_name)='DVALK';
 user_id | user_guid                        | cognito_user_id                                                   | user_name | create_user                                     | create_date                   | update_user   | update_date            | user_type_code
---------+----------------------------------+-------------------------------------------------------------------+-----------+-------------------------------------------------+-------------------------------+---------------+------------------------+----------------
     552 | FDFEE614672C4B56A3E168CD91C790F7 | prod-bceidbusiness_fdfee614672c4b56a3e168cd91c790f7@bceidbusiness | DValk     | fam_proxy_api                                   | 2023-04-12 00:00:00+00        | fam_proxy_api | 2023-04-12 00:00:00+00 | B
     551 |                                  |                                                                   | DVALK     | prod-idir_7f27542b4f1a45478f90b967ff34d74b@idir | 2023-04-12 18:07:59.618679+00 |               |                        | B
(2 rows)
CONSTRAINT fam_usr_uk UNIQUE (user_type_code, user_name)
This constraint is case sensitive, so users with the same user_type_code can have the same name in different case ("DVALK", "DValk").
In this case, auth-lambda could not find and update the first user with the upsert below, so a second record was added:
INSERT INTO app_fam.fam_user
...
...
ON CONFLICT ON CONSTRAINT fam_usr_uk DO
UPDATE SET user_guid = {user_guid}, cognito_user_id = {cognito_user_id};
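To make the mismatch above concrete, here is a minimal sketch that reproduces it. It is not the FAM code: an in-memory SQLite table stands in for app_fam.fam_user, only a few columns are kept, and the order of the two writes (admin grant first, login second) and the placeholder guid and cognito values are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the Postgres fam_user table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fam_user (
        user_id INTEGER PRIMARY KEY,
        user_type_code TEXT NOT NULL,
        user_name TEXT NOT NULL,
        user_guid TEXT,
        cognito_user_id TEXT,
        UNIQUE (user_type_code, user_name)  -- case sensitive, like fam_usr_uk
    )
    """
)

# Write 1: a record is created with the name typed in upper case.
conn.execute(
    "INSERT INTO fam_user (user_type_code, user_name) VALUES (?, ?)",
    ("B", "DVALK"),
)

# Write 2: a login-style upsert arrives with different casing.
# "DValk" does not conflict with "DVALK", so this inserts instead of updating.
conn.execute(
    """
    INSERT INTO fam_user (user_type_code, user_name, user_guid, cognito_user_id)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (user_type_code, user_name) DO UPDATE
    SET user_guid = excluded.user_guid,
        cognito_user_id = excluded.cognito_user_id
    """,
    ("B", "DValk", "example-guid", "example-cognito-id"),
)

# The case-insensitive existence check now finds two users.
matches = conn.execute(
    "SELECT user_name FROM fam_user "
    "WHERE user_type_code = ? AND UPPER(user_name) = UPPER(?) "
    "ORDER BY user_id",
    ("B", "dvalk"),
).fetchall()
print(matches)  # [('DVALK',), ('DValk',)]
```

The two code paths disagree on what "the same user" means, which is why the existence check later reports more than one match.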
Need to discuss possible solutions for the two issues in this problem:
fam_usr_uk constraint:
Likely we need the code fix first and then apply the data fix, so that when the user logs in again the auth function works correctly and does not add a second user again.
Note, my thoughts are:
The cognito_user_id column links to the current PROD Cognito user pool's user, and this PR's changes require re-creating the user pool (not only does the user pool id change, each user's cognito_user_id will also be different).
cognito_user_id looks something like prod-bceidbusiness_fdfee614672c4b56a3e168cd91c790f7@bceidbusiness, with what appears to me to be a random hash, so it is unlikely we can build a generic Flyway migration to fix the data for all users after rolling out the BCSC integration PR #534, unless this ticket's code fix goes in first to prevent bad data.
There is a delete_fam_user endpoint from earlier, but it is currently commented out:
@router.delete("/{user_id}", response_model=schemas.FamUser)
def delete_fam_user(user_id: int, db: Session = Depends(database.get_db)):
    """
    Delete a FAM user
    """
So what we probably need is to add an appropriate security-level check (role), make sure it works and leaves a log trace (for now, or are there other considerations?), and use it for emergency fixes to the 'fam_user' table only.
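As a rough illustration of the role check and log trace described above, here is a sketch of a guarded delete. It is not the FAM endpoint: it uses an in-memory SQLite table instead of the real database session, and the role name FAM_ADMIN, the logger name, and the function name delete_fam_user_guarded are all hypothetical.

```python
import logging
import sqlite3
from typing import Iterable

logger = logging.getLogger("fam.user_admin")  # hypothetical logger name
REQUIRED_ROLE = "FAM_ADMIN"  # hypothetical role name, not from the FAM codebase


def delete_fam_user_guarded(
    conn: sqlite3.Connection,
    user_id: int,
    requester: str,
    requester_roles: Iterable[str],
) -> bool:
    """Delete one fam_user row if the requester holds REQUIRED_ROLE.

    Every attempt is logged. Returns True if a row was deleted,
    False if no row with that user_id exists.
    """
    if REQUIRED_ROLE not in set(requester_roles):
        logger.warning(
            "delete_fam_user denied: requester=%s user_id=%s", requester, user_id
        )
        raise PermissionError(f"{requester} may not delete fam_user records")

    row = conn.execute(
        "SELECT user_type_code, user_name FROM fam_user WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    if row is None:
        logger.info(
            "delete_fam_user: requester=%s user_id=%s not found", requester, user_id
        )
        return False

    conn.execute("DELETE FROM fam_user WHERE user_id = ?", (user_id,))
    logger.info(
        "delete_fam_user: requester=%s deleted user_id=%s (%s, %s)",
        requester, user_id, row[0], row[1],
    )
    return True


# Stand-in table holding the two duplicate records from this issue.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fam_user "
    "(user_id INTEGER PRIMARY KEY, user_type_code TEXT, user_name TEXT)"
)
conn.executemany(
    "INSERT INTO fam_user VALUES (?, ?, ?)",
    [(551, "B", "DVALK"), (552, "B", "DValk")],
)

# Which duplicate to remove is a separate decision; 551 is only an example.
deleted = delete_fam_user_guarded(conn, 551, "example-admin", ["FAM_ADMIN"])
remaining = conn.execute("SELECT user_id FROM fam_user").fetchall()
```

Keeping the check and the logging inside one function means a denied attempt leaves a trace as well, which may help if this is only used for emergency fixes.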
Just adding some notes:
In the future, if we integrate with the IDP web service lookup, we could always store the IDP username to prevent this issue from happening again.
I don't understand these comments - the user name must be identical in both scenarios (or at least matched to the same user record), or FAM won't work as intended.
Scenario 1: User John Smith logs into FAM. Their IDIR is supplied as JSmith. A user record is created (if it doesn't already exist), or is updated, matching on the user id (and domain).
Scenario 2: An admin grants access to user John Smith. They likely type JSMITH or jsmith for the user id. A user record is created (if it doesn't already exist), and an access assignment is created linking to the user identified by the user id (and domain).
FAM must ensure that the user record in scenario 1 and scenario 2 is the same, irrespective of the order in which they occur.
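The requirement above could be sketched as a single find-or-create helper that both scenarios share and that matches the name case-insensitively. This is an illustration only, assuming a simplified fam_user table in SQLite; the helper name find_or_create_user and the placeholder guid are hypothetical.

```python
import sqlite3
from typing import Optional


def find_or_create_user(
    conn: sqlite3.Connection,
    user_type_code: str,
    user_name: str,
    user_guid: Optional[str] = None,
) -> int:
    """Return the user_id for (type, name), matching the name case-insensitively.

    If a matching row exists it is reused (and its guid updated when one is
    supplied); otherwise a new row is inserted.
    """
    row = conn.execute(
        "SELECT user_id FROM fam_user "
        "WHERE user_type_code = ? AND UPPER(user_name) = UPPER(?)",
        (user_type_code, user_name),
    ).fetchone()
    if row is not None:
        if user_guid is not None:
            conn.execute(
                "UPDATE fam_user SET user_guid = ? WHERE user_id = ?",
                (user_guid, row[0]),
            )
        return row[0]

    cur = conn.execute(
        "INSERT INTO fam_user (user_type_code, user_name, user_guid) "
        "VALUES (?, ?, ?)",
        (user_type_code, user_name, user_guid),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fam_user (user_id INTEGER PRIMARY KEY, "
    "user_type_code TEXT, user_name TEXT, user_guid TEXT)"
)

# Scenario 2 first (admin types JSMITH), then scenario 1 (login as JSmith).
granted_id = find_or_create_user(conn, "I", "JSMITH")
login_id = find_or_create_user(conn, "I", "JSmith", user_guid="example-guid")
row_count = conn.execute("SELECT COUNT(*) FROM fam_user").fetchone()[0]
```

Both calls resolve to the same record regardless of order. A lookup followed by an insert can still race under concurrent requests, so a check like this would likely need a database-level uniqueness rule behind it.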
Yes. When a user is created through the FAM UI, we call this method to check whether there is an existing user; I think it is case insensitive now.
But when the user is created through the auth function, as Ian mentioned above, it relies on the constraint fam_usr_uk, which is case sensitive:
ALTER TABLE app_fam.fam_user ADD CONSTRAINT fam_usr_uk UNIQUE (user_type_code, user_name);
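One possible direction, not something decided in this thread, is to enforce uniqueness on the upper-cased name at the database level. The sketch below shows the idea with a unique expression index in SQLite as a stand-in for Postgres; the index name fam_usr_ci_uk is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fam_user (user_id INTEGER PRIMARY KEY, "
    "user_type_code TEXT NOT NULL, user_name TEXT NOT NULL)"
)
# Unique on (type, upper-cased name) rather than on the raw name.
# The index name is hypothetical.
conn.execute(
    "CREATE UNIQUE INDEX fam_usr_ci_uk "
    "ON fam_user (user_type_code, UPPER(user_name))"
)

conn.execute(
    "INSERT INTO fam_user (user_type_code, user_name) VALUES (?, ?)",
    ("B", "DVALK"),
)

# The same name in different case is now rejected for the same type.
try:
    conn.execute(
        "INSERT INTO fam_user (user_type_code, user_name) VALUES (?, ?)",
        ("B", "DValk"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The same name under a different user_type_code is still allowed.
conn.execute(
    "INSERT INTO fam_user (user_type_code, user_name) VALUES (?, ?)",
    ("I", "DValk"),
)
total_rows = conn.execute("SELECT COUNT(*) FROM fam_user").fetchone()[0]
```

An index like this could only be created once the existing duplicates are cleaned up, and since it is an index rather than a named constraint, the auth function's ON CONFLICT ON CONSTRAINT fam_usr_uk clause would presumably need to change as well.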
I'm just documenting here how we fixed the dev deployment; we can move it elsewhere later.
To remove the KMS key, we need to use the terragrunt CLI:
terraform/dev/terragrunt.hcl
export tfc_workspace=sfha4x-dev ('sfha4x-dev' is our license plate for dev)
terragrunt init and terragrunt state list, to see all the state
terragrunt state rm [kms_key_state]
But Basil and I both had problems running the terragrunt command (it hangs), so we got help from Warren Uniewski, who removed that bcsc KMS key from the terragrunt state for us. The key is still in AWS, scheduled for deletion, but terraform will ignore it.
In order to fix this problem,
Thanks all for working urgently on this, much appreciated.
To Reproduce Steps to reproduce the behavior:
Actual Result: Application Error has Occurred
Expected Result: User role assignment has been created successfully