sul-dlss / FOLIO-Project-Stanford

Task management for Stanford’s analysis of FOLIO.

Poppy Upgrade - Load data import profiles to test from prod #605

Closed ahafele closed 4 months ago

ahafele commented 5 months ago

Users will manually relink

shelleydoljack commented 5 months ago

Interestingly, of the tests we have for checking the profiles against the JSON schemas, the one that checks the mapping profiles is failing schema validation. AFAICT it is because some profiles have an entry with no "value" field in the "fields" list under "incomingMatchExpression", such as:

"matchDetails": [
        {
          "incomingRecordType": "MARC_BIBLIOGRAPHIC",
          "existingRecordType": "INSTANCE",
          "incomingMatchExpression": {
            "dataValueType": "VALUE_FROM_RECORD",
            "fields": [
              {
                "label": "field",
                "value": "001"
              },
              {
                "label": "indicator1",
                "value": ""
              },
              {
                "label": "indicator2",
                "value": ""
              },
              {
                "label": "recordSubfield"
              }
            ]
          },

Note: the last entry in the "fields" list has only a "label" field. I tried updating the JSON schemas, but there is only one recent update in the part of the schemas that comes into play when validating: fieldDefinition.json (under schemas/mod-data-import-converter-storage/match-profile-detail).
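One way to see which entries trip the validation is to scan a profile for "fields" entries that lack a "value" key. The following is only a minimal sketch, assuming the profile JSON has the shape shown in the snippet above. The helper name `fields_missing_value` is hypothetical, and the sample data is trimmed from that snippet.

```ruby
require 'json'

# Hypothetical helper (not part of the repo): returns the labels of entries in
# matchDetails[].incomingMatchExpression.fields that have no "value" key.
def fields_missing_value(profile)
  profile.fetch('matchDetails', []).flat_map do |detail|
    fields = detail.dig('incomingMatchExpression', 'fields') || []
    fields.reject { |field| field.key?('value') }.map { |field| field['label'] }
  end
end

# Sample trimmed from the snippet above.
sample = JSON.parse(<<~JSON)
  {
    "matchDetails": [
      {
        "incomingMatchExpression": {
          "dataValueType": "VALUE_FROM_RECORD",
          "fields": [
            { "label": "field", "value": "001" },
            { "label": "indicator1", "value": "" },
            { "label": "indicator2", "value": "" },
            { "label": "recordSubfield" }
          ]
        }
      }
    ]
  }
JSON

puts fields_missing_value(sample).inspect
```

An empty "value" (as on the indicator entries) still counts as present here. Only the entry with no "value" key at all is reported.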

I'm going to skip the test and forge ahead, hoping the profile loads into folio-test.

shelleydoljack commented 5 months ago

This has turned into a whole can of worms. While testing the code I wrote to delete profiles, I keep getting HTTP 500 responses on dev. I wonder if the data there is just really messed up (since we never refreshed it from prod).

shelleydoljack commented 5 months ago

As I suspected, in Poppy there is a new table sul_mod_di_converter_storage.profile_wrappers that essentially keeps track of the linkages between the profiles. Here's a snippet:

                  id                  | profile_type | action_profile_id | match_profile_id | mapping_profile_id |            job_profile_id            
--------------------------------------+--------------+-------------------+------------------+--------------------+--------------------------------------
 3cf75bd9-1756-4b6e-ae8c-3ab7c54d4a88 | JOB_PROFILE  |                   |                  |                    | 22fafcc3-f582-493d-88b0-3c538480cd83
 d3bc458b-468f-4b3d-a88c-4d616591336f | JOB_PROFILE  |                   |                  |                    | 94b535f7-8475-4b3d-9098-a9781508f837

Indeed, there is a whole DB migration script to populate the table. I'll check whether there is a new endpoint to delete the profiles (and the foreign keys in related tables). Hopefully it's not half-baked, but I won't be surprised if it is!

shelleydoljack commented 5 months ago

No new endpoints. I'm convinced that our data on dev is the cause. I used the UI on folio-test to delete a profile and didn't have a problem. Here is a screenshot of the successful DELETE request: [Screenshot 2024-02-07 at 2 41 28 PM]

I then logged into the DB, queried for that job profile (f09a77ae-254a-4b1c-b943-dd16bfbc3c41), and still got data:

                  id                  |                                                                                                                                                                                                                                         jsonb                                                                                                                                                                                                                                         |      creation_date      |              created_by              
--------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------+--------------------------------------
f09a77ae-254a-4b1c-b943-dd16bfbc3c41 | {"id": "f09a77ae-254a-4b1c-b943-dd16bfbc3c41", "name": "vct update", "hidden": false, "deleted": true, "dataType": "MARC", "metadata": {"createdDate": "2023-08-10T18:12:59.294Z", "updatedDate": "2023-08-10T18:12:59.294Z", "createdByUserId": "58d0aaf6-dcda-4d5e-92da-012e6b7dd766", "updatedByUserId": "58d0aaf6-dcda-4d5e-92da-012e6b7dd766"}, "userInfo": {"lastName": "Superuser", "userName": "libsys_admin"}, "description": "", "childProfiles": [], "parentProfiles": []} | 2023-08-10 18:12:59.294 | 58d0aaf6-dcda-4d5e-92da-012e6b7dd766

Note the field "deleted": true in the jsonb, so the DELETE apparently only flags the record rather than removing the row. The data is still in the profile_wrappers table too:

                  id                  | profile_type | action_profile_id | match_profile_id | mapping_profile_id |            job_profile_id            
--------------------------------------+--------------+-------------------+------------------+--------------------+--------------------------------------
 75b92a24-142d-4712-989e-02cb6abb7577 | JOB_PROFILE  |                   |                  |                    | f09a77ae-254a-4b1c-b943-dd16bfbc3c41

Maybe I should try clearing out the tables in dev. Or maybe we use the code that updates the profiles instead of trying to delete and reload?
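To gauge how much of this leftover data there is, the check above could be repeated across all rows. The following is only a sketch, assuming the rows have already been fetched from the two tables into plain hashes (no DB connection is shown). The helper name `wrappers_for_deleted_profiles` is hypothetical, and the sample rows are abbreviated from the query output in this thread, with one wrapper row from the earlier dev snippet added for contrast.

```ruby
require 'json'
require 'set'

# Hypothetical helper: given job profile rows (each carrying a "jsonb" string)
# and profile_wrappers rows, return the wrapper rows whose job_profile_id
# points at a profile flagged "deleted": true.
def wrappers_for_deleted_profiles(profile_rows, wrapper_rows)
  deleted_ids = profile_rows
                .map { |row| JSON.parse(row['jsonb']) }
                .select { |profile| profile['deleted'] == true }
                .map { |profile| profile['id'] }
                .to_set
  wrapper_rows.select { |wrapper| deleted_ids.include?(wrapper['job_profile_id']) }
end

# Abbreviated sample rows; the jsonb is cut down to the keys used here.
profile_rows = [
  { 'id' => 'f09a77ae-254a-4b1c-b943-dd16bfbc3c41',
    'jsonb' => '{"id": "f09a77ae-254a-4b1c-b943-dd16bfbc3c41", "name": "vct update", "deleted": true}' }
]
wrapper_rows = [
  { 'id' => '75b92a24-142d-4712-989e-02cb6abb7577', 'profile_type' => 'JOB_PROFILE',
    'job_profile_id' => 'f09a77ae-254a-4b1c-b943-dd16bfbc3c41' },
  { 'id' => '3cf75bd9-1756-4b6e-ae8c-3ab7c54d4a88', 'profile_type' => 'JOB_PROFILE',
    'job_profile_id' => '22fafcc3-f582-493d-88b0-3c538480cd83' }
]

stale = wrappers_for_deleted_profiles(profile_rows, wrapper_rows)
puts stale.map { |wrapper| wrapper['id'] }.inspect
```

This covers job profiles only. The same pattern would presumably apply to the action, match, and mapping profile columns, but that is an assumption.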

shelleydoljack commented 5 months ago

I went a different route:

STAGE=prod rake pull_all_data_import_profiles_data
STAGE=uat rake data_import:update_job_profiles

Then to get the ones that weren't in test:

STAGE=uat rake data_import:load_job_profiles

STAGE=uat rake data_import:update_action_profiles
STAGE=uat rake data_import:load_action_profiles

STAGE=uat rake data_import:update_mapping_profiles
STAGE=uat rake data_import:load_mapping_profiles

STAGE=uat rake data_import:update_match_profiles
STAGE=uat rake data_import:load_match_profiles
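
The sequence above follows one pattern: pull everything from prod once, then update and load each profile type on the target stage. The following is only a sketch that generates those command strings, without running them. The helper name `profile_sync_commands` and its keyword arguments are hypothetical; the rake task names are the ones listed above.

```ruby
# Profile types in the order they were run above.
PROFILE_TYPES = %w[job action mapping match].freeze

# Hypothetical helper: builds the command strings without executing them.
def profile_sync_commands(source_stage: 'prod', target_stage: 'uat')
  commands = ["STAGE=#{source_stage} rake pull_all_data_import_profiles_data"]
  PROFILE_TYPES.each do |type|
    # Update the profiles that already exist on the target stage...
    commands << "STAGE=#{target_stage} rake data_import:update_#{type}_profiles"
    # ...then load the ones that were missing there.
    commands << "STAGE=#{target_stage} rake data_import:load_#{type}_profiles"
  end
  commands
end

puts profile_sync_commands
```

Printing the list first makes it easy to review the order before running anything against a shared environment.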

shelleydoljack commented 5 months ago

Let me know if I should try deleting and reloading on folio-test (haven't been able to do it successfully on dev).

ahafele commented 5 months ago

I think they are ok as-is. I've asked the Data Import group to review.