hepcat72 closed 4 months ago
Can we get a report of the problematic records entered into an issue at https://github.com/PrincetonUniversity/tracebase-rabinowitz-data? Those issues can be sorted out while we move forward with the loader updates.
I previously generated that list via the shell and shared it over Slack. It will be exactly the same as the list here. Let me see if I can find it.
OK, I created an issue: https://github.com/PrincetonUniversity/tracebase-rabinowitz-data/issues/112
I'll hold off on merging this until the branch/PR it merges into (#962) is reviewed (I know it's a big one - sorry about that).
Rebased.
Summary Change Description
This is a rewrite of #953 that fixes issues #953 left unaddressed. #953 was written to handle only the current unique constraint violations. Multiple placeholder records were still allowed to persist in the database with fake ArchiveFile records, even though not all of them needed to persist. Based on what I learned during the implementation of the MSRunsLoader, there are numerous placeholder records that cause no conflicts in the PeakGroup table: if you boil them down to 1 placeholder, there are no duplicate compounds among the merged PeakGroups. So this rewrite identifies the precise conflicting peak groups, and when there are no conflicts, it simply merges them. When there are conflicts, it not only creates the fake ArchiveFile records for the mzXML files, it also includes the duplicate peak group names in the fake record that the researcher must resolve.

The migration contains a print that explains what resulted from the migration. I ran it on a complete copy of production data in my sandbox and these are the stats it reported (note: I changed the prints slightly after I ran the migration, so when we run this on the actual database, the output will be a little different):
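
For reviewers, the merge-or-flag decision described in this summary can be illustrated with a minimal plain-Python sketch. This is not the migration's code: `find_conflicts`, `plan_merge`, and the dict-of-lists input are hypothetical stand-ins for the real placeholder and PeakGroup records, and it assumes a conflict means the same peak group name appearing under more than one placeholder.

```python
from collections import defaultdict


def find_conflicts(placeholders):
    """Return peak group names found under more than one placeholder.

    `placeholders` is a hypothetical structure mapping a placeholder id
    to the list of peak group names attached to it.
    """
    owners = defaultdict(set)
    for placeholder_id, peak_group_names in placeholders.items():
        for name in peak_group_names:
            owners[name].add(placeholder_id)
    # A name owned by 2+ placeholders would collide if they were merged.
    return sorted(name for name, ids in owners.items() if len(ids) > 1)


def plan_merge(placeholders):
    """Decide whether a set of placeholders can be collapsed into one."""
    conflicts = find_conflicts(placeholders)
    if not conflicts:
        # No duplicate names: collapse everything into a single placeholder
        # (here, arbitrarily, the one with the smallest id).
        return {"action": "merge", "keep": min(placeholders), "conflicts": []}
    # Duplicates exist: keep the placeholders apart and report the names
    # that a researcher would need to resolve.
    return {"action": "flag", "keep": None, "conflicts": conflicts}
```

For example, `plan_merge({1: ["glucose"], 2: ["lactate"]})` yields a merge plan, while `plan_merge({1: ["glucose", "lactate"], 2: ["glucose"]})` flags `glucose` as the conflicting name.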
Affected Issues/Pull Requests
Review Notes
See comments in-line.
Checklist
This pull request will be merged once the following requirements are met. The author and/or reviewers should uncheck any unmet requirements:
changelog.md (or no change)