jnanaswaroop / migration_final

Enlightening healthcare since 2012

MR Migration Logs from GitLab #18

Open jnanaswaroop opened 2 weeks ago

jnanaswaroop commented 2 weeks ago

Download the MR migration logs using the link below:

Download CSV Log

jnanaswaroop commented 2 weeks ago

CSV Content Preview

Title,Description,Merge Request ID,GitLab MR URL,Created At,Updated At,Closed At,Merged At,Author,Source Branch,Target Branch,State,Merged By,Assignees,Reviewers,Project ID,Closed By,Labels,Head SHA,Base SHA,Should Remove Source Branch,Approvals Before Merge,Reviewers Comments
ph_23436,"Adding an index on the `md5_key` column.
Also, we need to copy the `vendor` and `manufactuer_part_number` columns over to prod. The columns are ready, and we just need to move the values from `contract` to `staging`.",310501941,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/42,2024-06-24T14:10:41.917Z,2024-06-24T16:47:52.099Z,,2024-06-24T16:47:51.266Z,Michael Jaskiewicz,ph_23436_index_md5,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,"Nathan Kessler, May Hong",34807400,NA,NA,4df3f6295dff0a19c3926fff8d8acc8e0ce29e12,74235335b236f9008da57013fc1082e9d58a44c3,True,,Nathan Kessler: nothings jumping out at me and tests pass. lgtm
fix_migration,,308957333,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/41,2024-06-17T22:34:03.966Z,2024-06-17T22:36:01.199Z,,2024-06-17T22:36:01.325Z,Michael Jaskiewicz,fix_migration,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,NA,34807400,NA,NA,3a4d67c122011fe69f760566b0f35de03977bd78,4620db962335291ea6e10ad7a78387b323c5d151,True,,No comments
kaniko,,308313837,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/40,2024-06-13T20:56:42.020Z,2024-06-13T20:58:36.699Z,,2024-06-13T20:58:36.731Z,Michael Jaskiewicz,kaniko,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,NA,34807400,NA,NA,527f14fe356f60a67daa3b4c49560fd042cc23b2,060705c72d0bee41ac6fe0e45ec04f51016c8daf,True,,No comments
ph_23196_quantity_of_eaches,"We have quantity of eaches in the raw data already. It was called ""quantity"". This ticket is about moving that data over into our contract table, staging table, and then off to prod.",308279100,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/39,2024-06-13T17:48:27.588Z,2024-06-13T20:53:57.127Z,,2024-06-13T20:53:56.634Z,Michael Jaskiewicz,ph_23196_copy_qoe,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,Nathan Kessler,34807400,NA,NA,57e46fc9875aa63db96e923dfee8c79c69fea04d,a7b034b30cf590c7e5fafc168859cd3d3ce5a582,True,,No comments
Ph 23410 md5 key,"To join between the data_prep.raw_ccx table, I need some kind of key since the CCX uses item_start_date, item_end_date, uom, manufacturer_part_number, contract_number, organization_eid, vendor. This kind of key is slow to join against so I want to add an MD5 hashed version of the columns in question.

The deployment was also broken given the switch to kaniko.",307692440,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/38,2024-06-11T17:00:27.188Z,2024-06-13T19:45:36.423Z,,2024-06-13T19:45:36.478Z,Michael Jaskiewicz,ph_23410_md5_key,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,"Nathan Kessler, May Hong",34807400,NA,NA,7802bab030d81f65ede42021acb0b5c0327999d9,60d913008f12521cde41f0f0b00dd87319578dfc,True,,"Nathan Kessler: is there a reason to hash the columns instead of just using a uid? it seems like the process to hash columns instead of just using the uid would be slower
Michael Jaskiewicz: The whole objective here is to join from my `contract` table to `data_prep.raw_ccx`. 

The `data_prep.raw_ccx` table brings in all the raw data rows from all the files. 
* The key for a contract line is composed of 7 columns. 
* Each line is labeled with a character identifying what that line is: insert, update, or delete. 
* A unique contract row may actually appear multiple times in the raw data because there is a history of updates included in that table. If I had a new raw update row that I wanted to label with a uid, I would have to search the table for previous instances of that key to find the uid, which I'm sure would be prohibitively slow. Hashing the key seemed to be a quick way to facilitate joining.
Nathan Kessler: gotcha -- makes sense. just curious here if nulls in any of those field will throw a wrench into this at all. if not i'll approve
Michael Jaskiewicz: Those fields should never be null. I confirmed with CCX and in the `contract` table. We do have some nulls in `data_prep.raw_ccx` but they never made it out of that table. I'm confirming w/ CCX right now what's wrong.
Michael Jaskiewicz: Confirmed with CCX. The rows with nulls are mistakes on their end and are to be ignored.
Michael Jaskiewicz: Nate, need an approval."
ph_23030_contract_price,"Ticket to start sending over contract price, landed price, uom, and quantity of eaches (placeholder for now since we don't receive it).
* Add some columns to contract table
* Add some columns to staging table
* Read new columns from data_prep.raw_ccx -> contracts -> staging
* Update gitlab ci image because something was wrong with the image pulling down k8s stuff",299166084,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/37,2024-04-30T13:16:10.220Z,2024-05-15T18:06:48.088Z,,2024-05-15T18:06:46.861Z,Michael Jaskiewicz,ph_23030_contract_price,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,"Nathan Kessler, May Hong",34807400,NA,NA,cdcb21f79edb786c391f272887f46ff2e24608f9,b5a9ec7feea60edfe98181f1a451e11f5c0cd6da,True,,"Nathan Kessler: any reason for 10, 2? im indifferent bc i dont think it'll ever be an issue, but i've just seen 13, 2 everywhere else in phapp
Nathan Kessler: the username prob should be an env variable as well no?
May Hong: Should these be also 13, 2?"
ph_22996,"https://procured.myjetbrains.com/youtrack/issue/ph-22996/Increase-batch-size-in-CCX-classification-and-record-classified-IDs-w-datetime

There are 2 things in this ticket. Only doing the 1st item.
1. increase the batch size when processing to 100k.",283767501,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/36,2024-02-21T19:55:14.664Z,2024-02-26T16:07:28.796Z,,2024-02-26T16:07:27.688Z,Michael Jaskiewicz,ph_22996,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,"Nathan Kessler, May Hong",34807400,NA,NA,b1e879f0bf82b8ed87baa94ff73bfed1a65c8e1b,6fa246f8c10674217ae087310d7b7f1f8f0a65d0,True,,No comments
ph_22991,I just realized a stupid mistake. I was doing an `update` with a `from` clause. The update table has to be joined with the results of the `from`. I used the primary key which is 6 fields long. I can just use the ID column on the contracts table instead. This tremendously improves performance.,282790178,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/35,2024-02-16T20:22:18.553Z,2024-02-26T16:07:19.202Z,,2024-02-26T16:07:11.113Z,Michael Jaskiewicz,ph_22991,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,"Nathan Kessler, May Hong",34807400,NA,NA,64b71ac461dcdd21c9b36eb0d8553760ecbbe2b0,6fa246f8c10674217ae087310d7b7f1f8f0a65d0,True,,No comments
ph_22989,K8s upgrade means that `v1beta` in cronjob needs to be `v1`,282735879,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/34,2024-02-16T15:39:29.100Z,2024-02-16T16:54:58.051Z,,2024-02-16T16:54:57.363Z,Michael Jaskiewicz,ph_22989,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,"Nathan Kessler, May Hong",34807400,NA,NA,e9e160ef69f98d5ce7a49138654ff2a614304347,74e0517196b248dd2d4c5eb9986ab7af8cd5dd29,True,,No comments
ph_22953,"This ticket introduces a queue to keep track of which contract table items need to be classified next. 

Pipeline:
1. We get the file in data_prep.raw_ccx
2. The function that adds the new data to the `contract` table adds the ID to a `pending_classification` table. 
3. Classification now runs only over IDs in the `pending_classification` table so we can be sure we don't futilely attempt classification on something twice. 
4. If a poterm changes or a new one is added, next time the classifier runs, it will consider only the updated poterms and not the entire set. 

A side benefit of this is that we can track progression of classification over new files.",281587159,https://gitlab.com/lumere/ccx_contracts/-/merge_requests/33,2024-02-12T17:09:53.158Z,2024-02-15T13:24:32.468Z,,2024-02-15T13:24:31.283Z,Michael Jaskiewicz,ph_22953,master,merged,Michael Jaskiewicz,Michael Jaskiewicz,"Nathan Kessler, May Hong",34807400,NA,NA,d7daed5b9d1c6a0f54f3275f22744f916aa40bd2,35ce39a70941b34cb3be009e6eac9ab9ed603568,True,,"Nathan Kessler: just me being ignorant of sqlalchemy here, but is this saying when the Contract object is deleted, the queue record here will get deleted?
Nathan Kessler: just making sure its not the other way around
Michael Jaskiewicz: Yes, the deletion of a `contract` record cascades to `pending_classification`.
Nathan Kessler: lgtm. just a thought that as this progresses it's getting more difficult to understand at a glance with all this logic living in sql :smile:"
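
Note for anyone consuming this log: several `Description` and `Reviewers Comments` fields in the preview span multiple lines and contain doubled quotes, so splitting the file on newlines will break rows apart. A minimal sketch of reading it with Python's standard `csv` module follows. The function name `load_mr_log` and the shortened four-column sample are hypothetical and only for illustration; the real file uses the full header shown at the top of the preview.

```python
import csv
import io


def load_mr_log(text):
    """Parse MR migration CSV text into a list of dicts keyed by header name."""
    # newline="" disables newline translation so that line breaks inside
    # quoted fields reach the csv parser unchanged.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


# Hypothetical, shortened sample: a subset of the real header columns,
# one single-line row and one row with multi-line quoted fields.
sample = (
    'Title,Description,State,Reviewers Comments\n'
    'fix_migration,,merged,No comments\n'
    'ph_example,"First line.\nSecond line with ""quotes"".",merged,"A: lgtm\nB: ok"\n'
)

rows = load_mr_log(sample)
for row in rows:
    # Count lines per description to show multi-line fields stay in one row.
    print(row["Title"], row["State"], len(row["Description"].splitlines()))
```

When reading the downloaded file from disk, opening it with `open(path, newline="", encoding="utf-8")` and passing the file object to `csv.DictReader` should behave the same way, assuming the file is UTF-8 encoded.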

For a full download, click here.