Samarth-HP / .github

[Deployment] Transfer selective data to govt. server for audit readiness #9

Open aakashyadav-kgp opened 1 year ago

aakashyadav-kgp commented 1 year ago

Description: As Chakshu suggested, we can selectively transfer data to the govt. server for audit readiness. We want to identify the required data and transfer it to the govt. server accordingly.

Acceptance Criteria:

  1. All services are audit ready. [ @karntrehan see if you can add more concrete things to acceptance criteria after talking with chakshu]

Linked to #45

karntrehan commented 1 year ago

The audit server has only 200 GB of space, whereas our data is currently 180 GB, so we need a way to load only a limited subset of the data for auditing. @choxx and I discussed this today.

Approach 1: Manually pull data from tables and push into the audit db. This is a very manual process where relations between the tables could break the insertions.

Approach 2: Identify the most used tables using pg_stat and use the audit logger to get all changes to those tables. The audit logger would give us every change to a table, and we would insert only those changes. Wherever things break, we would need to add the data manually.
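For illustration, the "most used tables" step of approach 2 could be sketched as below. This is a minimal sketch, not what the team ran: it assumes "most used" means most row writes as reported by the built-in `pg_stat_user_tables` view, and the `LIMIT` value, the helper name `rank_by_writes`, and the sample table names in the usage note are all hypothetical.

```python
from typing import Iterable, List, Mapping

# Query against PostgreSQL's built-in pg_stat_user_tables statistics view.
# Assumption: "most used" is approximated by inserts + updates + deletes.
# The LIMIT of 20 is an illustrative choice, not taken from the thread.
MOST_WRITTEN_TABLES_SQL = """
SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
ORDER BY n_tup_ins + n_tup_upd + n_tup_del DESC
LIMIT 20;
"""


def rank_by_writes(rows: Iterable[Mapping], top_n: int = 5) -> List[str]:
    """Return table names ordered by total row writes, highest first.

    `rows` is expected to look like the result of MOST_WRITTEN_TABLES_SQL,
    one mapping per table.
    """
    totals = {
        r["relname"]: r["n_tup_ins"] + r["n_tup_upd"] + r["n_tup_del"]
        for r in rows
    }
    return sorted(totals, key=lambda name: totals[name], reverse=True)[:top_n]
```

Feeding it rows for hypothetical tables `attendance`, `submission` and `school` would return the names ordered by write volume; those would be the candidates for audit logging.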

Approach 3: Skip user-generated submissions, assessments and attendance data. User-generated content is the largest contributor to size. We can skip each such table entirely or keep only its last 1000 entries.
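The "last 1000 entries" option of approach 3 could look roughly like the sketch below, which only generates the SQL text. It assumes each table has a monotonically increasing, non-null key column (called `id` here, which is a guess), and the helper names are hypothetical. Foreign keys pointing at the deleted rows could still block the delete, which is the same relation problem noted under approach 1.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def keep_latest_sql(table: str, key: str = "id", keep: int = 1000) -> str:
    """Build a DELETE that keeps only the `keep` rows with the highest key.

    Assumption: `key` is a non-null, increasing column such as a serial
    primary key, so "highest key" means "most recent entry".
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    t, k = quote_ident(table), quote_ident(key)
    return (
        f"DELETE FROM {t} WHERE {k} NOT IN "
        f"(SELECT {k} FROM {t} ORDER BY {k} DESC LIMIT {keep});"
    )
```

The generated statement would be run on the temporary copy of the database, never on prod.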

Given the pros and cons, we suggest approach 3. We will take a dump of the entire data this evening, host it on a temporary server, and clean it up by removing the tables that are not needed. We will then take a dump of the cleaned-up data and host it on the audit server. We can start tomorrow once the team confirms the approach. We shall ask a WFC talent to work on this cleanup and on hosting the data on the audit server.
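As a possible shortcut for the second dump, `pg_dump` itself can leave out table data or whole tables through its `--exclude-table-data` and `--exclude-table` options. The sketch below only assembles the argument list; the function name is hypothetical, and the table names passed to it would have to come from the actual schema.

```python
from typing import List, Sequence


def pg_dump_args(
    database: str,
    user: str,
    skip_data_for: Sequence[str] = (),
    skip_entirely: Sequence[str] = (),
) -> List[str]:
    """Assemble a pg_dump command line that omits bulky tables.

    skip_data_for: tables whose definition is kept but whose rows are skipped
                   (pg_dump --exclude-table-data).
    skip_entirely: tables left out of the dump altogether
                   (pg_dump --exclude-table).
    """
    args = ["pg_dump", "-U", user, "-d", database]
    for table in skip_data_for:
        args.append(f"--exclude-table-data={table}")
    for table in skip_entirely:
        args.append(f"--exclude-table={table}")
    return args
```

The resulting list could be passed to `subprocess.run`, prefixed with `docker exec <container>` when Postgres runs in a container, with stdout redirected to the dump file.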

cc @choxx

aakashyadav-kgp commented 1 year ago

@ChakshuGautam please review this once for audit readiness.

karntrehan commented 1 year ago

Going ahead with approach 3. Taking a dump of the data this evening. Monday: move the dump to a server (@karntrehan to help create one). Tuesday: Ansh to be aligned to work on this.

choxx commented 1 year ago

Back up done.

root@samarthDb:~# cd /mnt/psql_data/
root@samarthDb:/mnt/psql_data# docker ps | grep postgre
5c965f6a0abf        postgres:9.6                                    "docker-entrypoint.s…"   2 years ago         Up 14 months        0.0.0.0:5432->5432/tcp                     postgres50
root@samarthDb:/mnt/psql_data# docker exec postgres50 pg_dump -U postgres -d postgres > backup_2023_03_06_18_00_00.sql
root@samarthDb:/mnt/psql_data# df -h backup_2023_03_06_18_00_00.sql 
Filesystem      Size  Used Avail Use% Mounted on
/dev/sde        496G  216G  259G  46% /mnt/psql_data
root@samarthDb:/mnt/psql_data# du -hs backup_2023_03_06_18_00_00.sql 
33G backup_2023_03_06_18_00_00.sql
root@samarthDb:/mnt/psql_data#

choxx commented 1 year ago

Took a dump (5 GB); it is currently on the prod server. Next step:

The tables listed below are all related to submissions and data collected from the app:

All materialized views can be cleared entirely.
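One way to read "cleared" here, as an assumption, is to discard the stored rows of every materialized view while keeping its definition, which `REFRESH MATERIALIZED VIEW ... WITH NO DATA` does. The sketch below lists views from the built-in `pg_matviews` catalog view and generates those statements; the helper names are hypothetical. A cleared view cannot be queried until it is refreshed with data again.

```python
from typing import Iterable, List, Tuple

# pg_matviews is a built-in catalog view listing all materialized views.
LIST_MATVIEWS_SQL = "SELECT schemaname, matviewname FROM pg_matviews;"


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def clear_matview_sql(views: Iterable[Tuple[str, str]]) -> List[str]:
    """Build one statement per (schema, view) pair that empties the view.

    WITH NO DATA drops the stored rows and frees their space; the view
    definition stays, so it can be repopulated later with a plain REFRESH.
    """
    return [
        f"REFRESH MATERIALIZED VIEW {quote_ident(schema)}.{quote_ident(view)} WITH NO DATA;"
        for schema, view in views
    ]
```

If the views are not needed on the audit server at all, `DROP MATERIALIZED VIEW` would be the alternative.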

choxx commented 1 year ago

DB import is still in progress (for the last 12 hours). Waiting for the import on the demo server to finish before proceeding with the cleanup step.

choxx commented 1 year ago

Restoring the dump is still in progress; it appears to be hung. The process is to be restarted after increasing the memory allocated to the Postgres container.
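If the container runs under a Docker memory limit, one way to raise it without recreating the container is `docker update`. This is a hedged sketch only: the thread does not say how memory was actually increased, and the container name and size in the usage note are placeholders.

```python
from typing import List


def docker_memory_update_args(container: str, memory: str) -> List[str]:
    """Assemble a `docker update` command that raises a container's memory cap.

    --memory-swap is set to the same value so the new limit is not rejected
    for exceeding an older, lower memory+swap limit.
    """
    return [
        "docker", "update",
        "--memory", memory,
        "--memory-swap", memory,
        container,
    ]
```

For example, `docker_memory_update_args("demo_postgres", "8g")` yields the arguments for an 8 GB cap on a container hypothetically named `demo_postgres`. Postgres settings such as `maintenance_work_mem` may also need raising for index builds during restore to benefit.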

choxx commented 1 year ago

Steps 1, 2 & 3 are done. Ansh to proceed with the clean-up steps. Pinged him on Discord.

choxx commented 1 year ago

Ansh to start working on it today.

aakashyadav-kgp commented 1 year ago

@choxx to pick this up today.

choxx commented 1 year ago

Aligned with @yuvrajsab to pick up today.

yuvrajsab commented 1 year ago

Dump has been taken and shared with @tushar5526.

charanpreet-s commented 1 year ago

Blocked until access is provided.

choxx commented 1 year ago

@tushar5526 the files have been transferred to their Windows server at C:\samagra files\latest. Please go ahead with the next steps.

karntrehan commented 1 year ago

Data moved by @tushar5526 to the govt. servers. Deployment of services to continue now.