rki-mf1 / covsonar

A database-driven system for handling genomic sequences of SARS-CoV-2 and screening genomic profiles.
GNU General Public License v3.0

Update to [covSonar 1.1.3] - [merged] #45

Closed silenus092 closed 2 years ago

silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 16, 2022, 15:35

Merges dev/tovcf -> master

In covSonar V.1.1.3

Improvement

1. Readme.md

2. Add function to update lineage information

Running the update-lineage-info command downloads the latest version of the lineages from https://github.com/cov-lineages/pango-designation/ and installs it in lib/lineage.all.tsv.

# example command
path/to/covsonar/sonar.py update-lineage-info

3. Database

Following #21, I added a submission date column (submission_date TEXT) to the genome table and to the dna_view, prot_view, and essence views.

To update an existing database:

BEGIN;

-- 0. Upgrade schema version
PRAGMA user_version = 4;

-- 1. Update the genome table and add index
ALTER TABLE genome ADD submission_date TEXT;
CREATE INDEX idx_meta_submission_date ON genome (submission_date);

-- 2. Remove old VIEW
DROP VIEW dna_view;
DROP VIEW essence;
DROP VIEW prot_view;

-- 3. Create new VIEW
CREATE VIEW IF NOT EXISTS essence
AS
SELECT
    accession,description,lab,source,collection,technology,platform,
    chemistry,material,ct,software,software_version,gisaid,ena,
    zip,date,submission_date,lineage,genome.seqhash,dna_profile,aa_profile,fs_profile
FROM
    genome
LEFT JOIN sequence USING (seqhash)
LEFT JOIN profile USING (seqhash);

CREATE VIEW IF NOT EXISTS dna_view
AS
SELECT
    accession,description,lab,source,collection,technology,platform,chemistry,
    material,ct,software,software_version,gisaid,ena,zip,date,submission_date,
    lineage,genome.seqhash,start,end,ref,alt
FROM
    genome
LEFT JOIN sequence USING (seqhash)
LEFT JOIN sequence2dna USING (seqhash)
LEFT JOIN dna USING (varid);

CREATE VIEW IF NOT EXISTS prot_view
AS
SELECT
    accession,description,lab,source,collection,technology,platform,chemistry,material,ct,software,
    software_version,gisaid,ena,zip,date,submission_date,lineage,
    genome.seqhash,protein,locus,start,end,ref,alt
FROM
    genome
LEFT JOIN sequence USING (seqhash)
LEFT JOIN sequence2prot USING (seqhash)
LEFT JOIN prot USING (varid);

COMMIT;
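
For anyone who prefers to run the migration from a script, the steps above can be sketched in Python with the standard sqlite3 module. This is only a sketch under assumptions: the embedded SQL is a shortened stand-in for the script above (genome part only), the toy `genome` table is made up for the demo, and `apply_migration` / `is_migrated` are hypothetical helper names, not part of covSonar.

```python
import sqlite3

# Shortened stand-in for the migration script above (genome part only).
MIGRATION_SQL = """
BEGIN;
PRAGMA user_version = 4;
ALTER TABLE genome ADD submission_date TEXT;
CREATE INDEX idx_meta_submission_date ON genome (submission_date);
COMMIT;
"""


def apply_migration(con, script=MIGRATION_SQL):
    """Run the migration script; roll back if any statement fails."""
    try:
        con.executescript(script)
    except sqlite3.Error:
        # executescript stops at the failing statement, so the transaction
        # opened by BEGIN may still be active and has to be undone here.
        if con.in_transaction:
            con.rollback()
        raise


def is_migrated(con):
    """Check the schema version and the presence of the new column."""
    version = con.execute("PRAGMA user_version").fetchone()[0]
    columns = [row[1] for row in con.execute("PRAGMA table_info(genome)")]
    return version == 4 and "submission_date" in columns


# Toy database standing in for a version-3 database (hypothetical schema).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE genome (accession TEXT PRIMARY KEY, date TEXT)")
con.execute("PRAGMA user_version = 3")
apply_migration(con)
print(is_migrated(con))  # True
```

Running it against a copy of the real database first, as with the SQL above, is the safer route.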

After that, we can update or add metadata as usual.

# DESH file: the submission date is in the SUBMISSION_DATE column
# DESH passed file: the submission date is in the PROCESSING_DATE column
python sonar.py update --db mydb.new.db \
--tsv desh-passed.tsv \
--fields accession=IMS_ID zip=SENDING_LAB_PC lab=SENDING_LAB date=DATE_DRAW submission_date=PROCESSING_DATE
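
To make the `--fields` mapping more concrete, here is a rough Python sketch of how `DB_FIELD=TSV_COLUMN` pairs could be applied to a TSV file. It is an illustration only, not covSonar's actual parsing code: `parse_field_map` and `rows_for_update` are hypothetical names, and the sample TSV content is made up.

```python
import csv
import io


def parse_field_map(pairs):
    """Turn ['accession=IMS_ID', ...] into {'accession': 'IMS_ID', ...}."""
    mapping = {}
    for pair in pairs:
        db_field, sep, tsv_column = pair.partition("=")
        if not sep or not db_field or not tsv_column:
            raise ValueError(f"expected DB_FIELD=TSV_COLUMN, got {pair!r}")
        mapping[db_field] = tsv_column
    return mapping


def rows_for_update(tsv_handle, mapping):
    """Yield one dict per TSV row, keyed by database field name."""
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in mapping.values() if col not in header]
    if missing:
        raise KeyError(f"columns not found in TSV: {missing}")
    for row in reader:
        yield {db_field: row[col] for db_field, col in mapping.items()}


# Made-up two-line TSV for illustration only.
sample = io.StringIO(
    "IMS_ID\tSENDING_LAB_PC\tPROCESSING_DATE\n"
    "IMS-1\t12345\t2021-10-20\n"
)
fields = parse_field_map(
    ["accession=IMS_ID", "zip=SENDING_LAB_PC", "submission_date=PROCESSING_DATE"]
)
for record in rows_for_update(sample, fields):
    print(record)
```

Switching between the DESH file and the DESH passed file then only means changing the right-hand side of `submission_date=...`.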

Note :eyes: I am not sure whether additional issues will show up when we test on the real database; however, I tested on my subsampled copy of the real database and it works.

We can now use the match command with the --submission_date flag to define a condition. The submission date is always included in the output. For example:

python sonar.py match \
--db ../workdir_covsonar/mycacheV1/mydb.new.db \
--date 2021-10-01 \
--submission_date ^2021-10-31 2021-10-20
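
As a rough illustration of how such a condition could translate into SQL, the sketch below assumes that a value prefixed with `^` is excluded and a plain value is included. That reading of `^`, the `date_condition` helper, and the toy table are assumptions made for this example, not a description of covSonar's query builder.

```python
import sqlite3


def date_condition(column, values):
    """Build a WHERE fragment: plain values are included, ^values excluded.

    `column` is interpolated into the SQL, so it must be a trusted name.
    """
    include = [v for v in values if not v.startswith("^")]
    exclude = [v[1:] for v in values if v.startswith("^")]
    clauses, params = [], []
    if include:
        clauses.append(f"{column} IN ({', '.join('?' * len(include))})")
        params += include
    if exclude:
        clauses.append(f"{column} NOT IN ({', '.join('?' * len(exclude))})")
        params += exclude
    return " AND ".join(clauses) or "1", params


# Toy data for illustration only.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE genome (accession TEXT, date TEXT, submission_date TEXT)")
con.executemany(
    "INSERT INTO genome VALUES (?, ?, ?)",
    [
        ("A", "2021-10-01", "2021-10-20"),
        ("B", "2021-10-01", "2021-10-31"),
        ("C", "2021-09-30", "2021-10-20"),
    ],
)

date_sql, date_params = date_condition("date", ["2021-10-01"])
sub_sql, sub_params = date_condition("submission_date", ["^2021-10-31", "2021-10-20"])
query = f"SELECT accession FROM genome WHERE {date_sql} AND {sub_sql}"
matches = [row[0] for row in con.execute(query, date_params + sub_params)]
print(matches)  # ['A']
```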

4. Improve error message

Following #22, I updated the error message so that it includes the accession number.

5. Update test script

I updated the test script to support the submission date.

New Features

1. Database upgrade assistant

In the future, when a new version of the database schema is introduced, we can use this function to upgrade. We simply put the new schema file (e.g., {version}.sql) under lib/migrate and then run the following command:

python sonar.py db-upgrade --db mydb.db 

This function will automatically update the database to the latest version.

# Example Output

Warning: Backup db file before upgrading, Press Enter to continue...

## press Enter
Current version: 3 Upgrade to: 4
Perform the Upgrade: file: mydb.db
Database now version: 4
Success: Database upgrade was successfully completed
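
The idea behind the assistant can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the code in this merge request: it assumes migration files are named `{version}.sql` with an integer version, and `pending_migrations` and `upgrade` are hypothetical names. It also skips the backup prompt shown above.

```python
import sqlite3
import tempfile
from pathlib import Path


def pending_migrations(migrate_dir, current_version):
    """Return (version, path) pairs newer than current_version, ascending."""
    found = []
    for path in Path(migrate_dir).glob("*.sql"):
        if path.stem.isdigit() and int(path.stem) > current_version:
            found.append((int(path.stem), path))
    return sorted(found)


def upgrade(db_path, migrate_dir):
    """Apply every pending {version}.sql in order; return the final version."""
    con = sqlite3.connect(db_path)
    try:
        current = con.execute("PRAGMA user_version").fetchone()[0]
        for version, path in pending_migrations(migrate_dir, current):
            con.executescript(path.read_text())
            # Record the version in case the script does not set it itself.
            con.execute(f"PRAGMA user_version = {version}")
        return con.execute("PRAGMA user_version").fetchone()[0]
    finally:
        # Closing the connection discards any transaction left open by a
        # script that failed halfway.
        con.close()


# Demo with a throwaway database and a made-up migration file.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "mydb.db"
    migrations = Path(tmp) / "migrate"
    migrations.mkdir()
    (migrations / "4.sql").write_text(
        "BEGIN; ALTER TABLE genome ADD submission_date TEXT; COMMIT;"
    )
    con = sqlite3.connect(db)
    con.execute("CREATE TABLE genome (accession TEXT)")
    con.execute("PRAGMA user_version = 3")
    con.close()
    print(upgrade(db, migrations))  # 4
```

Applying the files in ascending order lets a database that is several versions behind catch up in one run.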

Bug Fixes

1. #23

Fixed the incorrect date format in the SQL query; it now works properly.💪

silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 16, 2022, 15:35

requested review from @s.fuchs

silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 16, 2022, 15:39

added 2 commits


silenus092 commented 2 years ago

In GitLab by @s.fuchs on Jan 18, 2022, 08:16

Hi Note, could you please try the update process on a copy of our production database? If it works, I will perform the merge! Also, please add a new version (PRAGMA user_version in the SQLite database) so that users cannot use this covSonar version with an incompatible database. Thx a lot!
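
A version guard of the kind requested here could look roughly like the following. It is only a sketch: `SUPPORTED_DB_VERSION`, `IncompatibleDatabaseError`, and `connect_checked` are hypothetical names, and the value 4 is taken from the migration script in the description.

```python
import sqlite3

# Assumption: the schema version this release expects (see PRAGMA user_version = 4).
SUPPORTED_DB_VERSION = 4


class IncompatibleDatabaseError(RuntimeError):
    """Raised when the database schema version does not match the tool."""


def connect_checked(db_path, expected=SUPPORTED_DB_VERSION):
    """Open the database, refusing to continue on a version mismatch."""
    con = sqlite3.connect(db_path)
    found = con.execute("PRAGMA user_version").fetchone()[0]
    if found != expected:
        con.close()
        raise IncompatibleDatabaseError(
            f"database schema version {found} does not match expected "
            f"version {expected}; upgrade the database first"
        )
    return con


# A fresh in-memory database has user_version 0, so the guard rejects it.
try:
    connect_checked(":memory:")
except IncompatibleDatabaseError as err:
    print(err)
```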

silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 20, 2022, 16:55

@s.fuchs Hi, I tested with the copy of the DB and it works. I will implement a database upgrade assistant/command to allow a smooth upgrade for this and upcoming versions.

silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 21, 2022, 11:14

added 1 commit


silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 22, 2022, 22:52

marked this merge request as draft

silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 22, 2022, 22:54

added 1 commit


silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 24, 2022, 12:06

added 1 commit


silenus092 commented 2 years ago

In GitLab by @kunaphas.kon on Jan 24, 2022, 14:18

marked this merge request as ready

silenus092 commented 2 years ago

In GitLab by @s.fuchs on Feb 4, 2022, 16:05

mentioned in commit ed2f7963f269e70ea4d3e8ac48fecc907d573697