malariagen / fits

File tracking system for group DK
0 stars 0 forks source link

Use case: Create a detailed BAM/CRAM manifest file with metadata, based on species common names #30

Open magnusmanske opened 6 years ago

magnusmanske commented 6 years ago

Possible with current version; example for Plasmodium vivax V4 (this included a lot of filtering of unwanted files/samples):

SELECT file.id,file.full_path,
(SELECT group_concat(value) FROM vw_sample_tag,vw_sample_file WHERE file.id=file_id AND vw_sample_file.sample_id=vw_sample_tag.sample_id AND tag_id=3561) AS ox_code,
(SELECT group_concat(value) FROM vw_file_tag WHERE file.id=file_id AND tag_id=3591) AS common_name,
(SELECT group_concat(value) FROM vw_sample_tag,vw_sample_file WHERE file.id=file_id AND vw_sample_file.sample_id=vw_sample_tag.sample_id AND tag_id=3600) AS taxon_code
FROM file
WHERE storage=1
AND file.id IN (SELECT file_id FROM vw_file_tag WHERE tag_id=3591 AND value in ('Plasmodium vivax','vivax','P.vivax','Plasmodium Vvax','P. Vivax')) /*SPECIES NAMES*/
AND full_path NOT LIKE "%_human%" AND full_path NOT LIKE "%_phix%" /*NOT HUMAN OR PHIX*/
AND file.id NOT IN (SELECT parent FROM file_relation WHERE relation=3595) /*NOT IDENTICAL TO OTHER FILE*/
AND EXISTS (SELECT * FROM vw_file_tag WHERE file.id=vw_file_tag.file_id AND tag_id=3576 AND value IN ('bam','cram')) /*FILE TYPE*/
AND EXISTS (SELECT * FROM vw_file_tag WHERE file.id=vw_file_tag.file_id AND tag_id=3581 AND value=1) /*MANUAL QC*/
AND NOT EXISTS (SELECT * FROM vw_file_tag WHERE file.id=vw_file_tag.file_id AND tag_id=3582 AND value=1) /*NO R&D*/