Open deflaux opened 10 years ago
For example compare:
SELECT
contig,
position,
reference_bases,
GROUP_CONCAT(alternate_bases) WITHIN RECORD AS alt,
vt,
END,
super_population,
IF(af >= 0.05,
TRUE,
FALSE) AS is_common_variant,
IF(genotype.first_allele > 0
OR genotype.second_allele > 0,
1,
0) AS has_variant
FROM
FLATTEN([google.com:biggene:1000genomes.variants1kG],
genotype) AS samples
JOIN
[google.com:biggene:1000genomes.sample_info] p
ON
samples.genotype.sample_id = p.sample
WHERE
contig = '17'
AND position BETWEEN 41196312
AND 41277500
to a better version of the same thing - notice that the inner most query is now returning a lot less data to the outer query:
SELECT
contig,
position,
reference_bases,
alt,
vt,
END,
super_population,
is_common_variant,
has_variant
FROM
FLATTEN(
(
SELECT
contig,
position,
reference_bases,
GROUP_CONCAT(alternate_bases) WITHIN RECORD AS alt,
vt,
END,
IF(af >= 0.05,
TRUE,
FALSE) AS is_common_variant,
genotype.sample_id,
IF(genotype.first_allele > 0
OR genotype.second_allele > 0,
1,
0) AS has_variant
FROM
[google.com:biggene:1000genomes.variants1kG]
WHERE
contig = '17'
AND position BETWEEN 41196312
AND 41277500),
genotype) AS samples
JOIN
[google.com:biggene:1000genomes.sample_info] p
ON
samples.genotype.sample_id = p.sample
IF(af >= 0.05, TRUE, FALSE) AS is_common_variant
same as
af >= 0.05 AS is_common_variant
?
agreed, can be simplified
I'm pretty sure that pattern is in a few other places too, filed #14
please point out other things like this! Thanks Felipe
Restructure them so that the clause within the FLATTEN returns only the columns needed and also applies any applicable WHERE clauses.