cBioPortal / cbioportal

cBioPortal for Cancer Genomics
https://cbioportal.org
GNU Affero General Public License v3.0
578 stars 440 forks source link

Explode rows demo #10811

Open dippindots opened 1 month ago

dippindots commented 1 month ago
SELECT
    genetic_profile.genetic_profile_id AS gpid,
    genetic_entity.id AS geid,
    genetic_profile_samples.ordered_sample_list,
    val,
    sid_str,
    sid
FROM
    (
        SELECT
            genetic_profile.genetic_profile_id,
            genetic_entity.id,
            genetic_profile_samples.ordered_sample_list,
            arrayMap(x -> (x = '' ? NULL : x), splitByString(',', assumeNotNull(genetic_alteration.values))) AS val,
            arrayMap(x -> (x = '' ? NULL : x), splitByString(',', assumeNotNull(genetic_profile_samples.ordered_sample_list))) AS sid_str,
            arrayMap(x -> (x = '' ? NULL : toInt32(x)), splitByString(',', assumeNotNull(genetic_profile_samples.ordered_sample_list))) AS sid
        FROM
            genetic_profile
                JOIN genetic_profile_samples ON genetic_profile.genetic_profile_id = genetic_profile_samples.genetic_profile_id
                JOIN genetic_alteration ON genetic_profile.genetic_profile_id = genetic_alteration.genetic_profile_id
                JOIN genetic_entity ON genetic_alteration.genetic_entity_id = genetic_entity.id
     )
    ARRAY JOIN
    val,
    sid_str,
    sid

splitByString can create a cell a list of values inside a row by parsing the value from other cells ARRAY JOIN is a reserved keyword inside clickhouse for exploding rows based on array values