CRI-iAtlas / iatlas-data

MOVED TO GITLAB -> https://gitlab.com/cri-iatlas/iatlas-data.git
1 stars 0 forks source link

Missing tags to tags relationships #85

Closed andrewelamb closed 4 years ago

andrewelamb commented 4 years ago

This should result in all the different pcawg study types:

 paste0(
            "SELECT * FROM tags_to_tags WHERE related_tag_id IN ",
            "(SELECT id FROM tags WHERE name = 'PCAWG_Study')" 
        ) %>% 
        perform_query()

# A tibble: 0 x 2
# … with 2 variables: tag_id <int>, related_tag_id <int>

Also some dataset to tags relationships are missing:

> paste0(
+     "SELECT * FROM ",
+     "(SELECT * FROM tags_to_tags) AS ttt ",
+     "RIGHT OUTER JOIN ",
+     "(SELECT id AS parent_id, name AS parent_name FROM tags where name IN ('PCAWG', 'TCGA')) AS t ",
+     "ON ttt.related_tag_id = t.parent_id ",
+     "LEFT OUTER JOIN ",
+     "(SELECT id AS child_id, name AS child_name FROM tags) AS t2 ",
+     "ON ttt.tag_id = t2.child_id "
+     ) %>%
+     perform_query()
# A tibble: 3 x 6
  tag_id related_tag_id parent_id parent_name child_id child_name  
   <int>          <int>     <int> <chr>          <int> <chr>       
1    136            135       135 TCGA             136 TCGA_Study  
2    137            135       135 TCGA             137 TCGA_Subtype
3     NA             NA       109 PCAWG             NA NA     

Immune subtype should be related to both TCGA and PCAWG, PCAWG study should be related to PCAWG

jonryser commented 4 years ago

This should be repaired. Please pull staging.

andrewelamb commented 4 years ago

@jonryser

The pcawg relationships did make it in, but there's are a lot of relationships that are now missing:


paste0(
    "SELECT ttt.tag_id, ttt.related_tag_id, ta.name as parent, tb.name as child ",
    "FROM tags_to_tags ttt ",
    "INNER JOIN tags ta on ttt.related_tag_id = ta.id ",
    "INNER JOIN tags tb on ttt.tag_id = tb.id"
) %>%  perform_query()

# A tibble: 20 x 4
   tag_id related_tag_id parent         child         
    <int>          <int> <chr>          <chr>         
 1     20            110 PCAWG_Study    BLCA-US       
 2     27            110 PCAWG_Study    BRCA-US       
 3     28             64 Immune_Subtype C1            
 4     29             64 Immune_Subtype C2            
 5     30             64 Immune_Subtype C3            
 6     31             64 Immune_Subtype C4            
 7     33             64 Immune_Subtype C6            
 8     36            110 PCAWG_Study    CLLE-ES       
 9     40            110 PCAWG_Study    ESAD-UK       
10     64            135 TCGA           Immune_Subtype
11     64            109 PCAWG          Immune_Subtype
12     86            110 PCAWG_Study    LIRI-JP       
13     99            110 PCAWG_Study    MALY-DE       
14    102            110 PCAWG_Study    OV-AU         
15    108            110 PCAWG_Study    PACA-AU       
16    110            109 PCAWG          PCAWG_Study   
17    126            110 PCAWG_Study    RECA-EU       
18    136            135 TCGA           TCGA_Study    
19    137            135 TCGA           TCGA_Subtype  
20    146            110 PCAWG_Study    THCA-US      

Immune_Subtype C5 is missing as are all of the TCGA studies and TCGA subtypes