mc2-center / csbc-pson-dcc

Data coordination resources for the NCI CSBC and PS-ON consortia

Columns in publications possibly changed that shouldn't have been #78

Closed andrewelamb closed 2 years ago

andrewelamb commented 4 years ago

The grant (grant number) and datasetId columns have some prod values that don't match what they should be. In addition, some of the grant numbers are duplicated within a single publication. Unless the values in the test column are wrong, this will be fixed automatically.

$value_diffs$grant
# A tibble: 25 x 3
   publicationId prod                           test                          
   <chr>         <chr>                          <chr>                         
 1 syn21649049   "[\"CA182915\"]"               "[\"CA215845\"]"              
 2 syn21645615   "[\"CA182915\", \"CA182915\"]" "[\"CA182915\", \"CA215845\"]"
 3 syn21648914   "[\"CA182915\", \"CA182915\"]" "[\"CA182915\", \"CA215845\"]"
 4 syn21649120   "[\"CA182915\"]"               "[\"CA215845\"]"              
 5 syn21645614   "[\"CA182915\", \"CA182915\"]" "[\"CA182915\", \"CA215845\"]"
 6 syn21645616   "[\"CA182915\", \"CA182915\"]" "[\"CA182915\", \"CA215845\"]"
 7 syn21681504   "[\"CA182915\"]"               "[\"CA215845\"]"              
 8 syn21648966   "[\"CA182915\", \"CA182915\"]" "[\"CA182915\", \"CA215845\"]"
 9 syn21649141   "[\"CA182915\"]"               "[\"CA215845\"]"              
10 syn21648986   "[\"CA182915\", \"CA182915\"]" "[\"CA182915\", \"CA215845\"]"
# … with 15 more rows

$value_diffs$datasetId
# A tibble: 91 x 3
   publicationId prod                     test                    
   <chr>         <chr>                    <chr>                   
 1 syn21681375   syn21792857              syn21889704, syn21889789
 2 syn21648960   syn21792798, syn21792787 syn21889551, syn21889848
 3 syn21645405   syn21790756              syn21889698, syn21889787
 4 syn21649049   syn21792696              syn21889611, syn21889887
 5 syn21645325   syn21828913              syn21889710, syn21889794
 6 syn21648894   syn21792707              syn21889514, syn21889837
 7 syn21681723   syn21812634              syn21889633, syn21889905
 8 syn21649214   syn13857535              syn21889517, syn21889840
 9 syn21681392   syn21796559              syn21889756, syn21889823
10 syn21681382   syn21796537              syn21889515, syn21889838
# … with 81 more rows

bswhite commented 4 years ago

@andrewelamb, the tables above are truncated. Would you please send the whole tables?

jaeddy commented 4 years ago

@andrewelamb @bswhite This is most likely an error that was introduced when I converted the respective columns to STRING_LIST, so probably something I'll need to fix. I can grab the data from the table version prior to my update and try to fix from there.
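
For illustration, here is a minimal Python sketch of the kind of conversion described above. It assumes the pre-conversion cells were comma-separated strings (the form the datasetId values take in the diff) and that a STRING_LIST cell is serialized as a JSON array string (the form the grant values take). The helper name `to_string_list` is hypothetical, and this is not the code that was actually used for the update.

```python
import json


def to_string_list(cell):
    """Turn a comma-separated cell such as 'a, b' into a JSON list string.

    Hypothetical helper: empty or missing cells become '[]'.
    Order and duplicates are preserved so that nothing is silently hidden.
    """
    if cell is None or not cell.strip():
        return "[]"
    items = [part.strip() for part in cell.split(",") if part.strip()]
    return json.dumps(items)


print(to_string_list("syn21792798, syn21792787"))
```

Duplicates are deliberately left in place here. A value such as `["CA182915", "CA182915"]` may stand for a wrong entry rather than a harmless repeat, so dropping the repeat during conversion could mask the problem.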

andrewelamb commented 4 years ago

  publicationId                     prod                     test
1    syn21649049             ["CA182915"]             ["CA215845"]
2    syn21645615 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
3    syn21648914 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
4    syn21649120             ["CA182915"]             ["CA215845"]
5    syn21645614 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
6    syn21645616 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
7    syn21681504             ["CA182915"]             ["CA215845"]
8    syn21648966 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
9    syn21649141             ["CA182915"]             ["CA215845"]
10   syn21648986 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
11   syn21649134 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
12   syn21649118             ["CA182915"]             ["CA215845"]
13   syn21681822 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
14   syn21681437 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
15   syn21681624 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
16   syn21682010             ["CA182915"]             ["CA215845"]
17   syn21649024             ["CA182915"]             ["CA215845"]
18   syn21681551             ["CA182915"]             ["CA215845"]
19   syn21681626 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
20   syn21645612 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
21   syn21681574             ["CA182915"]             ["CA215845"]
22   syn21681963             ["CA182915"]             ["CA215845"]
23   syn21645617 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
24   syn21645613 ["CA182915", "CA182915"] ["CA182915", "CA215845"]
25   syn21681559             ["CA182915"]             ["CA215845"]

   publicationId                                  prod                                  test
1    syn21681375                           syn21792857              syn21889704, syn21889789
2    syn21648960              syn21792798, syn21792787              syn21889551, syn21889848
3    syn21645405                           syn21790756              syn21889698, syn21889787
4    syn21649049                           syn21792696              syn21889611, syn21889887
5    syn21645325                           syn21828913              syn21889710, syn21889794
6    syn21648894                           syn21792707              syn21889514, syn21889837
7    syn21681723                           syn21812634              syn21889633, syn21889905
8    syn21649214                           syn13857535              syn21889517, syn21889840
9    syn21681392                           syn21796559              syn21889756, syn21889823
10   syn21681382                           syn21796537              syn21889515, syn21889838
11   syn21645430                           syn21790688              syn21889695, syn21889782
12   syn21648903                           syn21790882              syn21889709, syn21889793
13   syn21681737                           syn21812642              syn21889569, syn21889864
14   syn21681447                           syn21809588              syn21889519, syn21889841
15   syn21681451                           syn21809597              syn21889607, syn21889884
16   syn21681618                           syn21812565                           syn21889640
17   syn21681667                           syn21812579              syn21889605, syn21889881
18   syn21649212                           syn12976746              syn21889694, syn21889785
19   syn21681441                           syn21809465                           syn21889589
20   syn21648893                           syn21791465                           syn21889532
21   syn21645594              syn21790743, syn21889770 syn21889699, syn21889788, syn21889770
22   syn21681851                           syn21813759              syn21889622, syn21889900
23   syn21645338 syn21832111, syn21832136, syn21889771 syn21889728, syn21889803, syn21889771
24   syn21645269                           syn21791499              syn21889753, syn21889821
25   syn21681890                           syn21813804              syn21889660, syn21889917
26   syn21645337                           syn21791225              syn21889693, syn21889784
27   syn21648926                           syn21792538              syn21889718, syn21889797
28   syn21681814                           syn21813525              syn21889637, syn21889907
29   syn21681435                           syn21809435              syn21889705, syn21889790
30   syn21645258                           syn12576666              syn21889724, syn21889801
31   syn21681436                           syn21809444              syn21889565, syn21889859
32   syn21649154                           syn21797922                           syn21889580
33   syn21649001                           syn18425364              syn21889629, syn21889791
34   syn21649073                           syn21796402              syn21889765, syn21889833
35   syn21645383                           syn13858908              syn21889762, syn21889824
36   syn21681987                           syn21814183              syn21889653, syn21889914
37   syn21681523                           syn21791536              syn21889734, syn21889811
38   syn21681403                           syn21797924              syn21889545, syn21889849
39   syn21648884                           syn21791468              syn21889533, syn21889847
40   syn21648934                           syn21796552              syn21889513, syn21889836
41   syn21645422                           syn21800136              syn21889752, syn21889819
42   syn21648978                           syn21797950              syn21889731, syn21889805
43   syn21681315                           syn21790254              syn21889682, syn21889781
44   syn21649209                           syn12976694              syn21889722, syn21889800
45   syn21649081                           syn21797729              syn21889566, syn21889860
46   syn21681604                           syn21812500              syn21889603, syn21889879
47   syn21645389                           syn21790831              syn21889708, syn21889792
48   syn21649021                           syn21811019              syn21889606, syn21889883
49   syn21649207                           syn12976504              syn21889735, syn21889812
50   syn21681517                           syn21811363              syn21889575, syn21889867
51   syn21681401                           syn21797856              syn21889549, syn21889851
52   syn21681899                           syn21814010              syn21889619, syn21889897
53   syn21681801                           syn21812998              syn21889620, syn21889895
54   syn21648867                           syn12976490                           syn21889746
55   syn21645600                           syn12976704              syn21889690, syn21889783
56   syn21681783                           syn21812940              syn21889638, syn21889908
57   syn21645423                           syn12976729              syn21889730, syn21889804
58   syn21681530                           syn21811649              syn21889614, syn21889889
59   syn21649211                           syn12976725                           syn21889744
60   syn21645322                           syn21791111              syn21889747, syn21889816
61   syn21645592                           syn21800130              syn21889697, syn21889786
62   syn21681547                           syn21812436              syn21889628, syn21889902
63   syn21649183                           syn21809528              syn21889604, syn21889880
64   syn21645570                           syn21790617                           syn21889679
65   syn21649039                           syn21809454              syn21889590, syn21889856
66   syn21648897                           syn21791720              syn21889764, syn21889829
67   syn21681394                           syn21797667              syn21889567, syn21889861
68   syn21645339                           syn21790813              syn21889725, syn21889802
69   syn21681501                           syn21811287              syn21889581, syn21889870
70   syn21645574                           syn21789818              syn21889678, syn21889778
71   syn21645266                            syn9630031                           syn21889740
72   syn21681739                           syn21812727              syn21889627, syn21889901
73   syn21645255                           syn21791317              syn21889767, syn21889835
74   syn21681865                           syn21813770              syn21889617, syn21889890
75   syn21681999                           syn21814484              syn21889644, syn21889912
76   syn21645558                           syn12685522              syn21889738, syn21889814
77   syn21648977                           syn11342978              syn21889520, syn21889842
78   syn21681529                           syn21811595              syn21889613, syn21889888
79   syn21681993                           syn21814211              syn21889574, syn21889865
80   syn21681772                           syn21812746              syn21889562, syn21889858
81   syn21681378                           syn21796439                           syn21889578
82   syn21681442                           syn21809478              syn21889576, syn21889868
83   syn21649210                           syn12976723              syn21889755, syn21889822
84   syn21681326                           syn21791182              syn21889766, syn21889834
85   syn21681412                           syn21797976              syn21889579, syn21889869
86   syn21681486                           syn21811223              syn21889683, syn21889777
87   syn21649111                           syn21790751              syn21889711, syn21889796
88   syn21645319                           syn21790851              syn21889733, syn21889810
89   syn21648885                           syn21792744              syn21889524, syn21889844
90   syn21649000                           syn21791250                           syn21889719
91   syn21649213                           syn12976748                           syn21889712

bswhite commented 4 years ago

OK @jaeddy ... I notice that all of the publicationIds in the first table include the grant "CA215845" either alone or in combination with "CA182915".

CA215845 is From Mechanism to Population: Modeling HPV-related Oropharyngeal Carcinogenesis and should have been dropped.

CA182915 is itself problematic, as there were two grants associated with it -- one we agreed to drop.
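
A quick way to screen the grant column for both problems (repeated grant numbers and a grant that should have been dropped) might look like the Python sketch below. It assumes each cell is a JSON array string, as in the table above. The names `DROPPED_GRANTS` and `audit_grant_cell` are made up for this example.

```python
import json

# Assumption: grants that should no longer appear, per the comment above.
DROPPED_GRANTS = {"CA215845"}


def audit_grant_cell(cell):
    """Report repeated grants and dropped grants found in one JSON list cell."""
    grants = json.loads(cell)
    duplicates = sorted({g for g in grants if grants.count(g) > 1})
    dropped = sorted(set(grants) & DROPPED_GRANTS)
    return {"duplicates": duplicates, "dropped": dropped}


print(audit_grant_cell('["CA182915", "CA182915"]'))
print(audit_grant_cell('["CA182915", "CA215845"]'))
```

Running this over every row of the publications table would list the publicationIds that still need attention, without changing any values.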

jaeddy commented 4 years ago

Thanks @bswhite — that matches what I'm seeing in the table. Not sure what happened (or why it only affected those 2 grants), but should be straightforward enough to fix.

I believe that the dataset IDs are all the result of manual curation at this point? At least, the prod versions seem to be in the merged publications table dating at least as far back as 5/14. @vpchung / @jaybee84 were these values part of your recent sweep?

jaybee84 commented 4 years ago

The dataset under the CA215845 grant was annotated by me, but it seems the publications were already annotated for this dataset before I started working on it. Not sure whether that is useful, but I wanted to mention it in case it helps track down the timeline of the problem.

vpchung commented 4 years ago

Sorry for coming to the convo late!

I don't recognize either of the grant numbers on my end, but I can look in more detail tomorrow morning.

andrewelamb commented 4 years ago

@jaeddy @bswhite

It looks like James made some changes to the merged tables last night and the grant number is fixed. The dataset ids are still different:

publication_differences$value_diffs$datasetId %>% data.frame()
   publicationId                                  prod                                  test
1    syn21681375                           syn21792857              syn21889704, syn21889789
2    syn21648960              syn21792798, syn21792787              syn21889551, syn21889848
3    syn21645405                           syn21790756              syn21889698, syn21889787
4    syn21649049                           syn21792696              syn21889611, syn21889887
5    syn21645325                           syn21828913              syn21889710, syn21889794
6    syn21648894                           syn21792707              syn21889514, syn21889837
7    syn21681723                           syn21812634              syn21889633, syn21889905
8    syn21649214                           syn13857535              syn21889517, syn21889840
9    syn21681392                           syn21796559              syn21889756, syn21889823
10   syn21681382                           syn21796537              syn21889515, syn21889838
11   syn21645430                           syn21790688              syn21889695, syn21889782
12   syn21648903                           syn21790882              syn21889709, syn21889793
13   syn21681737                           syn21812642              syn21889569, syn21889864
14   syn21681447                           syn21809588              syn21889519, syn21889841
15   syn21681451                           syn21809597              syn21889607, syn21889884
16   syn21681618                           syn21812565                           syn21889640
17   syn21681667                           syn21812579              syn21889605, syn21889881
18   syn21649212                           syn12976746              syn21889694, syn21889785
19   syn21681441                           syn21809465                           syn21889589
20   syn21648893                           syn21791465                           syn21889532
21   syn21645594              syn21790743, syn21889770 syn21889699, syn21889788, syn21889770
22   syn21681851                           syn21813759              syn21889622, syn21889900
23   syn21645338 syn21832111, syn21832136, syn21889771 syn21889728, syn21889803, syn21889771
24   syn21645269                           syn21791499              syn21889753, syn21889821
25   syn21681890                           syn21813804              syn21889660, syn21889917
26   syn21645337                           syn21791225              syn21889693, syn21889784
27   syn21648926                           syn21792538              syn21889718, syn21889797
28   syn21681814                           syn21813525              syn21889637, syn21889907
29   syn21681435                           syn21809435              syn21889705, syn21889790
30   syn21645258                           syn12576666              syn21889724, syn21889801
31   syn21681436                           syn21809444              syn21889565, syn21889859
32   syn21649154                           syn21797922                           syn21889580
33   syn21649001                           syn18425364              syn21889629, syn21889791
34   syn21649073                           syn21796402              syn21889765, syn21889833
35   syn21645383                           syn13858908              syn21889762, syn21889824
36   syn21681987                           syn21814183              syn21889653, syn21889914
37   syn21681523                           syn21791536              syn21889734, syn21889811
38   syn21681403                           syn21797924              syn21889545, syn21889849
39   syn21648884                           syn21791468              syn21889533, syn21889847
40   syn21648934                           syn21796552              syn21889513, syn21889836
41   syn21645422                           syn21800136              syn21889752, syn21889819
42   syn21648978                           syn21797950              syn21889731, syn21889805
43   syn21681315                           syn21790254              syn21889682, syn21889781
44   syn21649209                           syn12976694              syn21889722, syn21889800
45   syn21649081                           syn21797729              syn21889566, syn21889860
46   syn21681604                           syn21812500              syn21889603, syn21889879
47   syn21645389                           syn21790831              syn21889708, syn21889792
48   syn21649021                           syn21811019              syn21889606, syn21889883
49   syn21649207                           syn12976504              syn21889735, syn21889812
50   syn21681517                           syn21811363              syn21889575, syn21889867
51   syn21681401                           syn21797856              syn21889549, syn21889851
52   syn21681899                           syn21814010              syn21889619, syn21889897
53   syn21681801                           syn21812998              syn21889620, syn21889895
54   syn21648867                           syn12976490                           syn21889746
55   syn21645600                           syn12976704              syn21889690, syn21889783
56   syn21681783                           syn21812940              syn21889638, syn21889908
57   syn21645423                           syn12976729              syn21889730, syn21889804
58   syn21681530                           syn21811649              syn21889614, syn21889889
59   syn21649211                           syn12976725                           syn21889744
60   syn21645322                           syn21791111              syn21889747, syn21889816
61   syn21645592                           syn21800130              syn21889697, syn21889786
62   syn21681547                           syn21812436              syn21889628, syn21889902
63   syn21649183                           syn21809528              syn21889604, syn21889880
64   syn21645570                           syn21790617                           syn21889679
65   syn21649039                           syn21809454              syn21889590, syn21889856
66   syn21648897                           syn21791720              syn21889764, syn21889829
67   syn21681394                           syn21797667              syn21889567, syn21889861
68   syn21645339                           syn21790813              syn21889725, syn21889802
69   syn21681501                           syn21811287              syn21889581, syn21889870
70   syn21645574                           syn21789818              syn21889678, syn21889778
71   syn21645266                            syn9630031                           syn21889740
72   syn21681739                           syn21812727              syn21889627, syn21889901
73   syn21645255                           syn21791317              syn21889767, syn21889835
74   syn21681865                           syn21813770              syn21889617, syn21889890
75   syn21681999                           syn21814484              syn21889644, syn21889912
76   syn21645558                           syn12685522              syn21889738, syn21889814
77   syn21648977                           syn11342978              syn21889520, syn21889842
78   syn21681529                           syn21811595              syn21889613, syn21889888
79   syn21681993                           syn21814211              syn21889574, syn21889865
80   syn21681772                           syn21812746              syn21889562, syn21889858
81   syn21681378                           syn21796439                           syn21889578
82   syn21681442                           syn21809478              syn21889576, syn21889868
83   syn21649210                           syn12976723              syn21889755, syn21889822
84   syn21681326                           syn21791182              syn21889766, syn21889834
85   syn21681412                           syn21797976              syn21889579, syn21889869
86   syn21681486                           syn21811223              syn21889683, syn21889777
87   syn21649111                           syn21790751              syn21889711, syn21889796
88   syn21645319                           syn21790851              syn21889733, syn21889810
89   syn21648885                           syn21792744              syn21889524, syn21889844
90   syn21649000                           syn21791250                           syn21889719
91   syn21649213                           syn12976748                           syn21889712

andrewelamb commented 4 years ago

@bswhite @jaeddy

After fixing the grant issue and deleting/fixing the publications, I'm now seeing more discrepancies. These test values are correct after taking into account changes made to the merged grants table.

> publication_differences$value_diffs$themeId %>% data.frame()
   publicationId                                                            prod                                                            test
1    syn21645615                                                            <NA>                                        syn21630076, syn21630075
2    syn21648914                                                            <NA>                                        syn21630076, syn21630075
3    syn21645614                                                            <NA>                                        syn21630076, syn21630075
4    syn21645616                                                            <NA>                                        syn21630076, syn21630075
5    syn21648966                                                            <NA>                                        syn21630076, syn21630075
6    syn21645366 syn21630081, syn21630075, syn21630076, syn21630078, syn21630077 syn21630075, syn21630081, syn21630076, syn21630078, syn21630077
7    syn21648986                                                            <NA>                                        syn21630076, syn21630075
8    syn21645584                                        syn21630081, syn21630075                                        syn21630075, syn21630081
9    syn21649001              syn21630081, syn21630075, syn21630076, syn21630078              syn21630075, syn21630081, syn21630076, syn21630078
10   syn21649134                                                            <NA>                                        syn21630076, syn21630075
11   syn21681822                                                            <NA>                                        syn21630076, syn21630075
12   syn21681437                                                            <NA>                                        syn21630076, syn21630075
13   syn21681624                                                            <NA>                                        syn21630076, syn21630075
14   syn21648921                                        syn21630081, syn21630075                                        syn21630075, syn21630081
15   syn21681626                                                            <NA>                                        syn21630076, syn21630075
16   syn21645612                                                            <NA>                                        syn21630076, syn21630075
17   syn21645585                                        syn21630081, syn21630075                                        syn21630075, syn21630081
18   syn21645586                                        syn21630081, syn21630075                                        syn21630075, syn21630081
19   syn21645617                                                            <NA>                                        syn21630076, syn21630075
20   syn21645613                                                            <NA>                                        syn21630076, syn21630075
21   syn21645583                                        syn21630081, syn21630075                                        syn21630075, syn21630081

> publication_differences$value_diffs$theme %>% data.frame()
   publicationId                                                                                              prod
1    syn21645615                                                                                                []
2    syn21648914                                                                                                []
3    syn21645614                                                                                                []
4    syn21645616                                                                                                []
5    syn21648966                                                                                                []
6    syn21645366 ["Evolution", "Heterogeneity", "Drug Resistance/Sensitivity", "Microenvironment", "Tumor-Immune"]
7    syn21648986                                                                                                []
8    syn21645584                                                                    ["Evolution", "Heterogeneity"]
9    syn21649001                 ["Evolution", "Heterogeneity", "Drug Resistance/Sensitivity", "Microenvironment"]
10   syn21649134                                                                                                []
11   syn21681822                                                                                                []
12   syn21681437                                                                                                []
13   syn21681624                                                                                                []
14   syn21648921                                                                    ["Evolution", "Heterogeneity"]
15   syn21681626                                                                                                []
16   syn21645612                                                                                                []
17   syn21645585                                                                    ["Evolution", "Heterogeneity"]
18   syn21645586                                                                    ["Evolution", "Heterogeneity"]
19   syn21645617                                                                                                []
20   syn21645613                                                                                                []
21   syn21645583                                                                    ["Evolution", "Heterogeneity"]
                                                                                                test
1                                                   ["Drug Resistance/Sensitivity", "Heterogeneity"]
2                                                   ["Drug Resistance/Sensitivity", "Heterogeneity"]
3                                                   ["Drug Resistance/Sensitivity", "Heterogeneity"]
4                                                   ["Drug Resistance/Sensitivity", "Heterogeneity"]
5                                                   ["Drug Resistance/Sensitivity", "Heterogeneity"]
6  ["Heterogeneity", "Evolution", "Drug Resistance/Sensitivity", "Microenvironment", "Tumor-Immune"]
7                                                   ["Drug Resistance/Sensitivity", "Heterogeneity"]
8                                                                     ["Heterogeneity", "Evolution"]
9                  ["Heterogeneity", "Evolution", "Drug Resistance/Sensitivity", "Microenvironment"]
10                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
11                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
12                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
13                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
14                                                                    ["Heterogeneity", "Evolution"]
15                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
16                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
17                                                                    ["Heterogeneity", "Evolution"]
18                                                                    ["Heterogeneity", "Evolution"]
19                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
20                                                  ["Drug Resistance/Sensitivity", "Heterogeneity"]
21                                                                    ["Heterogeneity", "Evolution"]
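
Several rows in the themeId and theme tables above (6, 8, 9, 14, 17, 18 and 21) contain the same values in prod and test, only in a different order. If order is not meaningful for these columns (an assumption), an order-insensitive comparison would leave only the rows where prod is empty. A minimal Python sketch for the comma-separated themeId form, with hypothetical helper names `as_set` and `differs`:

```python
def as_set(cell):
    """Parse a comma-separated cell into a set; a missing cell becomes empty."""
    if cell is None:
        return frozenset()
    return frozenset(part.strip() for part in cell.split(",") if part.strip())


def differs(prod, test):
    """True when prod and test hold different values, ignoring order."""
    return as_set(prod) != as_set(test)


# Row 8 above: same two theme ids, swapped order.
print(differs("syn21630081, syn21630075", "syn21630075, syn21630081"))
# Row 1 above: prod is missing (<NA> in the R output), test has two ids.
print(differs(None, "syn21630076, syn21630075"))
```

Because the comparison goes through sets, it also ignores repeated values, so it should be paired with a separate duplicate check where repeats matter.
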

andrewelamb commented 4 years ago

@jaeddy @bswhite

The grant number CA215845 popped up again:

> publication_differences$value_diffs$grant
# A tibble: 14 x 3
   publicationId prod                           test            
   <chr>         <chr>                          <chr>           
 1 syn21645615   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 2 syn21648914   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 3 syn21645614   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 4 syn21645616   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 5 syn21648966   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 6 syn21648986   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 7 syn21649134   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 8 syn21681822   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
 9 syn21681437   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
10 syn21681624   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
11 syn21681626   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
12 syn21645612   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
13 syn21645617   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
14 syn21645613   "[\"CA182915\", \"CA215845\"]" "[\"CA215845\"]"
> publication_differences$value_diffs$grantName %>% as.data.frame()
   publicationId                                                                                                                                    prod
1    syn21645615 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
2    syn21648914 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
3    syn21645614 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
4    syn21645616 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
5    syn21648966 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
6    syn21648986 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
7    syn21649134 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
8    syn21681822 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
9    syn21681437 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
10   syn21681624 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
11   syn21681626 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
12   syn21645612 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
13   syn21645617 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
14   syn21645613 ["From Mechanism to Population - Modeling HPV-related Oropharyngeal Carcinogenesis", "Phenotype Transitions in Small Cell Lung Cancer"]
                                                  test
1  ["Phenotype Transitions in Small Cell Lung Cancer"]
2  ["Phenotype Transitions in Small Cell Lung Cancer"]
3  ["Phenotype Transitions in Small Cell Lung Cancer"]
4  ["Phenotype Transitions in Small Cell Lung Cancer"]
5  ["Phenotype Transitions in Small Cell Lung Cancer"]
6  ["Phenotype Transitions in Small Cell Lung Cancer"]
7  ["Phenotype Transitions in Small Cell Lung Cancer"]
8  ["Phenotype Transitions in Small Cell Lung Cancer"]
9  ["Phenotype Transitions in Small Cell Lung Cancer"]
10 ["Phenotype Transitions in Small Cell Lung Cancer"]
11 ["Phenotype Transitions in Small Cell Lung Cancer"]
12 ["Phenotype Transitions in Small Cell Lung Cancer"]
13 ["Phenotype Transitions in Small Cell Lung Cancer"]
14 ["Phenotype Transitions in Small Cell Lung Cancer"]

This seems to come from a change made in the merged grants table.

In the source grant table the grant number is CA182915. Is this the correct number?
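
For reference, a minimal sketch of how the grant diff above could be inspected outside of R, assuming the `prod` and `test` cells are JSON-encoded arrays as they appear in the printout. The `grant_diff` helper and the `rows` list are hypothetical and only illustrate the comparison; they are not part of the DCC tooling.

```python
import json


def grant_diff(prod: str, test: str) -> dict:
    """Compare two JSON-array strings of grant numbers (hypothetical helper)."""
    prod_set = set(json.loads(prod))
    test_set = set(json.loads(test))
    return {
        "only_in_prod": sorted(prod_set - test_set),
        "only_in_test": sorted(test_set - prod_set),
    }


# Two rows copied from the grant diff above: (publicationId, prod, test).
rows = [
    ("syn21645615", '["CA182915", "CA215845"]', '["CA215845"]'),
    ("syn21648914", '["CA182915", "CA215845"]', '["CA215845"]'),
]

diffs = {pub_id: grant_diff(prod, test) for pub_id, prod, test in rows}
for pub_id, diff in diffs.items():
    print(pub_id, diff)
```

For these rows the only disagreement is CA182915, which is present in prod but not in test.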

andrewelamb commented 4 years ago

Finally, there are some issues with some of the datasets:

$datasetId
# A tibble: 94 x 3
   publicationId prod                     test                    
   <chr>         <chr>                    <chr>                   
 1 syn21681375   syn21792857              syn21889704, syn21889789
 2 syn21648960   syn21792798, syn21792787 syn21889551, syn21889848
 3 syn21645405   syn21790756              syn21889698, syn21889787
 4 syn21649049   syn21792696              syn21889611, syn21889887
 5 syn21645325   syn21828913              syn21889710, syn21889794
 6 syn21648894   syn21792707              syn21889514, syn21889837
 7 syn21681723   syn21812634              syn21889633, syn21889905
 8 syn21649214   syn13857535              syn21889517, syn21889840
 9 syn21681392   syn21796559              syn21889756, syn21889823
10 syn21681382   syn21796537              syn21889515, syn21889838
# … with 84 more rows

$dataset
# A tibble: 135 x 3
   publicationId prod                     test 
   <chr>         <chr>                    <chr>
 1 syn21681375   PRJNA312905              NA   
 2 syn21648960   PRJNA429647, PRJNA429648 NA   
 3 syn21648963   GSE93765, SRP096964      NA   
 4 syn21645405   PRJNA312218              NA   
 5 syn21681836   GSE125609, SRP181911     NA   
 6 syn21649049   PRJNA503257              NA   
 7 syn21681409   GSE114438                NA   
 8 syn21645325   PRJNA325519              NA   
 9 syn21648894   PRJNA393881              NA   
10 syn21681723   PRJNA527110              NA   
# … with 125 more rows

I think this is more due to how we are dealing with datasets being in flux, so I'll come back to those discrepancies later.

bswhite commented 4 years ago

@andrewelamb, I think there are a few issues here. I'm just responding to CA182915 vs CA215845. Finally we have a resolution to why these two grants are always showing up together -- nice!

CA182915 is From Mechanism to Population: Modeling HPV-related Oropharyngeal Carcinogenesis. It should be dropped.

CA215845 is Phenotype Transitions in Small Cell Lung Cancer. It should be kept.

These two grants are being incorrectly mixed up in the source grants table, which incorrectly says that CA182915 is Phenotype Transitions in Small Cell Lung Cancer.

The merged grants table correctly links CA215845 to Phenotype Transitions in Small Cell Lung Cancer.

Can you fix that and see how that affects the other issue(s) above?
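
A rough sketch of the cross-check described above, assuming each grants table can be reduced to a mapping from grant number to grant name. The dict layout and the `find_renumbered_grants` helper are assumptions for illustration, not the actual table schema; the two entries are the ones discussed in this thread.

```python
def find_renumbered_grants(source: dict, merged: dict) -> list:
    """Return (name, source_number, merged_number) for grant names whose
    number differs between the two tables (hypothetical helper)."""
    merged_by_name = {name: number for number, name in merged.items()}
    mismatches = []
    for number, name in source.items():
        expected = merged_by_name.get(name)
        if expected is not None and expected != number:
            mismatches.append((name, number, expected))
    return mismatches


# Assumed shape: {grant number: grant name}.
source_grants = {"CA182915": "Phenotype Transitions in Small Cell Lung Cancer"}
merged_grants = {"CA215845": "Phenotype Transitions in Small Cell Lung Cancer"}

mismatches = find_renumbered_grants(source_grants, merged_grants)
for name, source_number, merged_number in mismatches:
    print(f"{name}: source says {source_number}, merged says {merged_number}")
```

Matching on grant name only works if the names are spelled identically in both tables, so this is a first-pass check rather than a full reconciliation.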

andrewelamb commented 4 years ago

@bswhite it looks like it was indeed just the publications associated with the mixed-up grant numbers that were the issue. Deleting them fixed all the issues except the dataset ones above. I'm keeping this open as a reminder to fix those once we have datasets sorted out.

andrewelamb commented 4 years ago

I think the issue stems from this issue

vpchung commented 2 years ago

Closing this ticket for now, as I don't think it's related anymore.

Will re-open if necessary.