Open jaclyn-taroni opened 6 years ago
I think looking at the protocol metadata is likely to be a good way to proceed. I suggest we might randomly select tens of experiments from the GEO platform GPL4133 & take a look at the metadata.
By looking at the protocols we can see the presence of both 'Cy3' and 'Cy5' strings: https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-4636/protocols/
Although it is not present in the API metadata: https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-MTAB-4636/
A better example (previous one wasn't Agilent): https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-77820
Still holds for Cy3/5 criteria.
Protocol has an API endpoint so we don't need to scrape: https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-MTAB-4636/protocols
Verdict is that we should snarf basically the entire /protocols API response because it might be scientifically useful. So, related: #96
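A minimal sketch of how the text could be pulled out of a decoded /protocols response without depending on its exact layout. The payload below is hypothetical and only for illustration (the real response structure is not shown in this thread); `collect_strings` is an illustrative helper name, not part of any existing codebase.

```python
import json


def collect_strings(node):
    """Recursively gather every string value from decoded JSON."""
    if isinstance(node, str):
        return [node]
    if isinstance(node, dict):
        found = []
        for value in node.values():
            found.extend(collect_strings(value))
        return found
    if isinstance(node, list):
        found = []
        for item in node:
            found.extend(collect_strings(item))
        return found
    return []  # numbers, booleans and None carry no protocol text


# Hypothetical payload: the real layout of the response is an assumption.
sample = json.loads(
    '{"protocols": {"protocol": [{"accession": "P-EXAMPLE-1", '
    '"text": "labeled with Cy3 and Cy5"}]}}'
)
protocol_text = " ".join(collect_strings(sample))
```

Walking the whole structure keeps everything in the response, which fits the idea of storing it in full and searching it later.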
To clarify: if we detect that a sample is Agilent 1-color, we should just grab the user-submitted processed data; if it is 2-color, we should grab the raw data and process it with SCAN.UPC.
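The rule above could be written down roughly like this. It is only a sketch: `choose_processing` and the returned labels are made-up names for illustration, and how the channel count gets detected is left open here.

```python
def choose_processing(channel_count):
    """Map a detected Agilent channel count to the handling described above."""
    if channel_count == 1:
        # 1-color: take the processed data the submitter uploaded.
        return "submitter_processed"
    if channel_count == 2:
        # 2-color: take the raw data and run it through SCAN.UPC.
        return "raw_then_scan_upc"
    raise ValueError(f"unexpected channel count: {channel_count!r}")
```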
Samples (14078) Series (756)
Here are all of the GSEs for GPL4133:
!Platform_series_id = GSE7701
!Platform_series_id = GSE7702
!Platform_series_id = GSE7900
!Platform_series_id = GSE7902
!Platform_series_id = GSE8353
!Platform_series_id = GSE8993
!Platform_series_id = GSE9067
!Platform_series_id = GSE9077
!Platform_series_id = GSE9187
!Platform_series_id = GSE9561
!Platform_series_id = GSE9869
!Platform_series_id = GSE10057
!Platform_series_id = GSE10107
!Platform_series_id = GSE10164
!Platform_series_id = GSE10455
!Platform_series_id = GSE10541
!Platform_series_id = GSE10570
!Platform_series_id = GSE10613
!Platform_series_id = GSE10667
!Platform_series_id = GSE10863
!Platform_series_id = GSE10864
!Platform_series_id = GSE10868
!Platform_series_id = GSE10956
!Platform_series_id = GSE10959
!Platform_series_id = GSE11132
!Platform_series_id = GSE11173
!Platform_series_id = GSE11205
!Platform_series_id = GSE11233
!Platform_series_id = GSE11242
!Platform_series_id = GSE11682
!Platform_series_id = GSE11946
!Platform_series_id = GSE11968
!Platform_series_id = GSE11985
!Platform_series_id = GSE12075
!Platform_series_id = GSE12114
!Platform_series_id = GSE12307
!Platform_series_id = GSE12384
!Platform_series_id = GSE12385
!Platform_series_id = GSE12405
!Platform_series_id = GSE12553
!Platform_series_id = GSE12928
!Platform_series_id = GSE13216
!Platform_series_id = GSE13286
!Platform_series_id = GSE13334
!Platform_series_id = GSE13365
!Platform_series_id = GSE13407
!Platform_series_id = GSE13470
!Platform_series_id = GSE13566
!Platform_series_id = GSE13834
!Platform_series_id = GSE13886
!Platform_series_id = GSE13919
!Platform_series_id = GSE14028
!Platform_series_id = GSE14048
!Platform_series_id = GSE14097
!Platform_series_id = GSE14261
!Platform_series_id = GSE14312
!Platform_series_id = GSE14409
!Platform_series_id = GSE14476
!Platform_series_id = GSE14490
!Platform_series_id = GSE14560
!Platform_series_id = GSE14617
!Platform_series_id = GSE14681
!Platform_series_id = GSE14839
!Platform_series_id = GSE14853
!Platform_series_id = GSE14910
!Platform_series_id = GSE14972
!Platform_series_id = GSE14982
!Platform_series_id = GSE15075
!Platform_series_id = GSE15076
!Platform_series_id = GSE15109
!Platform_series_id = GSE15112
!Platform_series_id = GSE15212
!Platform_series_id = GSE15359
!Platform_series_id = GSE15549
!Platform_series_id = GSE15576
!Platform_series_id = GSE15812
!Platform_series_id = GSE15948
!Platform_series_id = GSE16026
!Platform_series_id = GSE16053
!Platform_series_id = GSE16065
!Platform_series_id = GSE16113
!Platform_series_id = GSE16123
!Platform_series_id = GSE16358
!Platform_series_id = GSE16532
!Platform_series_id = GSE16641
!Platform_series_id = GSE16727
!Platform_series_id = GSE16872
!Platform_series_id = GSE16945
!Platform_series_id = GSE16957
!Platform_series_id = GSE17018
!Platform_series_id = GSE17311
!Platform_series_id = GSE17403
!Platform_series_id = GSE17594
!Platform_series_id = GSE17623
!Platform_series_id = GSE17630
!Platform_series_id = GSE17632
!Platform_series_id = GSE17753
!Platform_series_id = GSE17766
!Platform_series_id = GSE17839
!Platform_series_id = GSE17842
!Platform_series_id = GSE17843
!Platform_series_id = GSE17860
!Platform_series_id = GSE17924
!Platform_series_id = GSE17992
!Platform_series_id = GSE18102
!Platform_series_id = GSE18109
!Platform_series_id = GSE18138
!Platform_series_id = GSE18316
!Platform_series_id = GSE18390
!Platform_series_id = GSE18438
!Platform_series_id = GSE18439
!Platform_series_id = GSE18457
!Platform_series_id = GSE18612
!Platform_series_id = GSE18689
!Platform_series_id = GSE18693
!Platform_series_id = GSE18817
!Platform_series_id = GSE18844
!Platform_series_id = GSE18849
!Platform_series_id = GSE18874
!Platform_series_id = GSE18875
!Platform_series_id = GSE18966
!Platform_series_id = GSE18971
!Platform_series_id = GSE19324
!Platform_series_id = GSE19362
!Platform_series_id = GSE19494
!Platform_series_id = GSE19541
!Platform_series_id = GSE19712
!Platform_series_id = GSE19716
!Platform_series_id = GSE19717
!Platform_series_id = GSE19718
!Platform_series_id = GSE19853
!Platform_series_id = GSE19939
!Platform_series_id = GSE19992
!Platform_series_id = GSE20028
!Platform_series_id = GSE20127
!Platform_series_id = GSE20147
!Platform_series_id = GSE20171
!Platform_series_id = GSE20298
!Platform_series_id = GSE20506
!Platform_series_id = GSE20680
!Platform_series_id = GSE20681
!Platform_series_id = GSE20686
!Platform_series_id = GSE20690
!Platform_series_id = GSE20750
!Platform_series_id = GSE20842
!Platform_series_id = GSE20906
!Platform_series_id = GSE20936
!Platform_series_id = GSE20937
!Platform_series_id = GSE20941
!Platform_series_id = GSE20945
!Platform_series_id = GSE20988
!Platform_series_id = GSE20993
!Platform_series_id = GSE21201
!Platform_series_id = GSE21202
!Platform_series_id = GSE21209
!Platform_series_id = GSE21280
!Platform_series_id = GSE21284
!Platform_series_id = GSE21328
!Platform_series_id = GSE21367
!Platform_series_id = GSE21501
!Platform_series_id = GSE21565
!Platform_series_id = GSE21586
!Platform_series_id = GSE21792
!Platform_series_id = GSE21886
!Platform_series_id = GSE21959
!Platform_series_id = GSE22030
!Platform_series_id = GSE22032
!Platform_series_id = GSE22085
!Platform_series_id = GSE22226
!Platform_series_id = GSE22265
!Platform_series_id = GSE22323
!Platform_series_id = GSE22384
!Platform_series_id = GSE22430
!Platform_series_id = GSE22586
!Platform_series_id = GSE22775
!Platform_series_id = GSE22778
!Platform_series_id = GSE22866
!Platform_series_id = GSE22891
!Platform_series_id = GSE22900
!Platform_series_id = GSE22901
!Platform_series_id = GSE23019
!Platform_series_id = GSE23074
!Platform_series_id = GSE23113
!Platform_series_id = GSE23131
!Platform_series_id = GSE23169
!Platform_series_id = GSE23171
!Platform_series_id = GSE23209
!Platform_series_id = GSE23363
!Platform_series_id = GSE23536
!Platform_series_id = GSE23669
!Platform_series_id = GSE23688
!Platform_series_id = GSE23689
!Platform_series_id = GSE23773
!Platform_series_id = GSE23803
!Platform_series_id = GSE23804
!Platform_series_id = GSE23807
!Platform_series_id = GSE23901
!Platform_series_id = GSE23903
!Platform_series_id = GSE23922
!Platform_series_id = GSE23989
!Platform_series_id = GSE24020
!Platform_series_id = GSE24100
!Platform_series_id = GSE24171
!Platform_series_id = GSE24231
!Platform_series_id = GSE24240
!Platform_series_id = GSE24268
!Platform_series_id = GSE24370
!Platform_series_id = GSE24432
!Platform_series_id = GSE24731
!Platform_series_id = GSE24732
!Platform_series_id = GSE24782
!Platform_series_id = GSE24876
!Platform_series_id = GSE24883
!Platform_series_id = GSE24908
!Platform_series_id = GSE24951
!Platform_series_id = GSE25167
!Platform_series_id = GSE25193
!Platform_series_id = GSE25200
!Platform_series_id = GSE25289
!Platform_series_id = GSE25346
!Platform_series_id = GSE25453
!Platform_series_id = GSE25623
!Platform_series_id = GSE25624
!Platform_series_id = GSE25844
!Platform_series_id = GSE25935
!Platform_series_id = GSE25936
!Platform_series_id = GSE26088
!Platform_series_id = GSE26089
!Platform_series_id = GSE26106
!Platform_series_id = GSE26129
!Platform_series_id = GSE26259
!Platform_series_id = GSE26321
!Platform_series_id = GSE26322
!Platform_series_id = GSE26411
!Platform_series_id = GSE26692
!Platform_series_id = GSE26721
!Platform_series_id = GSE26812
!Platform_series_id = GSE26855
!Platform_series_id = GSE26856
!Platform_series_id = GSE26857
!Platform_series_id = GSE26979
!Platform_series_id = GSE26993
!Platform_series_id = GSE26996
!Platform_series_id = GSE27173
!Platform_series_id = GSE27183
!Platform_series_id = GSE27254
!Platform_series_id = GSE27335
!Platform_series_id = GSE27503
!Platform_series_id = GSE27616
!Platform_series_id = GSE27619
!Platform_series_id = GSE27842
!Platform_series_id = GSE27900
!Platform_series_id = GSE27915
!Platform_series_id = GSE28000
!Platform_series_id = GSE28038
!Platform_series_id = GSE28045
!Platform_series_id = GSE28073
!Platform_series_id = GSE28230
!Platform_series_id = GSE28253
!Platform_series_id = GSE28300
!Platform_series_id = GSE28400
!Platform_series_id = GSE28401
!Platform_series_id = GSE28456
!Platform_series_id = GSE28478
!Platform_series_id = GSE28501
!Platform_series_id = GSE28522
!Platform_series_id = GSE28615
!Platform_series_id = GSE28623
!Platform_series_id = GSE28628
!Platform_series_id = GSE28650
!Platform_series_id = GSE28658
!Platform_series_id = GSE28748
!Platform_series_id = GSE28813
!Platform_series_id = GSE28818
!Platform_series_id = GSE28877
!Platform_series_id = GSE28907
!Platform_series_id = GSE28912
!Platform_series_id = GSE29000
!Platform_series_id = GSE29090
!Platform_series_id = GSE29141
!Platform_series_id = GSE29270
!Platform_series_id = GSE29288
!Platform_series_id = GSE29405
!Platform_series_id = GSE29507
!Platform_series_id = GSE29606
!Platform_series_id = GSE29608
!Platform_series_id = GSE29746
!Platform_series_id = GSE29760
!Platform_series_id = GSE29801
!Platform_series_id = GSE29861
!Platform_series_id = GSE29869
!Platform_series_id = GSE29886
!Platform_series_id = GSE29917
!Platform_series_id = GSE30023
!Platform_series_id = GSE30105
!Platform_series_id = GSE30107
!Platform_series_id = GSE30114
!Platform_series_id = GSE30131
!Platform_series_id = GSE30132
!Platform_series_id = GSE30171
!Platform_series_id = GSE30181
!Platform_series_id = GSE30432
!Platform_series_id = GSE30475
!Platform_series_id = GSE30592
!Platform_series_id = GSE30664
!Platform_series_id = GSE30904
!Platform_series_id = GSE30961
!Platform_series_id = GSE30994
!Platform_series_id = GSE31003
!Platform_series_id = GSE31093
!Platform_series_id = GSE31095
!Platform_series_id = GSE31147
!Platform_series_id = GSE31195
!Platform_series_id = GSE31277
!Platform_series_id = GSE31286
!Platform_series_id = GSE31322
!Platform_series_id = GSE31360
!Platform_series_id = GSE31425
!Platform_series_id = GSE31426
!Platform_series_id = GSE31427
!Platform_series_id = GSE31589
!Platform_series_id = GSE31728
!Platform_series_id = GSE31802
!Platform_series_id = GSE31904
!Platform_series_id = GSE31965
!Platform_series_id = GSE31981
!Platform_series_id = GSE32026
!Platform_series_id = GSE32143
!Platform_series_id = GSE32144
!Platform_series_id = GSE32150
!Platform_series_id = GSE32220
!Platform_series_id = GSE32221
!Platform_series_id = GSE32371
!Platform_series_id = GSE32388
!Platform_series_id = GSE32413
!Platform_series_id = GSE32441
!Platform_series_id = GSE32456
!Platform_series_id = GSE32645
!Platform_series_id = GSE32709
!Platform_series_id = GSE32915
!Platform_series_id = GSE33012
!Platform_series_id = GSE33093
!Platform_series_id = GSE33142
!Platform_series_id = GSE33224
!Platform_series_id = GSE33264
!Platform_series_id = GSE33267
!Platform_series_id = GSE33271
!Platform_series_id = GSE33272
!Platform_series_id = GSE33273
!Platform_series_id = GSE33277
!Platform_series_id = GSE33290
!Platform_series_id = GSE33526
!Platform_series_id = GSE33615
!Platform_series_id = GSE33673
!Platform_series_id = GSE33723
!Platform_series_id = GSE33731
!Platform_series_id = GSE33755
!Platform_series_id = GSE33812
!Platform_series_id = GSE33824
!Platform_series_id = GSE33910
!Platform_series_id = GSE34007
!Platform_series_id = GSE34077
!Platform_series_id = GSE34131
!Platform_series_id = GSE34153
!Platform_series_id = GSE34228
!Platform_series_id = GSE34252
!Platform_series_id = GSE34291
!Platform_series_id = GSE34303
!Platform_series_id = GSE34396
!Platform_series_id = GSE34429
!Platform_series_id = GSE34487
!Platform_series_id = GSE34499
!Platform_series_id = GSE34527
!Platform_series_id = GSE34792
!Platform_series_id = GSE34881
!Platform_series_id = GSE34940
!Platform_series_id = GSE35002
!Platform_series_id = GSE35133
!Platform_series_id = GSE35141
!Platform_series_id = GSE35142
!Platform_series_id = GSE35163
!Platform_series_id = GSE35168
!Platform_series_id = GSE35311
!Platform_series_id = GSE35313
!Platform_series_id = GSE35454
!Platform_series_id = GSE35477
!Platform_series_id = GSE35494
!Platform_series_id = GSE35500
!Platform_series_id = GSE35576
!Platform_series_id = GSE35733
!Platform_series_id = GSE35749
!Platform_series_id = GSE35753
!Platform_series_id = GSE35756
!Platform_series_id = GSE35757
!Platform_series_id = GSE35800
!Platform_series_id = GSE35814
!Platform_series_id = GSE35982
!Platform_series_id = GSE35994
!Platform_series_id = GSE36082
!Platform_series_id = GSE36207
!Platform_series_id = GSE36267
!Platform_series_id = GSE36549
!Platform_series_id = GSE36654
!Platform_series_id = GSE36758
!Platform_series_id = GSE36854
!Platform_series_id = GSE36931
!Platform_series_id = GSE37087
!Platform_series_id = GSE37110
!Platform_series_id = GSE37116
!Platform_series_id = GSE37117
!Platform_series_id = GSE37170
!Platform_series_id = GSE37257
!Platform_series_id = GSE37277
!Platform_series_id = GSE37326
!Platform_series_id = GSE37575
!Platform_series_id = GSE37738
!Platform_series_id = GSE37888
!Platform_series_id = GSE37957
!Platform_series_id = GSE38227
!Platform_series_id = GSE38241
!Platform_series_id = GSE38242
!Platform_series_id = GSE38330
!Platform_series_id = GSE38544
!Platform_series_id = GSE38581
!Platform_series_id = GSE38959
!Platform_series_id = GSE38974
!Platform_series_id = GSE39199
!Platform_series_id = GSE39200
!Platform_series_id = GSE39202
!Platform_series_id = GSE39400
!Platform_series_id = GSE39477
!Platform_series_id = GSE39493
!Platform_series_id = GSE39745
!Platform_series_id = GSE39764
!Platform_series_id = GSE39768
!Platform_series_id = GSE39847
!Platform_series_id = GSE40047
!Platform_series_id = GSE40185
!Platform_series_id = GSE40206
!Platform_series_id = GSE40315
!Platform_series_id = GSE40383
!Platform_series_id = GSE40384
!Platform_series_id = GSE40385
!Platform_series_id = GSE40386
!Platform_series_id = GSE40682
!Platform_series_id = GSE40808
!Platform_series_id = GSE41034
!Platform_series_id = GSE41110
!Platform_series_id = GSE41255
!Platform_series_id = GSE41436
!Platform_series_id = GSE41483
!Platform_series_id = GSE41502
!Platform_series_id = GSE41617
!Platform_series_id = GSE41651
!Platform_series_id = GSE41653
!Platform_series_id = GSE41744
!Platform_series_id = GSE41752
!Platform_series_id = GSE41781
!Platform_series_id = GSE42099
!Platform_series_id = GSE42256
!Platform_series_id = GSE42357
!Platform_series_id = GSE42401
!Platform_series_id = GSE42402
!Platform_series_id = GSE42520
!Platform_series_id = GSE42619
!Platform_series_id = GSE42643
!Platform_series_id = GSE42667
!Platform_series_id = GSE42668
!Platform_series_id = GSE42879
!Platform_series_id = GSE43049
!Platform_series_id = GSE43219
!Platform_series_id = GSE43467
!Platform_series_id = GSE43611
!Platform_series_id = GSE43674
!Platform_series_id = GSE43962
!Platform_series_id = GSE43973
!Platform_series_id = GSE44066
!Platform_series_id = GSE44133
!Platform_series_id = GSE44135
!Platform_series_id = GSE44290
!Platform_series_id = GSE44426
!Platform_series_id = GSE44729
!Platform_series_id = GSE44941
!Platform_series_id = GSE44987
!Platform_series_id = GSE45158
!Platform_series_id = GSE45245
!Platform_series_id = GSE45251
!Platform_series_id = GSE45340
!Platform_series_id = GSE45357
!Platform_series_id = GSE45371
!Platform_series_id = GSE45403
!Platform_series_id = GSE45404
!Platform_series_id = GSE45422
!Platform_series_id = GSE45531
!Platform_series_id = GSE45596
!Platform_series_id = GSE45763
!Platform_series_id = GSE45960
!Platform_series_id = GSE46021
!Platform_series_id = GSE46314
!Platform_series_id = GSE46408
!Platform_series_id = GSE46471
!Platform_series_id = GSE46477
!Platform_series_id = GSE46670
!Platform_series_id = GSE46973
!Platform_series_id = GSE46974
!Platform_series_id = GSE47147
!Platform_series_id = GSE47435
!Platform_series_id = GSE47511
!Platform_series_id = GSE47513
!Platform_series_id = GSE47830
!Platform_series_id = GSE48080
!Platform_series_id = GSE48132
!Platform_series_id = GSE48133
!Platform_series_id = GSE48211
!Platform_series_id = GSE48265
!Platform_series_id = GSE48384
!Platform_series_id = GSE48399
!Platform_series_id = GSE48838
!Platform_series_id = GSE48847
!Platform_series_id = GSE49175
!Platform_series_id = GSE49288
!Platform_series_id = GSE49578
!Platform_series_id = GSE49594
!Platform_series_id = GSE49657
!Platform_series_id = GSE49900
!Platform_series_id = GSE49969
!Platform_series_id = GSE49974
!Platform_series_id = GSE50395
!Platform_series_id = GSE50494
!Platform_series_id = GSE50619
!Platform_series_id = GSE50784
!Platform_series_id = GSE50911
!Platform_series_id = GSE50939
!Platform_series_id = GSE50988
!Platform_series_id = GSE51029
!Platform_series_id = GSE51059
!Platform_series_id = GSE51060
!Platform_series_id = GSE51081
!Platform_series_id = GSE51086
!Platform_series_id = GSE51087
!Platform_series_id = GSE51433
!Platform_series_id = GSE51561
!Platform_series_id = GSE51617
!Platform_series_id = GSE51622
!Platform_series_id = GSE51624
!Platform_series_id = GSE51748
!Platform_series_id = GSE51999
!Platform_series_id = GSE52061
!Platform_series_id = GSE52100
!Platform_series_id = GSE52211
!Platform_series_id = GSE52212
!Platform_series_id = GSE52292
!Platform_series_id = GSE52602
!Platform_series_id = GSE53014
!Platform_series_id = GSE53104
!Platform_series_id = GSE53175
!Platform_series_id = GSE53180
!Platform_series_id = GSE53181
!Platform_series_id = GSE53236
!Platform_series_id = GSE53270
!Platform_series_id = GSE53791
!Platform_series_id = GSE53792
!Platform_series_id = GSE53872
!Platform_series_id = GSE54033
!Platform_series_id = GSE54083
!Platform_series_id = GSE54171
!Platform_series_id = GSE54258
!Platform_series_id = GSE54635
!Platform_series_id = GSE54712
!Platform_series_id = GSE54872
!Platform_series_id = GSE54898
!Platform_series_id = GSE54981
!Platform_series_id = GSE55015
!Platform_series_id = GSE55024
!Platform_series_id = GSE55063
!Platform_series_id = GSE55064
!Platform_series_id = GSE55065
!Platform_series_id = GSE55288
!Platform_series_id = GSE55563
!Platform_series_id = GSE55565
!Platform_series_id = GSE55668
!Platform_series_id = GSE55669
!Platform_series_id = GSE55723
!Platform_series_id = GSE55787
!Platform_series_id = GSE56103
!Platform_series_id = GSE56116
!Platform_series_id = GSE56363
!Platform_series_id = GSE56519
!Platform_series_id = GSE56573
!Platform_series_id = GSE56618
!Platform_series_id = GSE56946
!Platform_series_id = GSE57259
!Platform_series_id = GSE57273
!Platform_series_id = GSE57341
!Platform_series_id = GSE57343
!Platform_series_id = GSE57473
!Platform_series_id = GSE57474
!Platform_series_id = GSE57571
!Platform_series_id = GSE57756
!Platform_series_id = GSE57825
!Platform_series_id = GSE58118
!Platform_series_id = GSE58295
!Platform_series_id = GSE58324
!Platform_series_id = GSE58397
!Platform_series_id = GSE58473
!Platform_series_id = GSE58542
!Platform_series_id = GSE58574
!Platform_series_id = GSE58791
!Platform_series_id = GSE58903
!Platform_series_id = GSE58940
!Platform_series_id = GSE58975
!Platform_series_id = GSE59140
!Platform_series_id = GSE59414
!Platform_series_id = GSE59660
!Platform_series_id = GSE59697
!Platform_series_id = GSE59938
!Platform_series_id = GSE60079
!Platform_series_id = GSE60128
!Platform_series_id = GSE60525
!Platform_series_id = GSE60919
!Platform_series_id = GSE60956
!Platform_series_id = GSE61124
!Platform_series_id = GSE61196
!Platform_series_id = GSE61805
!Platform_series_id = GSE61956
!Platform_series_id = GSE62105
!Platform_series_id = GSE62117
!Platform_series_id = GSE62191
!Platform_series_id = GSE62192
!Platform_series_id = GSE62224
!Platform_series_id = GSE62524
!Platform_series_id = GSE62747
!Platform_series_id = GSE62849
!Platform_series_id = GSE63029
!Platform_series_id = GSE63289
!Platform_series_id = GSE63524
!Platform_series_id = GSE63667
!Platform_series_id = GSE63859
!Platform_series_id = GSE64012
!Platform_series_id = GSE64014
!Platform_series_id = GSE64161
!Platform_series_id = GSE64163
!Platform_series_id = GSE64224
!Platform_series_id = GSE64237
!Platform_series_id = GSE64424
!Platform_series_id = GSE64586
!Platform_series_id = GSE64657
!Platform_series_id = GSE65034
!Platform_series_id = GSE65286
!Platform_series_id = GSE65954
!Platform_series_id = GSE66314
!Platform_series_id = GSE66434
!Platform_series_id = GSE66626
!Platform_series_id = GSE66649
!Platform_series_id = GSE66770
!Platform_series_id = GSE66886
!Platform_series_id = GSE66887
!Platform_series_id = GSE66888
!Platform_series_id = GSE67536
!Platform_series_id = GSE67636
!Platform_series_id = GSE67638
!Platform_series_id = GSE67887
!Platform_series_id = GSE67899
!Platform_series_id = GSE68081
!Platform_series_id = GSE68089
!Platform_series_id = GSE68215
!Platform_series_id = GSE68497
!Platform_series_id = GSE68531
!Platform_series_id = GSE68532
!Platform_series_id = GSE68809
!Platform_series_id = GSE68852
!Platform_series_id = GSE69534
!Platform_series_id = GSE69712
!Platform_series_id = GSE69980
!Platform_series_id = GSE70403
!Platform_series_id = GSE70905
!Platform_series_id = GSE70951
!Platform_series_id = GSE71769
!Platform_series_id = GSE72035
!Platform_series_id = GSE72585
!Platform_series_id = GSE72916
!Platform_series_id = GSE73089
!Platform_series_id = GSE73521
!Platform_series_id = GSE73556
!Platform_series_id = GSE73577
!Platform_series_id = GSE73953
!Platform_series_id = GSE74634
!Platform_series_id = GSE74635
!Platform_series_id = GSE74711
!Platform_series_id = GSE74752
!Platform_series_id = GSE74786
!Platform_series_id = GSE74895
!Platform_series_id = GSE75650
!Platform_series_id = GSE75678
!Platform_series_id = GSE75685
!Platform_series_id = GSE75766
!Platform_series_id = GSE76392
!Platform_series_id = GSE76809
!Platform_series_id = GSE77752
!Platform_series_id = GSE78250
!Platform_series_id = GSE78714
!Platform_series_id = GSE79292
!Platform_series_id = GSE79330
!Platform_series_id = GSE79478
!Platform_series_id = GSE79482
!Platform_series_id = GSE79579
!Platform_series_id = GSE79627
!Platform_series_id = GSE79629
!Platform_series_id = GSE79689
!Platform_series_id = GSE81058
!Platform_series_id = GSE81371
!Platform_series_id = GSE81589
!Platform_series_id = GSE81665
!Platform_series_id = GSE82233
!Platform_series_id = GSE82278
!Platform_series_id = GSE83519
!Platform_series_id = GSE83878
!Platform_series_id = GSE83879
!Platform_series_id = GSE83880
!Platform_series_id = GSE83881
!Platform_series_id = GSE83883
!Platform_series_id = GSE85698
!Platform_series_id = GSE85907
!Platform_series_id = GSE86062
!Platform_series_id = GSE86099
!Platform_series_id = GSE86115
!Platform_series_id = GSE86265
!Platform_series_id = GSE86266
!Platform_series_id = GSE87000
!Platform_series_id = GSE87674
!Platform_series_id = GSE87778
!Platform_series_id = GSE87910
!Platform_series_id = GSE89287
!Platform_series_id = GSE89422
!Platform_series_id = GSE89915
!Platform_series_id = GSE90132
!Platform_series_id = GSE90605
!Platform_series_id = GSE92915
!Platform_series_id = GSE93899
!Platform_series_id = GSE93900
!Platform_series_id = GSE94610
!Platform_series_id = GSE95000
!Platform_series_id = GSE95084
!Platform_series_id = GSE96671
!Platform_series_id = GSE98021
!Platform_series_id = GSE98737
!Platform_series_id = GSE100533
!Platform_series_id = GSE102265
!Platform_series_id = GSE102267
!Platform_series_id = GSE102641
!Platform_series_id = GSE103236
!Platform_series_id = GSE106206
!Platform_series_id = GSE107200
!Platform_series_id = GSE109009
!Platform_series_id = GSE109848
!Platform_series_id = GSE110905
Ok so it looks like the next thing to be done for this is to go through that list of accessions, pull the protocol for each experiment from the above listed protocols endpoint, and determine if any are Agilent 1-color. I believe the heuristic for determining 1-color vs 2-color is that if "Cy5" appears anywhere in the protocol information then it is a 2-color experiment.
@Miserlou or @jaclyn-taroni can you confirm if all of the above is accurate?
Yep, there should also be some fields that reference "Channel 2" (often ending in _ch2 in GEO) that are non-empty in 2-color experiments.
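A rough sketch of the two checks discussed so far: the "Cy5 anywhere in the protocol text" heuristic, and the non-empty _ch2 field check. The function names and the example field names in the usage note are assumptions for illustration only.

```python
def mentions_cy5(protocol_text):
    """Heuristic from this thread: any mention of Cy5 suggests two-color."""
    return "cy5" in protocol_text.lower()


def has_channel_two_values(sample_fields):
    """True if any field whose name ends in _ch2 holds a non-empty value."""
    return any(
        key.endswith("_ch2") and str(value).strip()
        for key, value in sample_fields.items()
    )
```

For example, `has_channel_two_values({"label_ch1": "Cy3", "label_ch2": "Cy5"})` is true, while a dictionary whose only _ch2 entry is an empty string is treated as one-color by this check.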
Here are the statistics for the 756 experiments:
[7701, 7702, 9067, 9187, 10057, 10107, 10864, 10956, 10959, 11132, 11242, 11968, 12307, 13470, 13566, 14409, 14681, 14910, 14982, 15109, 15112, 16026, 16113, 16641, 16872, 16945, 17403, 17623, 17630, 17632, 17839, 17842, 17843, 17860, 17992, 18316, 18390, 18457, 18693, 19324, 19494, 19939, 19992, 20147, 20171, 20937, 20941, 20988, 20993, 21209, 21367, 21501, 21586, 22226, 22384, 22430, 22586, 22778, 22900, 22901, 23019, 23113, 23171, 23209, 23536, 23669, 23688, 23689, 23922, 24171, 24432, 24732, 24782, 24908, 24951, 25167, 25193, 25289, 25346, 25453, 26088, 26089, 26129, 26322, 26721, 26856, 26979, 27183, 27254, 27842, 27900, 28045, 28073, 28253, 28501, 28522, 28650, 28658, 28748, 28907, 29270, 29606, 29608, 29861, 29886, 29917, 30114, 30131, 30132, 30171, 30181, 30475, 30904, 30961, 31003, 31195, 31322, 31589, 31965, 32143, 32144, 32150, 32371, 32413, 33012, 33224, 33277, 33290, 33526, 33673, 33723, 33731, 33812, 33910, 34077, 34153, 34429, 34499, 34792, 34881, 34940, 35002, 35142, 35163, 35168, 35313, 35494, 35576, 35733, 35749, 35753, 35757, 35800, 35814, 35982, 35994, 36207, 36549, 37110, 37116, 37117, 37277, 37326, 37888, 37957, 38227, 38241, 38242, 39200, 39202, 39477, 39764, 40047, 40682, 41034, 41110, 41255, 41651, 41653, 41744, 41781, 42256, 42357, 42401, 42402, 42667, 42668, 42879, 43049, 43467, 43674, 43962, 43973, 44290, 44729, 45245, 45340, 45403, 45531, 45596, 46471, 46477, 46973, 46974, 47147, 47435, 48132, 48133, 49175, 49288, 49594, 49900, 50619, 50939, 50988, 51081, 51086, 51087, 51433, 51748, 52061, 52100, 52292, 52602, 53014, 53175, 53236, 53270, 54258, 54635, 54872, 54898, 55063, 55064, 55065, 55668, 55669, 55787, 56116, 56519, 56618, 57273, 57571, 57756, 57825, 58295, 58397, 58473, 58574, 58791, 59414, 59938, 60525, 60919, 61956, 62105, 62224, 63029, 63289, 63524, 63667, 63859, 64224, 64237, 65954, 66314, 66434, 66770, 68081, 68089, 68215, 68531, 69712, 70403, 71769, 73089, 74634, 74635, 74786, 75678, 75685, 75766, 76809, 79478, 79629]
[7900, 8993, 13407, 15075, 15076, 15212, 17753, 17924, 20298, 20680, 20681, 20686, 20750, 20842, 21201, 21202, 21328, 21565, 21792, 21886, 21959, 22030, 22032, 22323, 22866, 22891, 23074, 23131, 23773, 23803, 23804, 23807, 23901, 23903, 24100, 24231, 24240, 24268, 24731, 24876, 25200, 25623, 25624, 25844, 25935, 25936, 26259, 26321, 26855, 26857, 27173, 27335, 27503, 27915, 28038, 28230, 28456, 28478, 28615, 28623, 28628, 28813, 28818, 28877, 28912, 29090, 29141, 29288, 29405, 29507, 29746, 29760, 29869, 30105, 30107, 30432, 30592, 30664, 30994, 31093, 31147, 31286, 31360, 31904, 31981, 32221, 32441, 32645, 32709, 32915, 33142, 33264, 33267, 33271, 33272, 33273, 33615, 33755, 33824, 34007, 34131, 34252, 34487, 34527, 35133, 35141, 35454, 35500, 35756, 36082, 36267, 36654, 36758, 36854, 36931, 37087, 37170, 37257, 38330, 38544, 38581, 38959, 39199, 39493, 39745, 39768, 40383, 40384, 40385, 40386, 40808, 41436, 42099, 42643, 43219, 43611, 44066, 44426, 44987, 45158, 45251, 45422, 46021, 46314, 46408, 46670, 47830, 48080, 48384, 48399, 48838, 49578, 49657, 50395, 51561, 51999, 53104, 53181, 53791, 53792, 53872, 54712, 55015, 55024, 55563, 55565, 56103, 56363, 56946, 57473, 57474, 58118, 58940, 58975, 59140, 59697, 60128, 60956, 61805, 62117, 62747, 64012, 64014, 67887, 68497, 69534, 78714, 79627, 83883]
[9077, 19362, 49969, 49974, 50784]
[9869, 10455, 10541, 10613, 11682, 12384, 12385, 13334, 13365, 13919, 14097, 14312, 14490, 15812, 15948, 16957, 17311, 18109, 18438, 18439, 20906, 20945, 21280, 21284, 22265, 22775, 23169, 23363, 24020, 24370, 26411, 26692, 26812, 26993, 26996, 28000, 28300, 28400, 28401, 29000, 30023, 31095, 31425, 31426, 31427, 31802, 32026, 32388, 33093, 34228, 34291, 34396, 35311, 35477, 37738, 38974, 39400, 40185, 41483, 41502, 41617, 41752, 42520, 42619, 45357, 45371, 45960, 47511, 47513, 51059, 51060, 54033, 54981, 55288, 55723, 56573, 57341, 57343, 58542, 62191, 65034, 70905, 70951, 73521, 73556, 74711, 74752]
[7902, 8353, 9561, 10164, 10570, 10667, 10863, 10868, 11173, 11205, 11233, 11946, 11985, 12075, 12114, 12405, 12553, 12928, 13216, 13286, 13834, 13886, 14028, 14048, 14261, 14476, 14560, 14617, 14839, 14853, 14972, 15359, 15549, 15576, 16053, 16065, 16123, 16358, 16532, 16727, 17018, 17594, 17766, 18102, 18138, 18612, 18689, 18817, 18844, 18849, 18874, 18875, 18966, 18971, 19541, 19712, 19716, 19717, 19718, 19853, 20028, 20127, 20506, 20690, 20936, 22085, 23989, 24883, 26106, 27616, 27619, 29801, 31277, 31728, 32220, 32456, 34303, 37575, 39847, 40206, 40315, 44133, 44135, 44941, 45404, 45763, 48211, 48265, 48847, 50494, 50911, 51029, 51617, 51622, 51624, 52211, 52212, 53180, 54083, 54171, 57259, 58324, 58903, 59660, 60079, 61124, 61196, 62192, 62524, 62849, 64161, 64163, 64424, 64586, 64657, 65286, 66626, 66649, 66886, 66887, 66888, 67536, 67636, 67638, 67899, 68532, 68809, 68852, 69980, 72035, 72585, 72916, 73577, 73953, 74895, 75650, 76392, 77752, 78250, 79292, 79330, 79482, 79579, 79689, 81058, 81371, 81589, 81665, 82233, 82278, 83519, 83878, 83879, 83880, 83881, 85698, 85907, 86062, 86099, 86115, 86265, 86266, 87000, 87674, 87778, 87910, 89287, 89422, 89915, 90132, 90605, 92915, 93899, 93900, 94610, 95000, 95084, 96671, 98021, 98737, 100533, 102265, 102267, 102641, 103236, 106206, 107200, 109009, 109848, 110905]
_ch2 substring.
Awesome, thanks Dongbo! Cy5 only... 🤔
This was through ArrayExpress, correct @dongbohu? If so, I wouldn't have expected _ch2 to be in the protocol info.
There are some gray areas though. For example, 9077 is in the Cy5 only category, but its protocol contains this substring:
... and cyanine 3-labeled CTP
https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-GEOD-9077/protocols
Should we count it as both Cy3 and Cy5 instead?
Another example: 74711 is in the neither Cy3 nor Cy5 category, but it includes this string:
10.0 mM Cyanine 3- or 5-labeled CTP
https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-GEOD-74711/protocols
Does this mean it should be in the both Cy3 and Cy5 category?
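One way the two gray-area strings could be handled is to match the spelled-out "cyanine" form as well as "Cy". This is a sketch only: it assumes that "Cyanine 3- or 5-labeled" should count as both dyes, which is exactly the open question here, and the patterns and `dye_category` name are illustrative.

```python
import re

CY3 = re.compile(r"\b(?:cy|cyanine)[\s-]*3\b", re.IGNORECASE)
CY5 = re.compile(r"\b(?:cy|cyanine)[\s-]*5\b", re.IGNORECASE)
# Dye name written once for both numbers, e.g. "Cyanine 3- or 5-labeled".
SHARED = re.compile(
    r"\b(?:cy|cyanine)[\s-]*3\s*-?\s*(?:or|and|/)\s*5\b", re.IGNORECASE
)


def dye_category(text):
    """Assign protocol text to one of the four dye categories."""
    has_cy3 = bool(CY3.search(text))
    # Assumption: the shared-prefix form counts as a Cy5 mention too.
    has_cy5 = bool(CY5.search(text)) or bool(SHARED.search(text))
    if has_cy3 and has_cy5:
        return "both Cy3 and Cy5"
    if has_cy3:
        return "Cy3 only"
    if has_cy5:
        return "Cy5 only"
    return "neither Cy3 nor Cy5"
```

Under these assumptions, `dye_category("10.0 mM Cyanine 3- or 5-labeled CTP")` lands in the both category, and a text that has "Cy5" plus "cyanine 3-labeled CTP" does as well.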
@jaclyn-taroni Yes, all URLs are in this format:
https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-GEOD-xxxx/protocols
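For going from the GPL4133 series list to these URLs, something like the following could work. It is a sketch that assumes every GSE maps to an E-GEOD accession with the same number, as the URL format above suggests; the helper names are made up for illustration.

```python
import re

PROTOCOLS_URL = (
    "https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/{accession}/protocols"
)


def gse_number(line):
    """Pull the numeric part out of a line such as '!Platform_series_id = GSE7701'."""
    match = re.search(r"GSE(\d+)", line)
    return int(match.group(1)) if match else None


def protocols_url(number):
    """Build the protocols endpoint URL for a GSE number."""
    return PROTOCOLS_URL.format(accession=f"E-GEOD-{number}")
```

For example, `protocols_url(gse_number("!Platform_series_id = GSE9077"))` gives the E-GEOD-9077 URL posted above.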
Okay, I will go through the two "gray area" examples you posted @dongbohu and a handful of the neither Cy3 nor Cy5 examples (possibly the no protocol examples...) and post some thoughts about how I detect one-color vs. two-color as a human.
Here, I've chosen to go through protocols. It is quite possible that examining the raw data files (if available) would tell us the answer more definitively. (Although based on the number of different formats accommodated by limma::read.maimages
, I think this may be challenging.)
I am also assuming that it would be preferrable to glean what processor to use from the protocol info prior to raw data download, but if that is not the case we can take a different approach.
When in doubt, we can check the sdrf.txt files (this actually may turn out to be the most robust way to go...)
Verdict: One-color
Why: "One-Color RNA Spike-In RNA" and "One-Color Low RNA Input Linear Amplification Kit PLUS" in P-GSE28300-5; "One-Color Microarray-Based Gene Expression Analysis Protocol" in P-GSE28300-6
Verdict: Two-color
Why: "Two-Color Microarray-Based Gene Expression Analysis" in P-GSE21280-3; "log2(Experiment/Control)" in P-GSE21280-1; reference to loess normalization in multiple protocols is a hint
Verdict: Two-color
Why: "log(PSA treated cells/control cells)", CH1_SIG_MEAN, CH2_SIG_MEAN in P-GSE9869-1; similar info in another protocol
Verdict: One-color
Why: "One-Color Microarray-Based Gene Expression Analysis" in multiple protocols
Verdict: One-color
Why: "Cyanine 3-CTP" & no mention of Cyanine 5 in P-GSE34228-5
Verdict: Two-color
Why: "Agilent Two-Color Microarray-Based Gene Expression Analysis" in multiple protocols; also "Normalized log2 ratios (test/reference)" and "loess normalization" in P-GSE58542-1 are strong hints
Verdict: Two-color
Why: "normalized log10 ratio (treated/untreated)" in P-GSE23169-1; also, for each unique GSMXXXXX in Source Name in the sample-data table (sdrf.txt) there is a 1 and a 2
Verdict: One-color
Why: Can't really tell for sure from protocols, had to look at sample-data table
This is a SuperSeries; GSE28400 is the relevant subseries.
Verdict: Two-color
Why: "log2-transformed ratio (HEK-293 miR-204/HEK-293 control)" in P-GSE28400-1, sample accessions ending in 1 and 2 in the sample-data table
Verdict: One-color
Why: "Agilent One-Color Microarray-Based Gene Expression Analysis" in P-GSE12385-8; "Agilent One-Color RNA Spike-In RNA" in P-GSE12385-7
My recommendations for further exploration: check for "One-Color" and "Two-Color" in the protocols, and look for the same accession with 1 and 2 in the sample-data relationship info.
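The two recommendations above could be sketched roughly as follows. It assumes the protocol texts have already been pulled out as plain strings, that sdrf.txt is tab-delimited with a "Source Name" column, and that two-color samples appear there as "GSMXXXXX 1" and "GSMXXXXX 2" as described in the verdicts above. All function names are hypothetical.

```python
import csv
import io
import re
from collections import defaultdict

ONE_COLOR = re.compile(r"\bone[\s-]color\b", re.IGNORECASE)
TWO_COLOR = re.compile(r"\btwo[\s-]color\b", re.IGNORECASE)
# Assumed Source Name layout: a GSM accession followed by a channel number.
SOURCE_NAME = re.compile(r"(GSM\d+)\s+(\d+)")


def color_from_protocols(protocol_texts):
    """'one-color' / 'two-color', or None if the text is silent or contradictory."""
    joined = "\n".join(protocol_texts)
    one = bool(ONE_COLOR.search(joined))
    two = bool(TWO_COLOR.search(joined))
    if one == two:
        return None
    return "one-color" if one else "two-color"


def color_from_sdrf(sdrf_text):
    """Look for the same GSM accession listed with both a 1 and a 2."""
    channels = defaultdict(set)
    for row in csv.DictReader(io.StringIO(sdrf_text), delimiter="\t"):
        match = SOURCE_NAME.fullmatch((row.get("Source Name") or "").strip())
        if match:
            channels[match.group(1)].add(match.group(2))
    if not channels:
        return None
    if all({"1", "2"} <= seen for seen in channels.values()):
        return "two-color"
    if not any("2" in seen for seen in channels.values()):
        return "one-color"
    return None  # mixed evidence: leave it to a human


def detect_color(protocol_texts, sdrf_text=None):
    """Try the protocols first, then fall back to the sample-data table."""
    verdict = color_from_protocols(protocol_texts)
    if verdict is None and sdrf_text is not None:
        verdict = color_from_sdrf(sdrf_text)
    return verdict
```

Returning None instead of guessing keeps the ambiguous experiments visible, so they can be reviewed by hand or routed to the submitter-processed data.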
My comment above is based on ArrayExpress. @kurtwheeler, @Miserlou, @dongbohu is the plan to only retrieve GEO data from GEO when it is unavailable on ArrayExpress?
Taking a one-color and two-color example from above and looking at GEO --
Another example: 74711 is in the "neither Cy3 nor Cy5" category, but it includes this string:
10.0 mM Cyanine 3- or 5-labeled CTP
https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-GEOD-74711/protocols
Does this mean it should be in the "both Cy3 and Cy5" category?
Yes.
There are some gray areas, though. For example, 9077 is in the "Cy5 only" category, but it has this substring:
... and cyanine 3-labeled CTP
https://www.ebi.ac.uk/arrayexpress/json/v3/experiments/E-GEOD-9077/protocols
Should we count it as "both Cy3 and Cy5" instead?
E-GEOD-9077 is one of the stranger experiments I've come across! There are three platforms and it's not a SuperSeries -- I would probably ignore it.
is the plan to only retrieve GEO data from GEO when it is unavailable on ArrayExpress
I think we should get it from wherever the metadata is better. I think I've been assuming that GEO data would have better metadata on GEO than ArrayExpress since it's not secondhand, but @Miserlou has worked more with the GEO metadata so I bet he knows for sure.
We'll need to be able to detect one- and two-color experiments from both ArrayExpress and GEO
E-GEOD-19362 - One channel, Cy5-only
E-GEOD-49969 - This looks like it uses a reference based on the Description + Cy3 in the sample-data table, but there's only one sample per GSM accession. This is the kind of thing I would probably want to grab the submitter-processed data for unless it was often requested.
E-GEOD-49974 is a SuperSeries that includes E-GEOD-49969
E-GEOD-50784 - Same submitter as E-GEOD-49969 + similar wording, but can tell it is two-color based on the sample-data table
I randomly took a look at 5 experiments with no protocol information. They are likely not replicated in ArrayExpress for whatever reason.
One color and two color experiments will be the same "platform" in GEO or ArrayExpress -- there should be some metadata field that indicates whether a second channel is present
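On the GEO side, one candidate for that field is the channel count in the SOFT-format sample record. The sketch below assumes sample records carry a `!Sample_channel_count = N` line and per-channel keys ending in `_ch1` / `_ch2`. Those field names are not confirmed anywhere in this thread and should be checked against real records. The function name is hypothetical.

```python
import re

# Assumed SOFT attribute; same "!Key = value" layout as the
# "!Platform_series_id = GSE7701" lines listed earlier in this issue.
CHANNEL_COUNT = re.compile(r"^!Sample_channel_count\s*=\s*(\d+)\s*$", re.MULTILINE)
# Fallback: any per-channel key for a second channel, e.g. "!Sample_label_ch2".
SECOND_CHANNEL_KEY = re.compile(r"^!Sample_\w+_ch2\s*=", re.MULTILINE)


def channel_count_from_soft(soft_text):
    """Return the number of channels for a SOFT sample record, or None if unknown."""
    match = CHANNEL_COUNT.search(soft_text)
    if match:
        return int(match.group(1))
    if SECOND_CHANNEL_KEY.search(soft_text):
        return 2
    return None
```

A count of 2 would route the sample to raw-data download and SCAN.UPC, and a count of 1 to the submitter-processed data, following the rule stated at the top of the issue.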