EBISPOT / gwas-user-requests

Repository to collect user requests and bug reports for the GWAS Catalog

Release sumstats that failed to release #73

Closed jdhayhurst closed 1 year ago

jdhayhurst commented 1 year ago

Reasons for failure:

  1. empty sample sheet where a complete one was expected, throwing an uncaught exception
  2. Enum validation failure due to case or unstripped whitespace
  3. globus v4 failure - cannot interact with files
  4. PUT request was not made
  5. not a problem - likely flagged due to incorrect bin assignment, e.g. GCST90202000 is in GCST90201001-GCST90202000, not GCST90202001-GCST90203000.
  6. file is invalid and was not released
  7. submission archived
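
Reasons 2 and 5 can be illustrated with a small sketch. It is not the service's actual code: the `CoordinateSystem` enum and the function names `parse_enum` and `bin_for` are hypothetical, and the bin size of 1000 with an inclusive upper bound is inferred only from the example in reason 5.

```python
import re
from enum import Enum


class CoordinateSystem(Enum):
    # Hypothetical enum, used only to illustrate the validation issue.
    ZERO_BASED = "0-based"
    ONE_BASED = "1-based"


def parse_enum(enum_cls, raw):
    """Match raw against the enum's values, ignoring case and outer whitespace."""
    cleaned = raw.strip().casefold()
    for member in enum_cls:
        if member.value.casefold() == cleaned:
            return member
    raise ValueError(f"{raw!r} is not a valid {enum_cls.__name__}")


def bin_for(accession, bin_size=1000):
    """Return the bin label for an accession, e.g. GCST90201001-GCST90202000.

    Assumes bins hold bin_size accessions and end on a multiple of bin_size,
    so an accession ending in ...000 closes a bin rather than opening one.
    """
    match = re.fullmatch(r"GCST(\d+)", accession)
    if match is None or int(match.group(1)) < 1:
        raise ValueError(f"not a GCST accession: {accession!r}")
    digits = match.group(1)
    number = int(digits)
    # Subtract 1 before dividing so that multiples of bin_size stay in the lower bin.
    upper = ((number - 1) // bin_size + 1) * bin_size
    lower = upper - bin_size + 1
    width = len(digits)  # keep any zero padding, as in GCST010509
    return f"GCST{lower:0{width}d}-GCST{upper:0{width}d}"
```

With these assumptions, `bin_for("GCST90202000")` gives `GCST90201001-GCST90202000`, which matches the expected bin in reason 5; the off-by-one comes from dividing without first subtracting 1.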

The complete list of affected accessions is attached: https://app.zenhub.com/files/278460613/19c49a0e-5487-4b58-aa61-304e16848c90/download

jdhayhurst commented 1 year ago

submissions (and reason for failure) in which these studies are found:

codon impl.: XKsixbrp 6Tsf65zE Tus77rxz

ftp impl.: ZHvCzcYP vAyPBba5 gBAHzffn hVw4zudC UQgJhk3g LcJ8iSGR

Accessions not found in sumstats db: ['GCST010509', 'GCST90250263', 'GCST90250264', 'GCST90250265', 'GCST90250266', 'GCST90250267', 'GCST90250268', 'GCST90250269', 'GCST90250270', 'GCST90250271', 'GCST90250272', 'GCST90250273', 'GCST90250274', 'GCST90250275', 'GCST90250276', 'GCST90250277', 'GCST90250278', 'GCST90250279', 'GCST90250280', 'GCST90250281', 'GCST90250282', 'GCST90250283', 'GCST90250284', 'GCST90250285', 'GCST90250286', 'GCST90250287', 'GCST90250288', 'GCST90250289', 'GCST90250290', 'GCST90250291', 'GCST90250292', 'GCST90250293', 'GCST90250294', 'GCST90250295', 'GCST90250296', 'GCST90250297', 'GCST90250298', 'GCST90250299', 'GCST90250300', 'GCST90250301', 'GCST90250302', 'GCST90250303', 'GCST90250304', 'GCST90250305', 'GCST90250306', 'GCST90250307', 'GCST90250308', 'GCST90250309', 'GCST90250310', 'GCST90250311', 'GCST90250312', 'GCST90250313', 'GCST90250314', 'GCST90250315', 'GCST90250316', 'GCST90250317', 'GCST90250318', 'GCST90250319', 'GCST90250320', 'GCST90250321', 'GCST90250322', 'GCST90250323', 'GCST90250324', 'GCST90250325', 'GCST90250326', 'GCST90250327', 'GCST90250328', 'GCST90250329', 'GCST90250330', 'GCST90250331', 'GCST90250332', 'GCST90250333', 'GCST90250334', 'GCST90250335', 'GCST90250336', 'GCST90250337', 'GCST90250338', 'GCST90250339', 'GCST90250340', 'GCST90250341', 'GCST90250342', 'GCST90250343', 'GCST90250344', 'GCST90250345', 'GCST90250346', 'GCST90250347', 'GCST90250348', 'GCST90250349', 'GCST90250350', 'GCST90250351', 'GCST90250352', 'GCST90250353', 'GCST90250354', 'GCST90250355', 'GCST90250356', 'GCST90250357', 'GCST90250358', 'GCST90250359', 'GCST90250360', 'GCST90250361', 'GCST90250362', 'GCST90250363', 'GCST90250364', 'GCST90250365', 'GCST90250366', 'GCST90250367', 'GCST90250368', 'GCST90250369', 'GCST90250370', 'GCST90250371', 'GCST90250372', 'GCST90250373', 'GCST90250374', 'GCST90250375', 'GCST90250376', 'GCST90250377', 'GCST90250378', 'GCST90250379', 'GCST90250380', 'GCST90250381', 'GCST90250382', 'GCST90250383', 
'GCST90250384', 'GCST90250385', 'GCST90250386', 'GCST90250387', 'GCST90250388', 'GCST90250389', 'GCST90250390', 'GCST90250391', 'GCST90250392', 'GCST90250393', 'GCST90250394', 'GCST90250395', 'GCST90250396', 'GCST90250397', 'GCST90250398', 'GCST90250399', 'GCST90250400', 'GCST90250401', 'GCST90250402', 'GCST90250403', 'GCST90250404', 'GCST90250405', 'GCST90250406', 'GCST90250407', 'GCST90250408', 'GCST90250409', 'GCST90250410', 'GCST90250411', 'GCST90250412', 'GCST90250413', 'GCST90250414', 'GCST90250415', 'GCST90250416', 'GCST90250417', 'GCST90250418', 'GCST90250419', 'GCST90250420', 'GCST90250421', 'GCST90250422', 'GCST90250423', 'GCST90250424', 'GCST90250425', 'GCST90250426', 'GCST90250427', 'GCST90250428', 'GCST90250429', 'GCST90250430', 'GCST90250431', 'GCST90250432', 'GCST90250433', 'GCST90250434', 'GCST90250435', 'GCST90250436', 'GCST90250437', 'GCST90250438', 'GCST90250439', 'GCST90250440', 'GCST90250441', 'GCST90250442', 'GCST90250443', 'GCST90250444', 'GCST90250445', 'GCST90250446', 'GCST90250447', 'GCST90250448', 'GCST90250449', 'GCST90250450', 'GCST90250451', 'GCST90250452', 'GCST90250453', 'GCST90250454', 'GCST90250455', 'GCST90250456', 'GCST90250457', 'GCST90250458', 'GCST90250459', 'GCST90250460', 'GCST90250461', 'GCST90250462', 'GCST90250463', 'GCST90250464', 'GCST90250465', 'GCST90250466', 'GCST90250467', 'GCST90250468', 'GCST90250469', 'GCST90250470', 'GCST90250471', 'GCST90250472', 'GCST90250473', 'GCST90250474', 'GCST90250475', 'GCST90250476', 'GCST90250477', 'GCST90250478', 'GCST90250479', 'GCST90250480', 'GCST90250481', 'GCST90250482', 'GCST90250483', 'GCST90250484', 'GCST90250485', 'GCST90250486', 'GCST90250487', 'GCST90250488', 'GCST90250489', 'GCST90250490', 'GCST90250491', 'GCST90250492', 'GCST90250493', 'GCST90250494', 'GCST90250495', 'GCST90250496', 'GCST90250497', 'GCST90250498', 'GCST90250499', 'GCST90250500', 'GCST90250501', 'GCST90250502', 'GCST90250503', 'GCST90250504', 'GCST90250505', 'GCST90250506', 'GCST90250507', 'GCST90250508', 
'GCST90250509', 'GCST90250510', 'GCST90250511', 'GCST90250512', 'GCST90250513', 'GCST90250514', 'GCST90250515', 'GCST90250516', 'GCST90250517', 'GCST90250518', 'GCST90250519', 'GCST90250520', 'GCST90250521', 'GCST90250522', 'GCST90250523', 'GCST90250524', 'GCST90250525', 'GCST90250526', 'GCST90250527', 'GCST90250528', 'GCST90250529', 'GCST90250530', 'GCST90250531', 'GCST90250532', 'GCST90250533', 'GCST90250534', 'GCST90250535', 'GCST90250536', 'GCST90250537', 'GCST90250538', 'GCST90250539', 'GCST90250540', 'GCST90250541', 'GCST90250542', 'GCST90250543', 'GCST90250544', 'GCST90250545', 'GCST90250546', 'GCST90250547', 'GCST90250548', 'GCST90250549', 'GCST90250550', 'GCST90250551', 'GCST90250552', 'GCST90250553', 'GCST90250554', 'GCST90250555', 'GCST90250556', 'GCST90250557', 'GCST90250558', 'GCST90250559', 'GCST90250560', 'GCST90250561', 'GCST90250562', 'GCST90250563', 'GCST90250564', 'GCST90250565', 'GCST90250566', 'GCST90250567', 'GCST90250568', 'GCST90250569', 'GCST90250570', 'GCST90250571', 'GCST90250572', 'GCST90250573', 'GCST90250574', 'GCST90250575', 'GCST90250576', 'GCST90250577', 'GCST90250578', 'GCST90250579', 'GCST90250580', 'GCST90250581', 'GCST90250582', 'GCST90250583', 'GCST90250584', 'GCST90250585', 'GCST90250586', 'GCST90250587', 'GCST90250588', 'GCST90250589', 'GCST90250590', 'GCST90250591', 'GCST90250592', 'GCST90250593', 'GCST90250594', 'GCST90250595', 'GCST90250596', 'GCST90250597', 'GCST90250598', 'GCST90250599', 'GCST90250600', 'GCST90250601', 'GCST90250602', 'GCST90250603', 'GCST90250604', 'GCST90250605', 'GCST90250606', 'GCST90250607', 'GCST90250608', 'GCST90250609', 'GCST90250610', 'GCST90250611', 'GCST90250612', 'GCST90250613', 'GCST90250614', 'GCST90250615', 'GCST90250616', 'GCST90250617', 'GCST90250618', 'GCST90250619', 'GCST90250620', 'GCST90250621', 'GCST90250622', 'GCST90250623', 'GCST90250624', 'GCST90250625', 'GCST90250626', 'GCST90250627', 'GCST90250628', 'GCST90250629', 'GCST90250630', 'GCST90250631', 'GCST90250632', 'GCST90250633', 
'GCST90250634', 'GCST90250635', 'GCST90250636', 'GCST90250637', 'GCST90250638', 'GCST90250639', 'GCST90250640', 'GCST90250641', 'GCST90250642', 'GCST90250643', 'GCST90250644', 'GCST90250645', 'GCST90250646', 'GCST90250647', 'GCST90250648', 'GCST90250649', 'GCST90250650', 'GCST90250651', 'GCST90250652', 'GCST90250653', 'GCST90250654', 'GCST90250655', 'GCST90250656', 'GCST90250657', 'GCST90250658', 'GCST90250659', 'GCST90250660', 'GCST90250661', 'GCST90250662', 'GCST90250663', 'GCST90250664', 'GCST90250665', 'GCST90250666', 'GCST90250667', 'GCST90250668', 'GCST90250669', 'GCST90250670', 'GCST90250671', 'GCST90250672', 'GCST90250673', 'GCST90250674', 'GCST90250675', 'GCST90250676', 'GCST90250677', 'GCST90250678', 'GCST90250679', 'GCST90250680', 'GCST90250681', 'GCST90250682', 'GCST90250683', 'GCST90250684', 'GCST90250685', 'GCST90250686', 'GCST90250687', 'GCST90250688', 'GCST90250689', 'GCST90250690', 'GCST90250691', 'GCST90250692', 'GCST90250693', 'GCST90250694', 'GCST90250695', 'GCST90250696', 'GCST90250697', 'GCST90250698', 'GCST90250699', 'GCST90250700', 'GCST90250701', 'GCST90250702', 'GCST90250703', 'GCST90250704', 'GCST90250705', 'GCST90250706', 'GCST90250707', 'GCST90250708', 'GCST90250709', 'GCST90250710', 'GCST90250711', 'GCST90250712', 'GCST90250713', 'GCST90250714', 'GCST90250715', 'GCST90250716', 'GCST90250717', 'GCST90250718', 'GCST90250719', 'GCST90250720', 'GCST90250721', 'GCST90250722', 'GCST90250723', 'GCST90250724', 'GCST90250725', 'GCST90250726', 'GCST90250727', 'GCST90250728', 'GCST90250729', 'GCST90250730', 'GCST90250731', 'GCST90250732', 'GCST90250733', 'GCST90250734', 'GCST90250735', 'GCST90250736', 'GCST90250737', 'GCST90250738', 'GCST90250739', 'GCST90250740', 'GCST90250741', 'GCST90250742', 'GCST90250743', 'GCST90250744', 'GCST90250745', 'GCST90250746', 'GCST90250747', 'GCST90250748', 'GCST90250749', 'GCST90250750', 'GCST90250751', 'GCST90250752', 'GCST90250753', 'GCST90250754', 'GCST90250755', 'GCST90250756', 'GCST90250757', 'GCST90250758', 
'GCST90250759', 'GCST90250760', 'GCST90250761', 'GCST90250762', 'GCST90250763', 'GCST90250764', 'GCST90250765', 'GCST90250766', 'GCST90250767', 'GCST90250768', 'GCST90250769', 'GCST90250770', 'GCST90250771', 'GCST90250772', 'GCST90250773', 'GCST90250774', 'GCST90250775', 'GCST90250776', 'GCST90250777', 'GCST90250778', 'GCST90250779', 'GCST90250780', 'GCST90250781', 'GCST90250782', 'GCST90250783', 'GCST90250784', 'GCST90250785', 'GCST90250786', 'GCST90250787', 'GCST90250788', 'GCST90250789', 'GCST90250790', 'GCST90250791', 'GCST90250792', 'GCST90250793', 'GCST90250794', 'GCST90250795', 'GCST90250796', 'GCST90250797', 'GCST90250798', 'GCST90250799', 'GCST90250800', 'GCST90250801', 'GCST90250802', 'GCST90250803', 'GCST90250804', 'GCST90250805', 'GCST90250806', 'GCST90250807', 'GCST90250808', 'GCST90250809', 'GCST90250810', 'GCST90250811', 'GCST90250812', 'GCST90250813', 'GCST90250814', 'GCST90250815', 'GCST90250816', 'GCST90250817', 'GCST90250818', 'GCST90269977']

jdhayhurst commented 1 year ago

The only reasons I can think of for this list are: 1) the GCSTs were never sent to the sumstats service (this does happen), or 2) these actually don’t have sumstats. I have added a list here. The first step would probably be double-checking (2) and then, if they should have sumstats, identifying the submission (I’d need help from @sajo-ebi or @ala-ebi for that). Once the submission has been identified for each GCST, we’ll need to recreate the PUT request to the sumstats service.

jdhayhurst commented 1 year ago

Some of the files failed to release because sample sheets were expected to be mandatory. Some sample sheets are blank, e.g. for 6Tsf65zE; this case is not handled, so an uncaught exception is raised, halting the release of all files from that submission. Ticket created to allow blank sample sheets: EBISPOT/gwas-sumstats-service#248
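
A rough sketch of the two behaviours involved, assuming nothing about the real service: a blank sample sheet is treated as "no samples" instead of an error, and a failure for one study is recorded without stopping the rest of the submission. The names `parse_sample_rows` and `release_submission`, the `release_one` callback, and the dict layout of a study are all hypothetical.

```python
def parse_sample_rows(rows):
    """Return the non-blank rows of a sample sheet.

    A blank sheet (no rows, or only rows of empty cells) yields an empty
    list rather than raising.
    """
    records = []
    for row in rows:
        has_content = any(
            str(cell).strip() for cell in row.values() if cell is not None
        )
        if has_content:
            records.append(row)
    return records


def release_submission(studies, release_one):
    """Release each study independently and collect failures.

    studies: iterable of dicts with 'accession' and 'sample_rows' keys
             (hypothetical layout).
    release_one: callable taking (accession, samples); may raise.
    Returns (released_accessions, {accession: error_message}).
    """
    released = []
    failed = {}
    for study in studies:
        accession = study["accession"]
        try:
            samples = parse_sample_rows(study.get("sample_rows") or [])
            release_one(accession, samples)
            released.append(accession)
        except Exception as exc:  # record the failure, keep releasing the others
            failed[accession] = str(exc)
    return released, failed
```

Catching per study means one bad file is reported in `failed` while the remaining files in the submission are still released, which is the opposite of the halting behaviour described above.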

jdhayhurst commented 1 year ago

Should https://www.ebi.ac.uk/gwas/studies/GCST010509 have sumstats? The submission with callback id aUZ3nYfm was reported to be for this study, but the sumstats file covers chromosome 6 only and has non-rsIDs in the variant column (pre-GWAS-SSF).

jdhayhurst commented 1 year ago

same question as above, but for the range: GCST90250263 to GCST90250818, which are all part of the same submission. These were invalid.

jdhayhurst commented 1 year ago

All submissions (first comment above) accounted for.

jdhayhurst commented 1 year ago

additional studies to release (embargoed): GCST90257015 GCST90257016 GCST90257017 GCST90257018 GCST90257019 GCST90257020 GCST90257021 GCST90257022 GCST90257023 GCST90257024 GCST90257025 GCST90257026 GCST90257027 GCST90257028 GCST90257029 GCST90257030 GCST90257031 GCST90257032 GCST90257033 GCST90257034 GCST90257035 GCST90257036 GCST90257037 GCST90257038 GCST90257039 GCST90257040 GCST90257041 GCST90257042 GCST90257043 GCST90257044 GCST90257045 GCST90257046 GCST90257047 GCST90257048 GCST90257049 GCST90257050 GCST90257051 GCST90257052 GCST90257053 GCST90257054 GCST90257055 GCST90257056 GCST90257057 GCST90257058 GCST90257059 GCST90257060 GCST90257061 GCST90257062 GCST90257063 GCST90257064 GCST90257065 GCST90257066 GCST90257067 GCST90257068 GCST90257069 GCST90257070 GCST90257071 GCST90257072 GCST90257073 GCST90257074 GCST90257075 GCST90257076 GCST90257077 GCST90257078 GCST90257079 GCST90257080 GCST90257081 GCST90257082 GCST90257083 GCST90257084 GCST90257085 GCST90257086 GCST90257087 GCST90257088 GCST90257089 GCST90257090 GCST90257091 GCST90257092 GCST90257093 GCST90257094 GCST90257095 GCST90257096 GCST90257097 GCST90257098 GCST90257099 GCST90257100 GCST90257101 GCST90257102 GCST90257103 GCST90257104 GCST90257105 GCST90264121 GCST90264122 GCST90264123 GCST90264124 GCST90264125 GCST90264126 GCST90264127 GCST90264128 GCST90264129 GCST90264130 GCST90264131 GCST90264132 GCST90264133 GCST90264134 GCST90264135 GCST90264136 GCST90264137 GCST90264138 GCST90264139 GCST90264140 GCST90264141 GCST90264142 GCST90264143 GCST90264144 GCST90264145 GCST90264146 GCST90264147 GCST90264148 GCST90264149 GCST90264150 GCST90264153 GCST90266935 GCST90266936 GCST90267220 GCST90267221 GCST90267222 GCST90267223 GCST90267381 GCST90267382 GCST90267383 GCST90267996 GCST90267997 GCST90267998 GCST90268027 GCST90268028 GCST90268029 GCST90269903 GCST90269904 GCST90269905 GCST90269962 GCST90269963 GCST90269964 GCST90269965

jdhayhurst commented 1 year ago

All (above) done, except these, which are not in the sumstats db: GCST90268027-GCST90268029
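
For checks like this one, an accession range can be expanded and compared against whatever the sumstats db returns. This is only a sketch: `expand_range` and `missing_from_db` are hypothetical helpers, and the db contents are assumed to be available as a plain collection of accession strings.

```python
import re

_ACCESSION = re.compile(r"GCST(\d+)")


def expand_range(first, last):
    """List every accession from first to last, inclusive."""
    m_first = _ACCESSION.fullmatch(first)
    m_last = _ACCESSION.fullmatch(last)
    if m_first is None or m_last is None:
        raise ValueError(f"not GCST accessions: {first!r}, {last!r}")
    start, end = int(m_first.group(1)), int(m_last.group(1))
    if end < start:
        raise ValueError("range end is before range start")
    width = len(m_first.group(1))  # preserve zero padding of the first accession
    return [f"GCST{n:0{width}d}" for n in range(start, end + 1)]


def missing_from_db(accessions, db_accessions):
    """Return the accessions that are absent from db_accessions, in input order."""
    known = set(db_accessions)
    return [acc for acc in accessions if acc not in known]


# Example: the three accessions reported as not in the sumstats db.
expected = expand_range("GCST90268027", "GCST90268029")
```

Passing `expected` and the db's accession list to `missing_from_db` then gives the studies that still need a ticket.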

ljwh2 commented 1 year ago

@ljwh2 to create new ticket for missing studies (may be due to failed PUT request)