fpbarthel / GLASS

GLASS consortium
MIT License

Portion IDs in multisector samples #71

Kcjohnson closed 5 years ago

Kcjohnson commented 6 years ago

There was a small hiccup. The portion IDs for some of the multisector TCGA and Samsung samples inherited the same number even though they were in fact from multiple portions.

Some examples include TCGA-06-0221-TP and GLSS-SM-R110-TP.

In the database these can be found using the following command:

SELECT sample_barcode, aliquot_portion, aliquot_analysis_type, COUNT(*)
FROM biospecimen.aliquots
GROUP BY sample_barcode, aliquot_portion, aliquot_analysis_type
ORDER BY count DESC;
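
For anyone who wants to try the grouping logic without database access, here is a minimal sketch using Python's built-in `sqlite3` module. The table is a simplified stand-in for biospecimen.aliquots, the rows and their six-character suffixes are made up, and the `HAVING` clause is an addition that keeps only the groups with more than one aliquot.

```python
import sqlite3

# In-memory stand-in for biospecimen.aliquots (assumed, simplified columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aliquots ("
    "aliquot_barcode TEXT PRIMARY KEY, sample_barcode TEXT, "
    "aliquot_portion TEXT, aliquot_analysis_type TEXT)"
)
# Made-up rows: two aliquots of one sample share portion '01'.
conn.executemany(
    "INSERT INTO aliquots VALUES (?, ?, ?, ?)",
    [
        ("TCGA-06-0221-TP-01D-WXS-AAAAAA", "TCGA-06-0221-TP", "01", "WXS"),
        ("TCGA-06-0221-TP-01D-WXS-BBBBBB", "TCGA-06-0221-TP", "01", "WXS"),
        ("GLSS-SM-R110-TP-01D-WXS-CCCCCC", "GLSS-SM-R110-TP", "01", "WXS"),
    ],
)

# Same grouping as the query above; HAVING drops groups with a single aliquot.
duplicates = conn.execute(
    "SELECT sample_barcode, aliquot_portion, aliquot_analysis_type, COUNT(*) AS n "
    "FROM aliquots "
    "GROUP BY sample_barcode, aliquot_portion, aliquot_analysis_type "
    "HAVING COUNT(*) > 1 "
    "ORDER BY n DESC"
).fetchall()
print(duplicates)
```

Any group returned here is a sample/portion/analysis-type combination that is shared by more than one aliquot and so needs a closer look.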

Since we do not currently have plans to use the multi-sector samples in our final analysis, this revision will take a lower priority.

Kcjohnson commented 6 years ago

After noting the issues with the portion IDs above, we went back to fix the barcodes in the Samsung (SM) and TCGA WXS samples. These can be seen in commits #72 and #73.

The main changes were shifting the portion IDs for a select few Samsung cases for which there was multisector data:

GLSS-SM-R098-R1-01D >> GLSS-SM-R098-R1-02D
GLSS-SM-R101-R1-01D >> GLSS-SM-R101-R1-02D
GLSS-SM-R101-TP-01D >> GLSS-SM-R101-TP-02D
GLSS-SM-R105-R1-01D >> GLSS-SM-R105-R1-02D
GLSS-SM-R110-TP-01D >> GLSS-SM-R110-TP-02D

We also changed many of the TCGA WXS samples, which either had not carried over the appropriate portion ID from the previous cohort or turned out to have multisector data:

TCGA-06-0171-TP-02D >> TCGA-06-0171-TP-03D
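
As a rough sketch of what a portion shift does to a barcode, the helper below rewrites only the two-digit portion. It assumes the layout seen in this thread, where the fifth dash-separated field is the portion followed by a one-letter analyte code (e.g. `01D`). The name `set_portion` is hypothetical and not part of the GLASS code base.

```python
def set_portion(barcode: str, new_portion: str) -> str:
    """Return `barcode` with its two-digit portion replaced by `new_portion`.

    Assumes the fifth dash-separated field is '<2-digit portion><analyte>',
    e.g. '01D'. Everything else in the barcode is left untouched.
    """
    fields = barcode.split("-")
    if len(fields) < 5:
        raise ValueError(f"barcode has no portion field: {barcode!r}")
    portion_analyte = fields[4]
    if len(portion_analyte) != 3 or not portion_analyte[:2].isdigit():
        raise ValueError(f"unexpected portion field: {portion_analyte!r}")
    if len(new_portion) != 2 or not new_portion.isdigit():
        raise ValueError(f"portion must be two digits: {new_portion!r}")
    # Keep the analyte letter, swap the portion digits.
    fields[4] = new_portion + portion_analyte[2]
    return "-".join(fields)


print(set_portion("GLSS-SM-R098-R1-01D", "02"))
print(set_portion("TCGA-06-0171-TP-02D", "03"))
```

The same helper works on full aliquot barcodes, since the fields after the portion are passed through unchanged.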

We also noted that one of the sample barcodes contained a typo introduced when it was entered in Excel. In fact, all of these were typographical errors, so systematic error is not a concern. @fpbarthel, let's discuss how best to incorporate these changes into the database while minimizing interruptions to the current workflow.

TCGA-06-0221-NB >> TCGA-06-0211-NB

Kcjohnson commented 5 years ago

For @fsvarn, these are the samples whose IDs will be fixed:

portion_ids_to_fix.txt

fpbarthel commented 5 years ago

The barcode updates for the 63 samples with only a portion change have been submitted and are processing. These updates should automatically propagate to related tables, e.g. biospecimen.readgroups and analysis.files.

UPDATE biospecimen.aliquots SET   aliquot_barcode = 'GLSS-SM-R098-R1-02D-WXS-462GI8', aliquot_portion = '02'   WHERE aliquot_barcode = 'GLSS-SM-R098-R1-01D-WXS-462GI8';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'GLSS-SM-R101-R1-02D-WXS-PUTOC9', aliquot_portion = '02'   WHERE aliquot_barcode = 'GLSS-SM-R101-R1-01D-WXS-PUTOC9';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'GLSS-SM-R101-TP-02D-WXS-TKL136', aliquot_portion = '02'   WHERE aliquot_barcode = 'GLSS-SM-R101-TP-01D-WXS-TKL136';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'GLSS-SM-R105-R1-02D-WXS-Y9SY6I', aliquot_portion = '02'   WHERE aliquot_barcode = 'GLSS-SM-R105-R1-01D-WXS-Y9SY6I';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'GLSS-SM-R110-TP-02D-WXS-BQFEME', aliquot_portion = '02'   WHERE aliquot_barcode = 'GLSS-SM-R110-TP-01D-WXS-BQFEME';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0125-R1-02D-WXS-FB1H59', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0125-R1-01D-WXS-FB1H59';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0125-TP-02D-WXS-FQUL0U', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0125-TP-01D-WXS-FQUL0U';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0152-TP-03D-WXS-828IO0', aliquot_portion = '03'   WHERE aliquot_barcode = 'TCGA-06-0152-TP-01D-WXS-828IO0';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0152-R1-02D-WXS-P9JX25', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0152-R1-01D-WXS-P9JX25';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0171-TP-03D-WXS-GBOF7P', aliquot_portion = '03'   WHERE aliquot_barcode = 'TCGA-06-0171-TP-01D-WXS-GBOF7P';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0171-R1-02D-WXS-XY0A8T', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0171-R1-01D-WXS-XY0A8T';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0190-R1-02D-WXS-LE9JNK', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0190-R1-01D-WXS-LE9JNK';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0210-R1-02D-WXS-5M0ICH', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0210-R1-01D-WXS-5M0ICH';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0211-R1-03D-WXS-M22HJ0', aliquot_portion = '03'   WHERE aliquot_barcode = 'TCGA-06-0211-R1-01D-WXS-M22HJ0';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0211-R2-02D-WXS-KTDRP8', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0211-R2-01D-WXS-KTDRP8';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0221-R1-02D-WXS-6RGOIL', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0221-R1-01D-WXS-6RGOIL';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0221-TP-02D-WXS-PKKKW4', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0221-TP-01D-WXS-PKKKW4';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-19-0957-TP-02D-WXS-2GJCQI', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-19-0957-TP-01D-WXS-2GJCQI';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-19-1389-R1-02D-WXS-TWOEYE', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-19-1389-R1-01D-WXS-TWOEYE';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-19-4065-TP-02D-WXS-J38ABH', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-19-4065-TP-01D-WXS-J38ABH';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-19-4065-R1-02D-WXS-X3LRVW', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-19-4065-R1-01D-WXS-X3LRVW';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-14-1402-R1-02D-WXS-M1J3PC', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-14-1402-R1-01D-WXS-M1J3PC';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-14-1402-TP-02D-WXS-EY7S1Q', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-14-1402-TP-01D-WXS-EY7S1Q';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0125-R1-11D-WXS-8Q4RKD', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-06-0125-R1-01D-WXS-8Q4RKD';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0152-TP-02D-WXS-6DXTU8', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0152-TP-01D-WXS-6DXTU8';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0171-R1-11D-WXS-87WAQK', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-06-0171-R1-01D-WXS-87WAQK';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0171-TP-02D-WXS-CY280C', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0171-TP-01D-WXS-CY280C';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0211-R1-02D-WXS-X1GNKU', aliquot_portion = '02'   WHERE aliquot_barcode = 'TCGA-06-0211-R1-01D-WXS-X1GNKU';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-06-0221-R1-11D-WXS-UAUEGN', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-06-0221-R1-01D-WXS-UAUEGN';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-19-0957-R1-11D-WXS-PQFJEA', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-19-0957-R1-01D-WXS-PQFJEA';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-19-1389-R1-21D-WXS-YNQ3L5', aliquot_portion = '21'   WHERE aliquot_barcode = 'TCGA-19-1389-R1-01D-WXS-YNQ3L5';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-19-4065-R1-11D-WXS-ET7919', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-19-4065-R1-01D-WXS-ET7919';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TQ-A7RK-R1-11D-WXS-DW82J0', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-TQ-A7RK-R1-01D-WXS-DW82J0';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TQ-A7RK-TP-11D-WXS-6B1EKP', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-TQ-A7RK-TP-01D-WXS-6B1EKP';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TQ-A7RV-TP-21D-WXS-IODOZZ', aliquot_portion = '21'   WHERE aliquot_barcode = 'TCGA-TQ-A7RV-TP-01D-WXS-IODOZZ';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TQ-A7RV-R1-11D-WXS-CHU5VI', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-TQ-A7RV-R1-01D-WXS-CHU5VI';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TQ-A8XE-R1-11D-WXS-N53TK5', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-TQ-A8XE-R1-01D-WXS-N53TK5';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TQ-A8XE-TP-11D-WXS-UOHP3W', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-TQ-A8XE-TP-01D-WXS-UOHP3W';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TM-A7CF-R1-11D-WXS-4PZL6G', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-TM-A7CF-R1-01D-WXS-4PZL6G';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-TM-A7CF-TP-11D-WXS-S9DTP5', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-TM-A7CF-TP-01D-WXS-S9DTP5';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-FG-5963-TP-11D-WXS-BX105O', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-FG-5963-TP-01D-WXS-BX105O';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-FG-5963-R1-12D-WXS-XI05S1', aliquot_portion = '12'   WHERE aliquot_barcode = 'TCGA-FG-5963-R1-01D-WXS-XI05S1';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-FG-5965-R2-11D-WXS-34FYRV', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-FG-5965-R2-01D-WXS-34FYRV';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-FG-5965-R1-11D-WXS-10UZ0F', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-FG-5965-R1-01D-WXS-10UZ0F';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-FG-5965-TP-11D-WXS-00A7KM', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-FG-5965-TP-01D-WXS-00A7KM';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-FG-A4MT-R1-11D-WXS-HH7X54', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-FG-A4MT-R1-01D-WXS-HH7X54';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-FG-A4MT-TP-11D-WXS-RN1S3I', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-FG-A4MT-TP-01D-WXS-RN1S3I';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-5870-R1-12D-WXS-VABWLK', aliquot_portion = '12'   WHERE aliquot_barcode = 'TCGA-DU-5870-R1-01D-WXS-VABWLK';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-5870-TP-11D-WXS-LMTCYR', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-DU-5870-TP-01D-WXS-LMTCYR';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-5872-TP-11D-WXS-9CEQT3', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-DU-5872-TP-01D-WXS-9CEQT3';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-5872-R1-21D-WXS-GXUHO1', aliquot_portion = '21'   WHERE aliquot_barcode = 'TCGA-DU-5872-R1-01D-WXS-GXUHO1';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6397-TP-11D-WXS-ZZXZ4F', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-DU-6397-TP-01D-WXS-ZZXZ4F';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6397-R1-12D-WXS-0EDX5N', aliquot_portion = '12'   WHERE aliquot_barcode = 'TCGA-DU-6397-R1-01D-WXS-0EDX5N';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6404-R1-21D-WXS-ZOOJ7F', aliquot_portion = '21'   WHERE aliquot_barcode = 'TCGA-DU-6404-R1-01D-WXS-ZOOJ7F';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6404-R2-11D-WXS-CCY5XQ', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-DU-6404-R2-01D-WXS-CCY5XQ';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6404-TP-11D-WXS-Y06ZOW', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-DU-6404-TP-01D-WXS-Y06ZOW';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6407-R1-12D-WXS-MN1HI8', aliquot_portion = '12'   WHERE aliquot_barcode = 'TCGA-DU-6407-R1-01D-WXS-MN1HI8';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6407-TP-13D-WXS-H4GGSN', aliquot_portion = '13'   WHERE aliquot_barcode = 'TCGA-DU-6407-TP-01D-WXS-H4GGSN';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-6407-R2-11D-WXS-0SX16B', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-DU-6407-R2-01D-WXS-0SX16B';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-7304-TP-12D-WXS-2RI95C', aliquot_portion = '12'   WHERE aliquot_barcode = 'TCGA-DU-7304-TP-01D-WXS-2RI95C';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DU-7304-R1-12D-WXS-WGFU0U', aliquot_portion = '12'   WHERE aliquot_barcode = 'TCGA-DU-7304-R1-01D-WXS-WGFU0U';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DH-A669-TP-12D-WXS-EFVDA0', aliquot_portion = '12'   WHERE aliquot_barcode = 'TCGA-DH-A669-TP-01D-WXS-EFVDA0';
UPDATE biospecimen.aliquots SET   aliquot_barcode = 'TCGA-DH-A669-R1-11D-WXS-DVQOHB', aliquot_portion = '11'   WHERE aliquot_barcode = 'TCGA-DH-A669-R1-01D-WXS-DVQOHB';
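
The thread does not show how related tables pick up the new barcodes. One common mechanism, assumed here, is a foreign key declared with `ON UPDATE CASCADE`. The sketch below demonstrates that behaviour with Python's built-in `sqlite3` module on two simplified, hypothetical tables. It is not the actual GLASS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Simplified stand-ins for biospecimen.aliquots and analysis.files (assumed columns).
conn.execute(
    "CREATE TABLE aliquots (aliquot_barcode TEXT PRIMARY KEY, aliquot_portion TEXT)"
)
conn.execute(
    "CREATE TABLE files ("
    "file_name TEXT PRIMARY KEY, "
    "aliquot_barcode TEXT REFERENCES aliquots(aliquot_barcode) ON UPDATE CASCADE)"
)

old = "GLSS-SM-R098-R1-01D-WXS-462GI8"
new = "GLSS-SM-R098-R1-02D-WXS-462GI8"
conn.execute("INSERT INTO aliquots VALUES (?, '01')", (old,))
conn.execute("INSERT INTO files VALUES (?, ?)", (old + ".realn.mdup.bqsr.bam", old))

# Rename the parent key; the cascade rewrites the referencing column in files.
conn.execute(
    "UPDATE aliquots SET aliquot_barcode = ?, aliquot_portion = '02' "
    "WHERE aliquot_barcode = ?",
    (new, old),
)

child_rows = conn.execute("SELECT file_name, aliquot_barcode FROM files").fetchall()
print(child_rows)
```

Note that only the referencing column follows the cascade. Values derived from the barcode, such as file names and paths, keep the old string and need their own update.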
fpbarthel commented 5 years ago

The sample_barcode update is also running:


UPDATE biospecimen.aliquots SET aliquot_barcode = 'TCGA-06-0211-NB-01D-WXS-B2K1SS', sample_barcode = 'TCGA-06-0211-NB' WHERE aliquot_barcode = 'TCGA-06-0221-NB-01D-WXS-B2K1SS';
fpbarthel commented 5 years ago

Updated biospecimen.aliquots table. Related tables were automatically updated.

To-do:

fpbarthel commented 5 years ago

Updated analysis.files:

UPDATE analysis.files SET file_size =   14933115455, file_md5sum = '53e98abd9e6c668c3cd54e83b09e3f71', file_name =   'GLSS-SM-R098-R1-02D-WXS-462GI8.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/GLSS-SM-R098-R1-02D-WXS-462GI8.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'GLSS-SM-R098-R1-02D-WXS-462GI8';
UPDATE analysis.files SET file_size =   16729259124, file_md5sum = '8154d22a6afe15475a71b556bb8ea57b', file_name =   'GLSS-SM-R101-R1-02D-WXS-PUTOC9.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/GLSS-SM-R101-R1-02D-WXS-PUTOC9.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'GLSS-SM-R101-R1-02D-WXS-PUTOC9';
UPDATE analysis.files SET file_size =   12806978197, file_md5sum = '6772ba8c956f48a8d93a6a262fee8818', file_name =   'GLSS-SM-R101-TP-02D-WXS-TKL136.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/GLSS-SM-R101-TP-02D-WXS-TKL136.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'GLSS-SM-R101-TP-02D-WXS-TKL136';
UPDATE analysis.files SET file_size =   13156833726, file_md5sum = '4d487dbbd23d1316574b531595003d70', file_name =   'GLSS-SM-R105-R1-02D-WXS-Y9SY6I.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/GLSS-SM-R105-R1-02D-WXS-Y9SY6I.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'GLSS-SM-R105-R1-02D-WXS-Y9SY6I';
UPDATE analysis.files SET file_size =   13074479664, file_md5sum = 'de921058be83a97fe204401adeb03f33', file_name =   'GLSS-SM-R110-TP-02D-WXS-BQFEME.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/GLSS-SM-R110-TP-02D-WXS-BQFEME.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'GLSS-SM-R110-TP-02D-WXS-BQFEME';
UPDATE analysis.files SET file_size =   48881641541, file_md5sum = 'd5b745658087c185353c91f8a3c50779', file_name =   'TCGA-06-0125-R1-02D-WXS-FB1H59.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0125-R1-02D-WXS-FB1H59.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0125-R1-02D-WXS-FB1H59';
UPDATE analysis.files SET file_size =   47356517599, file_md5sum = '63a907dc728b8e432d42f32dade316e7', file_name =   'TCGA-06-0125-TP-02D-WXS-FQUL0U.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0125-TP-02D-WXS-FQUL0U.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0125-TP-02D-WXS-FQUL0U';
UPDATE analysis.files SET file_size =   41332969152, file_md5sum = 'd8f589f41a7694de302bb830baac9214', file_name =   'TCGA-06-0152-TP-03D-WXS-828IO0.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0152-TP-03D-WXS-828IO0.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0152-TP-03D-WXS-828IO0';
UPDATE analysis.files SET file_size =   49704965829, file_md5sum = '44e2f4a9500801433a017aeffcf344f4', file_name =   'TCGA-06-0152-R1-02D-WXS-P9JX25.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0152-R1-02D-WXS-P9JX25.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0152-R1-02D-WXS-P9JX25';
UPDATE analysis.files SET file_size =   42369244943, file_md5sum = '11efcc5a7160424c06461f57e1b183c5', file_name =   'TCGA-06-0171-TP-03D-WXS-GBOF7P.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0171-TP-03D-WXS-GBOF7P.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0171-TP-03D-WXS-GBOF7P';
UPDATE analysis.files SET file_size =   42570313194, file_md5sum = '5d971014ed8ed3fe7ccaa9f1d1afe94d', file_name =   'TCGA-06-0171-R1-02D-WXS-XY0A8T.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0171-R1-02D-WXS-XY0A8T.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0171-R1-02D-WXS-XY0A8T';
UPDATE analysis.files SET file_size =   38720881125, file_md5sum = 'b52976565b1d047a3738ec0978201b52', file_name =   'TCGA-06-0190-R1-02D-WXS-LE9JNK.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0190-R1-02D-WXS-LE9JNK.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0190-R1-02D-WXS-LE9JNK';
UPDATE analysis.files SET file_size =   46851792024, file_md5sum = 'c85313e21d22a90566d1f627372c40ad', file_name =   'TCGA-06-0210-R1-02D-WXS-5M0ICH.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0210-R1-02D-WXS-5M0ICH.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0210-R1-02D-WXS-5M0ICH';
UPDATE analysis.files SET file_size =   41409894603, file_md5sum = '6a6e2c68b015dca8277f114ab123fedc', file_name =   'TCGA-06-0211-R1-03D-WXS-M22HJ0.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0211-R1-03D-WXS-M22HJ0.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0211-R1-03D-WXS-M22HJ0';
UPDATE analysis.files SET file_size =   42370111089, file_md5sum = 'b46f8c53e43f5a83cde798e63b921097', file_name =   'TCGA-06-0211-R2-02D-WXS-KTDRP8.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0211-R2-02D-WXS-KTDRP8.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0211-R2-02D-WXS-KTDRP8';
UPDATE analysis.files SET file_size =   38883535710, file_md5sum = '904bbc4e1b21654cf2f7d0cf0ee6e1f1', file_name =   'TCGA-06-0221-R1-02D-WXS-6RGOIL.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0221-R1-02D-WXS-6RGOIL.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0221-R1-02D-WXS-6RGOIL';
UPDATE analysis.files SET file_size =   36611337342, file_md5sum = 'b107f77d55cca8411e0f1c57c7cb86d5', file_name =   'TCGA-06-0221-TP-02D-WXS-PKKKW4.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0221-TP-02D-WXS-PKKKW4.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0221-TP-02D-WXS-PKKKW4';
UPDATE analysis.files SET file_size =   30217788135, file_md5sum = 'a964b9addbf30c2472795463bc8f91a2', file_name =   'TCGA-19-0957-TP-02D-WXS-2GJCQI.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-19-0957-TP-02D-WXS-2GJCQI.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-19-0957-TP-02D-WXS-2GJCQI';
UPDATE analysis.files SET file_size =   11565257377, file_md5sum = '93f7ab98db6e6b52b233b865cfc212f6', file_name =   'TCGA-19-1389-R1-02D-WXS-TWOEYE.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-19-1389-R1-02D-WXS-TWOEYE.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-19-1389-R1-02D-WXS-TWOEYE';
UPDATE analysis.files SET file_size =   41505680488, file_md5sum = '689a8d7b09744cfac149e6ee9405c47f', file_name =   'TCGA-19-4065-TP-02D-WXS-J38ABH.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-19-4065-TP-02D-WXS-J38ABH.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-19-4065-TP-02D-WXS-J38ABH';
UPDATE analysis.files SET file_size =   2929018733, file_md5sum = 'eb39946111acbb400237c20d199ac472', file_name =   'TCGA-19-4065-R1-02D-WXS-X3LRVW.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-19-4065-R1-02D-WXS-X3LRVW.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-19-4065-R1-02D-WXS-X3LRVW';
UPDATE analysis.files SET file_size =   21570786067, file_md5sum = '6d5b9fef78ac4f7d79764dc5ae72eb86', file_name =   'TCGA-14-1402-R1-02D-WXS-M1J3PC.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-14-1402-R1-02D-WXS-M1J3PC.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-14-1402-R1-02D-WXS-M1J3PC';
UPDATE analysis.files SET file_size =   24127645575, file_md5sum = 'a783428c6857818dc45b9723cd9ebda0', file_name =   'TCGA-14-1402-TP-02D-WXS-EY7S1Q.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-14-1402-TP-02D-WXS-EY7S1Q.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-14-1402-TP-02D-WXS-EY7S1Q';
UPDATE analysis.files SET file_size =   10908259259, file_md5sum = 'b3d614d13e07fb063b1071ac7c7a9921', file_name =   'TCGA-06-0125-R1-11D-WXS-8Q4RKD.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0125-R1-11D-WXS-8Q4RKD.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0125-R1-11D-WXS-8Q4RKD';
UPDATE analysis.files SET file_size =   22044710018, file_md5sum = '2ebb81510517860d8d271d97804181e5', file_name =   'TCGA-06-0152-TP-02D-WXS-6DXTU8.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0152-TP-02D-WXS-6DXTU8.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0152-TP-02D-WXS-6DXTU8';
UPDATE analysis.files SET file_size =   11254733705, file_md5sum = '67ba2448eb2a89e165d2f66142cd9a40', file_name =   'TCGA-06-0171-R1-11D-WXS-87WAQK.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0171-R1-11D-WXS-87WAQK.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0171-R1-11D-WXS-87WAQK';
UPDATE analysis.files SET file_size =   19245632662, file_md5sum = 'ca6ee2d5f822856804119f2301458713', file_name =   'TCGA-06-0171-TP-02D-WXS-CY280C.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0171-TP-02D-WXS-CY280C.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0171-TP-02D-WXS-CY280C';
UPDATE analysis.files SET file_size =   8678647545, file_md5sum = 'a0e0e5c96adf8df5ab5dcaf3e77723f4', file_name =   'TCGA-06-0211-R1-02D-WXS-X1GNKU.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0211-R1-02D-WXS-X1GNKU.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0211-R1-02D-WXS-X1GNKU';
UPDATE analysis.files SET file_size =   11244052772, file_md5sum = 'fb4acd9523a346b2b81a492544ff1e91', file_name =   'TCGA-06-0221-R1-11D-WXS-UAUEGN.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0221-R1-11D-WXS-UAUEGN.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0221-R1-11D-WXS-UAUEGN';
UPDATE analysis.files SET file_size =   11513611471, file_md5sum = 'a77f563bab8395640da708cbad9235f0', file_name =   'TCGA-19-0957-R1-11D-WXS-PQFJEA.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-19-0957-R1-11D-WXS-PQFJEA.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-19-0957-R1-11D-WXS-PQFJEA';
UPDATE analysis.files SET file_size =   10587319193, file_md5sum = '255e28c2a158555ba38246b64940d3f0', file_name =   'TCGA-19-1389-R1-21D-WXS-YNQ3L5.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-19-1389-R1-21D-WXS-YNQ3L5.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-19-1389-R1-21D-WXS-YNQ3L5';
UPDATE analysis.files SET file_size =   10669496778, file_md5sum = '3644a59821b4a03aecc1853eb68a4337', file_name =   'TCGA-19-4065-R1-11D-WXS-ET7919.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-19-4065-R1-11D-WXS-ET7919.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-19-4065-R1-11D-WXS-ET7919';
UPDATE analysis.files SET file_size =   6135955657, file_md5sum = 'bf80cacae9a47a1aaa76caed8cfe5636', file_name =   'TCGA-TQ-A7RK-R1-11D-WXS-DW82J0.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TQ-A7RK-R1-11D-WXS-DW82J0.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TQ-A7RK-R1-11D-WXS-DW82J0';
UPDATE analysis.files SET file_size =   20478034664, file_md5sum = '997ddf6a33333affc352d082f685a7d9', file_name =   'TCGA-TQ-A7RK-TP-11D-WXS-6B1EKP.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TQ-A7RK-TP-11D-WXS-6B1EKP.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TQ-A7RK-TP-11D-WXS-6B1EKP';
UPDATE analysis.files SET file_size =   10928043757, file_md5sum = 'cdbd7eefe7881f0253f005197fb066e0', file_name =   'TCGA-TQ-A7RV-TP-21D-WXS-IODOZZ.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TQ-A7RV-TP-21D-WXS-IODOZZ.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TQ-A7RV-TP-21D-WXS-IODOZZ';
UPDATE analysis.files SET file_size =   8925920296, file_md5sum = '6dd5f8e3e6abd553d1f7f3cb22d7adb7', file_name =   'TCGA-TQ-A7RV-R1-11D-WXS-CHU5VI.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TQ-A7RV-R1-11D-WXS-CHU5VI.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TQ-A7RV-R1-11D-WXS-CHU5VI';
UPDATE analysis.files SET file_size =   6715230680, file_md5sum = '802bf534d7bf2fab5020cbe6805b46a0', file_name =   'TCGA-TQ-A8XE-R1-11D-WXS-N53TK5.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TQ-A8XE-R1-11D-WXS-N53TK5.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TQ-A8XE-R1-11D-WXS-N53TK5';
UPDATE analysis.files SET file_size =   9177485435, file_md5sum = 'addde035f12c093dd72e715ceff7df12', file_name =   'TCGA-TQ-A8XE-TP-11D-WXS-UOHP3W.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TQ-A8XE-TP-11D-WXS-UOHP3W.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TQ-A8XE-TP-11D-WXS-UOHP3W';
UPDATE analysis.files SET file_size =   6309311066, file_md5sum = 'df12feb4fcaeb14c86192e3ecf8b855a', file_name =   'TCGA-TM-A7CF-R1-11D-WXS-4PZL6G.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TM-A7CF-R1-11D-WXS-4PZL6G.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TM-A7CF-R1-11D-WXS-4PZL6G';
UPDATE analysis.files SET file_size =   9603746613, file_md5sum = '737deadda55e76c6dbf9f994e6c440b5', file_name =   'TCGA-TM-A7CF-TP-11D-WXS-S9DTP5.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-TM-A7CF-TP-11D-WXS-S9DTP5.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-TM-A7CF-TP-11D-WXS-S9DTP5';
UPDATE analysis.files SET file_size =   8423348918, file_md5sum = 'b249f9e4e2db40e4d41650ee0350fba3', file_name =   'TCGA-FG-5963-TP-11D-WXS-BX105O.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-FG-5963-TP-11D-WXS-BX105O.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-FG-5963-TP-11D-WXS-BX105O';
UPDATE analysis.files SET file_size =   9737760332, file_md5sum = '6c33c46d7a98761f61efc232729c1ebd', file_name =   'TCGA-FG-5963-R1-12D-WXS-XI05S1.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-FG-5963-R1-12D-WXS-XI05S1.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-FG-5963-R1-12D-WXS-XI05S1';
UPDATE analysis.files SET file_size =   10328416836, file_md5sum = '6b69e9a07a6f42e4448951543ebfae17', file_name =   'TCGA-FG-5965-R2-11D-WXS-34FYRV.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-FG-5965-R2-11D-WXS-34FYRV.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-FG-5965-R2-11D-WXS-34FYRV';
UPDATE analysis.files SET file_size =   9508670135, file_md5sum = '97124e4ff631ed3a0c96a2d586dfc65d', file_name =   'TCGA-FG-5965-R1-11D-WXS-10UZ0F.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-FG-5965-R1-11D-WXS-10UZ0F.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-FG-5965-R1-11D-WXS-10UZ0F';
UPDATE analysis.files SET file_size =   14444259280, file_md5sum = '396c583eb946ac2a866bea25a6e77905', file_name =   'TCGA-FG-5965-TP-11D-WXS-00A7KM.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-FG-5965-TP-11D-WXS-00A7KM.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-FG-5965-TP-11D-WXS-00A7KM';
UPDATE analysis.files SET file_size =   10565291330, file_md5sum = '1728abe49c33bf2a678071e487df49ba', file_name =   'TCGA-FG-A4MT-R1-11D-WXS-HH7X54.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-FG-A4MT-R1-11D-WXS-HH7X54.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-FG-A4MT-R1-11D-WXS-HH7X54';
UPDATE analysis.files SET file_size =   6080026793, file_md5sum = '07890248364e523e483b7a55d8b742b2', file_name =   'TCGA-FG-A4MT-TP-11D-WXS-RN1S3I.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-FG-A4MT-TP-11D-WXS-RN1S3I.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-FG-A4MT-TP-11D-WXS-RN1S3I';
UPDATE analysis.files SET file_size =   8361848038, file_md5sum = '9758d890232f6e244dba04c92c125caf', file_name =   'TCGA-DU-5870-R1-12D-WXS-VABWLK.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-5870-R1-12D-WXS-VABWLK.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-5870-R1-12D-WXS-VABWLK';
UPDATE analysis.files SET file_size =   7368101637, file_md5sum = 'd62a02a98076a91694547c406778e81b', file_name =   'TCGA-DU-5870-TP-11D-WXS-LMTCYR.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-5870-TP-11D-WXS-LMTCYR.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-5870-TP-11D-WXS-LMTCYR';
UPDATE analysis.files SET file_size =   8663507445, file_md5sum = '404ffc4298698b57da7f4ee2ad807afa', file_name =   'TCGA-DU-5872-TP-11D-WXS-9CEQT3.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-5872-TP-11D-WXS-9CEQT3.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-5872-TP-11D-WXS-9CEQT3';
UPDATE analysis.files SET file_size =   9488298894, file_md5sum = 'ae912b13880c61636ec26ad8fe9c67fb', file_name =   'TCGA-DU-5872-R1-21D-WXS-GXUHO1.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-5872-R1-21D-WXS-GXUHO1.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-5872-R1-21D-WXS-GXUHO1';
UPDATE analysis.files SET file_size =   8956881001, file_md5sum = '175558b9911a3f2d106f2f7afb44469f', file_name =   'TCGA-DU-6397-TP-11D-WXS-ZZXZ4F.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6397-TP-11D-WXS-ZZXZ4F.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6397-TP-11D-WXS-ZZXZ4F';
UPDATE analysis.files SET file_size =   10279824741, file_md5sum = '9009c9c9563fe896bef100e279e1b3df', file_name =   'TCGA-DU-6397-R1-12D-WXS-0EDX5N.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6397-R1-12D-WXS-0EDX5N.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6397-R1-12D-WXS-0EDX5N';
UPDATE analysis.files SET file_size =   7547685553, file_md5sum = 'd5c19ff5b2e82a145bded5d219bb3f59', file_name =   'TCGA-DU-6404-R1-21D-WXS-ZOOJ7F.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6404-R1-21D-WXS-ZOOJ7F.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6404-R1-21D-WXS-ZOOJ7F';
UPDATE analysis.files SET file_size =   6975944780, file_md5sum = '6b5d59ba7e12ee39ba31bb2e51edb774', file_name =   'TCGA-DU-6404-R2-11D-WXS-CCY5XQ.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6404-R2-11D-WXS-CCY5XQ.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6404-R2-11D-WXS-CCY5XQ';
UPDATE analysis.files SET file_size =   8304486903, file_md5sum = 'a617cc49da6adfe96304eb6c2216dca6', file_name =   'TCGA-DU-6404-TP-11D-WXS-Y06ZOW.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6404-TP-11D-WXS-Y06ZOW.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6404-TP-11D-WXS-Y06ZOW';
UPDATE analysis.files SET file_size =   8621778544, file_md5sum = 'ef2822242fb3862cebc4b0fa7e08913e', file_name =   'TCGA-DU-6407-R1-12D-WXS-MN1HI8.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6407-R1-12D-WXS-MN1HI8.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6407-R1-12D-WXS-MN1HI8';
UPDATE analysis.files SET file_size =   8668555422, file_md5sum = '6aa98a027d390b5a82fc196be2378796', file_name =   'TCGA-DU-6407-TP-13D-WXS-H4GGSN.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6407-TP-13D-WXS-H4GGSN.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6407-TP-13D-WXS-H4GGSN';
UPDATE analysis.files SET file_size =   9343547817, file_md5sum = '0d49d6cfe5504ca9853b7c691304c297', file_name =   'TCGA-DU-6407-R2-11D-WXS-0SX16B.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-6407-R2-11D-WXS-0SX16B.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-6407-R2-11D-WXS-0SX16B';
UPDATE analysis.files SET file_size =   9959663139, file_md5sum = 'b62ab70883cb69d050e504d2e8a024a8', file_name =   'TCGA-DU-7304-TP-12D-WXS-2RI95C.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-7304-TP-12D-WXS-2RI95C.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-7304-TP-12D-WXS-2RI95C';
UPDATE analysis.files SET file_size =   10677421507, file_md5sum = 'dc45a6ecfd27c48f608e76fcf5af6e51', file_name =   'TCGA-DU-7304-R1-12D-WXS-WGFU0U.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DU-7304-R1-12D-WXS-WGFU0U.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DU-7304-R1-12D-WXS-WGFU0U';
UPDATE analysis.files SET file_size =   9375005714, file_md5sum = 'f6086f18f139030a81708470ab56ecba', file_name =   'TCGA-DH-A669-TP-12D-WXS-EFVDA0.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DH-A669-TP-12D-WXS-EFVDA0.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DH-A669-TP-12D-WXS-EFVDA0';
UPDATE analysis.files SET file_size =   8462253235, file_md5sum = '2dacafcf9b7c832e0cf17e77dee97d46', file_name =   'TCGA-DH-A669-R1-11D-WXS-DVQOHB.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-DH-A669-R1-11D-WXS-DVQOHB.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-DH-A669-R1-11D-WXS-DVQOHB';
UPDATE analysis.files SET file_size =   21953238307, file_md5sum = 'dc1b8d920e62257692193321d76dd851', file_name =   'TCGA-06-0211-NB-01D-WXS-B2K1SS.realn.mdup.bqsr.bam', file_path =   '/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr/TCGA-06-0211-NB-01D-WXS-B2K1SS.realn.mdup.bqsr.bam'   WHERE file_format = 'aligned BAM' AND aliquot_barcode =   'TCGA-06-0211-NB-01D-WXS-B2K1SS';
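
The statements above all follow one pattern: the file name and path are derived from the new aliquot barcode, and the size and MD5 checksum are taken from the renamed BAM. A minimal sketch of how such statements could be generated is shown below. The helper names (`md5sum`, `build_update`) are hypothetical, and the `%s` placeholders assume a driver that uses that parameter style (for example psycopg2); this is not necessarily how the statements in this thread were produced.

```python
import hashlib
import os

# Directory taken from the UPDATE statements in this thread.
BQSR_DIR = "/fastscratch/verhaak-lab/GLASS-WG/results/align/bqsr"

# Parameterized form of the UPDATE statement (placeholder style is an assumption).
UPDATE_SQL = (
    "UPDATE analysis.files "
    "SET file_size = %s, file_md5sum = %s, file_name = %s, file_path = %s "
    "WHERE file_format = 'aligned BAM' AND aliquot_barcode = %s"
)


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_update(aliquot_barcode, bam_dir=BQSR_DIR):
    """Return (sql, params) for one renamed BAM (hypothetical helper)."""
    file_name = f"{aliquot_barcode}.realn.mdup.bqsr.bam"
    file_path = f"{bam_dir}/{file_name}"
    params = (
        os.path.getsize(file_path),  # file_size in bytes
        md5sum(file_path),           # file_md5sum
        file_name,
        file_path,
        aliquot_barcode,
    )
    return UPDATE_SQL, params
```

The returned pair can be passed to a database cursor's `execute` method, which avoids pasting barcodes and checksums into the SQL string by hand.
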
fpbarthel commented 5 years ago

The renamed BAM files fail validation for an unknown reason, but otherwise they work fine. This is the error message:

## HISTOGRAM    java.lang.String
Error Type      Count
ERROR:INVALID_INDEX_FILE_POINTER        1

I thought it could be related to simply renaming the old indices, so I regenerated the indices, but that did not resolve the error. Perhaps something in the BAM header is causing this: most of the @PG header lines still reference the old aliquot_barcode in their input/output filenames. Either way, this doesn't seem like a big issue for now.
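
To see which @PG lines still mention an old barcode, the header text can be scanned directly. The sketch below is only an illustration: the function name `pg_records_mentioning` is hypothetical, and it lists the affected records without establishing that they cause the validation error.

```python
def pg_records_mentioning(header_text, old_barcode):
    """Return the IDs of @PG records whose command line contains old_barcode.

    header_text is the plain-text SAM header, one tab-delimited record per line.
    """
    matches = []
    for line in header_text.splitlines():
        if not line.startswith("@PG\t"):
            continue  # skip @HD, @SQ, @RG and other header records
        # Each field after the record type has the form TAG:VALUE.
        fields = dict(
            field.split(":", 1)
            for field in line.split("\t")[1:]
            if ":" in field
        )
        if old_barcode in fields.get("CL", ""):
            matches.append(fields.get("ID", ""))
    return matches
```

The header text can be obtained with `samtools view -H` on the renamed BAM, and `old_barcode` would be the aliquot barcode the file carried before the portion ID was fixed.
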

Let's close this as soon as we figure out how to deal with the extra pairs @Kcjohnson.

fpbarthel commented 5 years ago

Removed old BAMs