Sequencing Information is incomplete for some files to be submitted in the second CDS release.
There are 8756 unique sequencing experiments associated with files being submitted.
The export from the dataservice with information about each of these experiments is here.
Platform
Accepted values:
AB Capillary
ABI Solid
BGISEQ
Complete Genomics
Helicos
Illumina
Ion Torrent
LS 454
Oxford Nanopore
PacBio SMRT
Actual Values
| platform | count |
| --- | --- |
| Illumina | 8503 |
| Not Reported | 242 |
| Other | 11 |
The issue is with the last two values, Not Reported and Other. We need to determine which platform these experiments were performed on.
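The comparison between accepted and actual platform values can be reproduced with a short check. This is a minimal sketch, not the dataservice's own validation: the counts are copied by hand from the table above, and the names `ACCEPTED_PLATFORMS`, `platform_counts`, and `find_unaccepted` are hypothetical.

```python
# Accepted platform values, copied from the "Accepted values" list above.
ACCEPTED_PLATFORMS = {
    "AB Capillary", "ABI Solid", "BGISEQ", "Complete Genomics", "Helicos",
    "Illumina", "Ion Torrent", "LS 454", "Oxford Nanopore", "PacBio SMRT",
}

# Observed counts, copied by hand from the "Actual Values" table
# (assumption: in practice these would come from the dataservice export).
platform_counts = {"Illumina": 8503, "Not Reported": 242, "Other": 11}


def find_unaccepted(counts, accepted):
    """Return {value: count} for every value that is not in the accepted set."""
    return {value: n for value, n in counts.items() if value not in accepted}


unaccepted_platforms = find_unaccepted(platform_counts, ACCEPTED_PLATFORMS)
total_experiments = sum(platform_counts.values())

print(unaccepted_platforms)  # values that still need a real platform
print(total_experiments)     # should match the 8756 unique experiments
```

The same helper could be reused for instrument model once the full list of accepted instrument models is available.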
The 11 experiments where platform is Other are all RNA-Seq samples that were sequenced at BGI, with instrument model DNBSeq.
@chris-s-friedman to get the platform for the above from bix
For the 242, their composition of strategy, instrument model, and sequencing center is below. Note that none of these experiments have a value for instrument model.
| library_strategy | instrument_model | sequencing_center_id | sequencing center name | count |
| --- | --- | --- | --- | --- |
| RNA-Seq | Not Reported | SC_2ZBAMKK0 | Novogene | 81 |
| WGS | Not Reported | SC_2ZBAMKK0 | Novogene | 131 |
| WGS | Not Reported | SC_FAD4KCQG | BGI | 15 |
| WGS | Not Reported | SC_N1EVHSME | NantOmics | 10 |
| WGS | Not Reported | SC_WWEQ9HFY | BGI@CHOP Genome Center | 5 |
@chris-s-friedman to look through past files to get previously investigated platform
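To see which sequencing centers account for the 242 Not Reported platforms, the breakdown above can be tallied per center. This is a rough sketch with the rows copied by hand from the table; `not_reported_rows` and `by_center` are hypothetical names, and a real check would read the dataservice export instead.

```python
from collections import Counter

# (library_strategy, sequencing_center_id, sequencing center name, count),
# copied by hand from the breakdown of the 242 Not Reported platforms.
not_reported_rows = [
    ("RNA-Seq", "SC_2ZBAMKK0", "Novogene", 81),
    ("WGS", "SC_2ZBAMKK0", "Novogene", 131),
    ("WGS", "SC_FAD4KCQG", "BGI", 15),
    ("WGS", "SC_N1EVHSME", "NantOmics", 10),
    ("WGS", "SC_WWEQ9HFY", "BGI@CHOP Genome Center", 5),
]

# Sum the experiment counts per sequencing center.
by_center = Counter()
for _strategy, _center_id, center_name, count in not_reported_rows:
    by_center[center_name] += count

for center_name, count in by_center.most_common():
    print(f"{center_name}: {count}")

# The per-center counts should add back up to the 242 Not Reported platforms.
not_reported_total = sum(by_center.values())
print(not_reported_total)
```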
Instrument Model
Actual Values
| Instrument Model | Count |
| --- | --- |
| Not Reported | 5838 |
| HiSeq | 1809 |
| HiSeq X | 1007 |
| Novaseq 6000 | 91 |
| DNBSeq | 11 |
None of these instrument models are accepted values in their data model.
Neither HiSeq nor HiSeq X is an accepted value, although the data model does include HiSeq X Five and HiSeq X Ten.
There is no Novaseq instrument model in their enumerated values.
There is no DNBSeq instrument model in their enumerated values.
@baileyckelly to ask ccdi if these values above are acceptable
Of the Not Reported instrument models:
1. 199 experiments are cbtn experiments from pre-x01.
2. 76 experiments are pnoc 003/008 experiments created before February 2023.
3. 5449 experiments are from cbtn x01.
4. 40 experiments are pnoc 003/008 experiments created on 2/6/2023 and 2/8/2023 that look to be associated with cbtn x01.
5. 74 experiments are associated with cbtn x01 under the study ID SD_8C478S85, High Incidence of Pediatric CNS Tumors, D3B-PCNST.
Items 1 and 2 will need some further investigation.
Items 3, 4, and 5 are all from the cbtn x01 and should all have similar instrument models.
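As a quick arithmetic check on the split above, the five groups can be summed to see how many Not Reported instrument models need investigation and how many are expected to resolve together with cbtn x01. The figures are copied from the list; the variable names are hypothetical.

```python
# (item number, description, experiment count), copied from the list above.
not_reported_groups = [
    (1, "cbtn pre-x01", 199),
    (2, "pnoc 003/008, created before February 2023", 76),
    (3, "cbtn x01", 5449),
    (4, "pnoc 003/008, 2/6/2023 and 2/8/2023, likely cbtn x01", 40),
    (5, "cbtn x01 under study SD_8C478S85", 74),
]

# Items flagged above as needing further investigation.
NEEDS_INVESTIGATION = {1, 2}

needs_investigation = sum(
    n for item, _desc, n in not_reported_groups if item in NEEDS_INVESTIGATION
)
x01_related = sum(
    n for item, _desc, n in not_reported_groups if item not in NEEDS_INVESTIGATION
)
not_reported_models = needs_investigation + x01_related

print(needs_investigation)   # items 1 and 2
print(x01_related)           # items 3, 4, and 5
print(not_reported_models)   # should match the Not Reported count of 5838
```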
Library Selection
For RNA-Seq samples, this is missing for all pre-x01 data.
For WGS, WXS, and Targeted Capture, this is missing for both pre-x01 data and x01 data.
From the metadata template:
For sequencing files, please try to provide all metadata, if applicable, for the following properties: avg_read_length, number_of_reads, number_of_bp, coverage
Number of reads: missing for 3192 experiments, all pre-x01.
Mean read length: missing for 3192 experiments, all pre-x01.
Coverage: missing for all experiments.
Number of bp: missing for all experiments.
Expected behavior
No response
Version ID
None
Effected file(s)